Meta AI Leads Data-Hungry Chatbot Rankings, Collecting 32 of 35 User Data Points

Meta AI has been identified as the most aggressive collector of user data among popular AI chatbots, harvesting 32 out of 35 possible data points including contact details, financial information, and precise location. The finding comes from a new SC Media analysis that evaluated the data collection practices of widely used AI assistants, placing Google Gemini closely behind Meta and raising fresh alarms about the privacy implications of conversational AI.

Microsoft Copilot, Poe, Jasper, Claude AI, and DeepSeek also made the list, with several services not only collecting sensitive data but also sharing it with advertisers and data brokers. The report underscores a stark reality: the convenience of AI chatbots often comes at the cost of extensive personal data exploitation.

Meta AI: The Most Data-Hungry Chatbot

SC Media’s analysis examined 35 distinct data points that chatbots can collect from users. Meta AI topped the chart by gathering 32 of them, more than any other chatbot examined. The data haul includes highly sensitive identifiers: contact details like phone numbers and email addresses, financial data, user-generated content, usage patterns, diagnostics, and granular location data.

This aggregation allows Meta to build deep profiles on individuals, linking chatbot interactions to existing social media accounts and advertising identifiers. The company has long faced scrutiny over its data practices, and the integration of AI chat into Facebook, Instagram, and WhatsApp provides yet another funnel for personal information.

Meta AI’s privacy labels on the Apple App Store confirm it may collect data linked to your identity for third-party advertising. That means your conversations with the assistant aren’t just analyzed for service improvement—they can directly feed Meta’s lucrative ad-targeting machine.

Google Gemini and Other Major Players

Google Gemini ranks second in data appetite. The report does not break out Gemini’s count the way it does Meta’s 32 of 35, but it places Gemini alongside Meta as a heavy collector. Gemini taps into Google’s massive ecosystem, potentially linking chats to Gmail, Search, Maps, and YouTube data. Its privacy policy notes that AI interactions may be stored and reviewed by humans to improve the product, a practice that has drawn criticism from privacy advocates.

Poe, the chatbot aggregator from Quora, collects extensive data and shares device IDs with data brokers. Claude AI, developed by Anthropic, collects far less raw data but still retains certain usage information. Microsoft Copilot, deeply integrated into Windows and the Edge browser, also mines user identity-linked information for third-party ads and passes device IDs to brokers.

Jasper, a marketing-focused AI tool, rounds out the list of aggressive data harvesters. Its business model relies on tailoring outputs for marketers, which inherently requires deeper data analysis—but the report indicates it also engages in user tracking.

The Data Broker Connection and Third-Party Advertising

A critical finding is that several chatbots—Poe, Copilot, and Jasper—share unique device identifiers with data brokers. This creates a backdoor for user tracking across apps and websites, even if the chatbot itself doesn’t directly sell personal details. Data brokers can splice together seemingly anonymous device IDs with other data sources to reconstruct detailed user profiles.
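The linkage described above can be illustrated with a minimal Python sketch. Every record, field name, and the link_by_device_id helper are hypothetical and invented for this example; the report does not describe how any particular broker processes data. The point is only that a shared device ID acts as a join key across otherwise unrelated datasets.

```python
from collections import defaultdict

# Hypothetical records: every value here is invented for illustration.
chatbot_events = [
    {"device_id": "A1B2", "topic": "laptop reviews"},
    {"device_id": "A1B2", "topic": "loan refinancing"},
    {"device_id": "C3D4", "topic": "recipe ideas"},
]
shopping_app_events = [
    {"device_id": "A1B2", "email": "user@example.com", "city": "Austin"},
]


def link_by_device_id(*sources):
    """Merge records from several sources that share the same device_id."""
    profiles = defaultdict(lambda: {"topics": [], "attributes": {}})
    for source in sources:
        for record in source:
            profile = profiles[record["device_id"]]
            for key, value in record.items():
                if key == "device_id":
                    continue  # the join key itself is not profile content
                if key == "topic":
                    profile["topics"].append(value)
                else:
                    profile["attributes"][key] = value
    return dict(profiles)


profiles = link_by_device_id(chatbot_events, shopping_app_events)
print(profiles["A1B2"])
```

In this toy data, the chat topics for device A1B2 carry no name or email on their own, but once a second dataset containing the same device ID is merged in, they sit next to an email address and a city.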

Both Meta AI and Microsoft Copilot leverage user identity-linked information for third-party advertising. In practice, that means a conversation with Copilot about a new laptop could result in ads for tech products appearing elsewhere. AI chats, in other words, are not private spaces but data-generating interactions.

DeepSeek: Prolonged Data Storage and Server Risks

Chinese AI platform DeepSeek ranked sixth in raw data points collected, but its storage practices ignited separate warnings. The company keeps user chats on servers for extended periods, with servers located in China—raising jurisdictional and breach risks. Researchers note that any stored data is perpetually vulnerable to hacking, insider threats, or legal requests.

The combination of undefined retention periods and geopolitical tensions makes DeepSeek particularly concerning for users outside China. Even if the platform’s current collection scope is narrower, the lack of clear data deletion policies and the constant risk of server compromises mean past conversations could resurface.

What Data Are Chatbots Actually Collecting?

To understand the privacy threat, it helps to dissect the 35 data points used in the analysis. These span several categories, with examples of each:

Identity-linked data: Name, email, phone number, physical address, user ID, device ID.
Financial data: Payment info, credit score, purchase history.
Health data: Fitness, medical history, health-related searches.
Location data: Precise location, coarse location.
Content: Photos, videos, audio recordings, chat content.
Usage data: Product interaction, advertising data, crash logs.

Most chatbots collect at least some of these. Meta AI’s 32 of 35 data points mean nearly everything except highly regulated health data is on the table. That breadth turns a casual Q&A tool into a surveillance mechanism.

Privacy Implications for Users

The extensive data collection poses several concrete risks:

Targeted manipulation: Rich behavioral profiles enable hyper-personalized ads that can exploit emotional states or financial vulnerabilities.
Security breaches: Centralized databases of chat logs, which often contain confidential information, are attractive targets for attackers. In March 2023, a ChatGPT bug exposed the titles of some users’ chat histories to other users, showing that even major platforms are not immune.
Surveillance and profiling: Governments and corporations could access stored data for monitoring, especially when servers reside in jurisdictions with weak privacy protections.
Loss of anonymity: Even when chats feel ephemeral, they can be tied back to real identities through device IDs and account linkages.

The Transparency Gap

Most users are unaware of how their chatbot interactions are harvested. Privacy policies are often buried in lengthy legalese, and default settings maximize data collection. The report highlights that Apple’s App Store privacy labels—while imperfect—are one of the few easily accessible signals. Yet many users never check them.

Microsoft Copilot’s integration into Windows presents a unique challenge. As of Windows 11’s 23H2 update, Copilot lives in the taskbar, constantly accessible, yet many users don’t realize that their queries can be used for ad targeting. The line between a helpful assistant and a data pipeline blurs.

What Can Users Do?

For those concerned about privacy, immediate steps include:

Review privacy settings: Services like Meta AI and Google Gemini allow users to delete conversation history and opt out of data sharing for ad purposes—though these options are often hidden.
Use privacy-focused alternatives: Locally run tools like GPT4All process prompts on your own device, which minimizes the data sent to a provider.
Avoid sharing sensitive information: Never input passwords, health details, or financial data into a chatbot.
Check App Privacy Labels: On iOS, scroll to App Privacy on the app’s store page. Look for “Data Used to Track You” and “Data Linked to You,” and check whether advertising is listed as a purpose.
Access web versions without login: Some chatbots allow anonymous use via web without sign‑in, though features may be limited.
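For readers who script their own chatbot requests, the advice to avoid sharing sensitive information can be partly automated. The Python sketch below is a rough illustration under stated assumptions, not a vetted privacy tool: the PATTERNS table and the redact function are hypothetical names, and the regular expressions are deliberately simple, so they will miss many real-world formats and cannot detect passwords or health details.

```python
import re

# Illustrative patterns only; they do not cover every real-world format.
# CARD is listed before PHONE so long digit runs are labeled as cards first.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "CARD": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def redact(text):
    """Replace anything matching a known pattern with a bracketed label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


prompt = "Email me at jane.doe@example.com or call +1 415 555 0100"
print(redact(prompt))
```

A filter like this would run on the user's side before a prompt leaves the device. It reduces accidental leaks but is no substitute for simply leaving sensitive details out.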

The Regulatory Landscape

Global privacy regulations like GDPR and the California Consumer Privacy Act give users rights to access and delete their data. However, enforcement against AI‑specific data collection remains nascent. The European Data Protection Board has flagged chatbots for potential violations, and the U.S. Federal Trade Commission has warned companies against silently training AI on user data.

Yet penalties are rare, and data collection practices often continue unchanged. SC Media’s findings could fuel renewed calls for stronger AI privacy laws, particularly around mandatory retention limits and clear opt‑in consent for advertising use.

Industry Response

Following the report, representatives for Meta and Google pointed to existing privacy controls. A Meta spokesperson stated, “We give people control over their data through clear settings and off‑Facebook activity tools.” Microsoft noted that Copilot does not use enterprise chat data for ads, but the consumer version does.

These reassurances often shift responsibility onto users to navigate complex settings. Privacy advocates argue that true protection requires data minimization by design, not opt‑outs buried in menus.

Looking Ahead

The SC Media analysis arrives as AI chatbots become embedded in operating systems, productivity suites, and messaging apps. Without intervention, the data collection will only deepen. Developers face a choice: monetize through data exploitation or build sustainable, privacy‑respecting business models.

Users, too, must weigh convenience against privacy. As digital assistants grow more capable, they demand more data to function—but the current landscape reveals that many companies are taking far more than they need. The question remains: will the industry self‑regulate, or will lawmakers force their hand?