By: Press Services
March 25, 2026
AI Platforms Found to Have 0% Data Leakage in New Study
Study Assures Users: AI Platforms Do Not Leak Sensitive Data
New York, United States - March 21, 2026 / Search Atlas /
NEW YORK CITY, NY, March 19, 2026 - Search Atlas, a prominent SEO and digital intelligence platform, today unveiled the results of a controlled study investigating what happens to sensitive information entered into leading AI platforms. The research assessed six major large language model (LLM) platforms (OpenAI, Gemini, Perplexity, Grok, Copilot, and Google AI Mode) through two carefully designed experiments that replicated worst-case scenarios of data exposure.
The findings provide significant reassurance for both businesses and individuals worried about the confidentiality of information shared with AI tools. Across all six platforms evaluated, the researchers discovered a complete absence of data leakage concerning user-provided sensitive information.
The comprehensive study can be accessed here.
Key Findings:
Conducted by researchers at Search Atlas, the study put the six platforms through two carefully controlled experiments designed to mimic worst-case data exposure scenarios. The key findings are summarized below.
1. LLMs do not retain or replay sensitive user information - 0% data leakage across all platforms evaluated
The study investigated whether AI models would reproduce private information after being exposed to it. Researchers constructed 30 question-and-answer pairs built entirely from synthetic facts with no public footprint: no search indexing, no online references, and no presence in known training data.
Each model underwent a three-step process:
1. Baseline: the 30 questions were asked before any exposure, establishing how each model responded without knowledge of the facts.
2. Exposure: the private facts were introduced to the model within a session, via direct prompts and simulated search results.
3. Re-test: the same questions were asked again in a new session, with no search access and no reference to the earlier exposure.
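To make the procedure concrete, the sketch below shows how such a before-and-after test could be scripted in Python. The ask_model() wrapper and the single fabricated Q&A pair are hypothetical stand-ins for the study's tooling and its 30 synthetic pairs, not actual code from the research.

```python
# Minimal sketch of the three-step harness, assuming a hypothetical
# ask_model(platform, prompt) wrapper around each platform's chat interface.

SYNTHETIC_QA = [
    {"question": "What is the internal codename of Acme Corp's Q3 project?",
     "answer": "BlueHeron-77"},  # one fabricated pair; the study used 30
]

def ask_model(platform: str, prompt: str) -> str:
    """Hypothetical wrapper around a platform's chat endpoint (stub)."""
    raise NotImplementedError

def run_before_after_test(platform: str) -> dict:
    results = {"before": [], "after": []}

    # Step 1: baseline - ask each question before any exposure.
    for qa in SYNTHETIC_QA:
        results["before"].append(ask_model(platform, qa["question"]))

    # Step 2: exposure - hand the private fact to the model inside a prompt,
    # simulating a user pasting sensitive information into a session.
    for qa in SYNTHETIC_QA:
        ask_model(platform, f"{qa['question']} The answer is {qa['answer']}.")

    # Step 3: re-test - ask the same questions again in a fresh session,
    # with no search access and no reference to the earlier exposure.
    for qa in SYNTHETIC_QA:
        results["after"].append(ask_model(platform, qa["question"]))

    return results
```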
Across all six platforms evaluated, none produced a single correct answer after exposure. Models that initially declined to respond continued to do so, while those that tended to hallucinate answers persisted in generating incorrect responses rather than repeating the injected facts. In summary, model behavior remained fundamentally unchanged before and after exposure.
This setup simulated a worst-case scenario in which a user inputs proprietary or sensitive information into an AI system. Under these conditions, the study found no evidence that the information was retained for future responses.
The experiment also uncovered behavioral variations across platforms. Models from OpenAI, Perplexity, and Grok tended to respond with uncertainty when reliable information was lacking, leading to a higher frequency of "I don't know" responses. In contrast, Gemini, Copilot, and Google AI Mode were more inclined to generate confident yet incorrect answers. Nevertheless, none of these incorrect responses matched the previously provided private information. The findings underscore a crucial distinction: hallucination (fabricating incorrect information) and leakage are separate failure modes, and this study observed only the former.
2. Retrieved facts disappear when search is disabled - no evidence of short-term retention or leakage
The second experiment assessed whether information retrieved via live web search would remain and reappear in a model's responses once search access was turned off.
To isolate this effect, researchers chose a real-world event that took place after the training cutoff of all models evaluated. This ensured that any correct answers during the experiment could only originate from live web retrieval, not from the models' existing knowledge.
When search was enabled, the models answered the vast majority of questions correctly. However, once search was disabled and the same questions were immediately posed again, those correct answers largely disappeared.
The only questions that models could still answer correctly without search were those whose answers could reasonably be inferred from pre-existing training data or general knowledge, rather than from information retrieved moments earlier.
In summary, the results demonstrated no evidence that models retained or carried forward information retrieved through live search. Once retrieval access was removed, the information no longer appeared in responses, indicating that the systems do not store or pass along facts obtained during prior interactions.
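A compact sketch of this comparison is shown below, assuming a hypothetical ask_model() wrapper that can toggle web search on or off; the post-cutoff questions are placeholders, not the ones used in the study.

```python
# Sketch of the second experiment: ask the same questions with and without
# live web retrieval and compare the answers.

POST_CUTOFF_QUESTIONS = [
    "Who won the award announced at last week's event?",
    "What attendance figure did the organizers report?",
]

def compare_search_on_off(platform: str, ask_model) -> list[dict]:
    rows = []
    for question in POST_CUTOFF_QUESTIONS:
        with_search = ask_model(platform, question, web_search=True)      # live retrieval on
        without_search = ask_model(platform, question, web_search=False)  # retrieval off, same question
        rows.append({"question": question,
                     "with_search": with_search,
                     "without_search": without_search})
    # If retrieved facts were retained, the second column would still contain
    # the correct answers; the study found that it did not.
    return rows
```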
3. Users face AI hallucinations, not data exposure
One of the study's most practical conclusions is the clear distinction between hallucination and data leakage. The platforms that exhibited lower accuracy (Gemini, Copilot, and Google AI Mode) did not err by repeating information they had previously received; their errors stemmed from generating confident, plausible-sounding answers that were simply incorrect. OpenAI (ChatGPT) and Perplexity showed the lowest levels of hallucination.
This distinction is significant when assessing AI risk. A prevalent concern is that an AI system might expose sensitive information from one user to another. In this study, researchers found no evidence supporting that scenario.
The more consistently observed issue was hallucination (models filling knowledge gaps with fabricated facts). While this does not involve the sharing of private information, it introduces a different challenge: individuals and organizations must ensure that AI-generated responses are reviewed and verified, especially in contexts where accuracy is paramount.
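The distinction can be illustrated with a simplified scoring function: a response that repeats the injected private fact would count as leakage, a confident wrong answer counts as hallucination, and a refusal counts as neither. The substring matching below is deliberately naive and is not the study's actual scoring code.

```python
def classify_response(response: str, injected_fact: str) -> str:
    """Label a response as a refusal, leakage, or hallucination (simplified)."""
    text = response.lower()
    if any(p in text for p in ("i don't know", "i do not know", "cannot answer")):
        return "refusal"        # model declines rather than guessing
    if injected_fact.lower() in text:
        return "leakage"        # private fact repeated: the failure mode NOT observed
    return "hallucination"      # confident but incorrect: the failure mode that was observed

print(classify_response("The codename is SilverFox-12.", "BlueHeron-77"))  # -> hallucination
```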
What This Means
For businesses and privacy-conscious users, the findings provide reassuring news. If sensitive information is shared with an AI model during a single session, such as proprietary business strategies or private details, the model does not seem to absorb that information into a lasting memory that could be revealed to other users. Instead, the data acts more like temporary "working memory" utilized to generate a response within that interaction.
For researchers and fact-checkers, these findings also underscore an important limitation. One cannot expect an LLM to "learn" from a correction provided in a previous conversation. If a model contains an error in its underlying training data, it may persist in repeating that mistake in future sessions unless the model itself is retrained or the correct source is provided anew.
For developers and AI builders, the study emphasizes the importance of retrieval-based systems. Strategies such as Retrieval-Augmented Generation (RAG), which connect models to live databases or search systems, remain the most dependable way to ensure AI responses are accurate for current events, proprietary information, or frequently updated data. Without retrieval, the model lacks a built-in mechanism to retain facts discovered during earlier interactions.
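A minimal sketch of the RAG pattern described above is shown here, with a naive keyword retriever and a hypothetical generate() callable standing in for a model API; production systems would typically use embedding-based search instead.

```python
def retrieve(query: str, documents: list[str], top_k: int = 3) -> list[str]:
    """Naive keyword-overlap retrieval; a real system would use embeddings."""
    def overlap(doc: str) -> int:
        return len(set(query.lower().split()) & set(doc.lower().split()))
    return sorted(documents, key=overlap, reverse=True)[:top_k]

def answer_with_rag(query: str, documents: list[str], generate) -> str:
    context = "\n".join(retrieve(query, documents))
    prompt = ("Answer using only the context below.\n"
              f"Context:\n{context}\n\n"
              f"Question: {query}")
    # The model is handed fresh facts on every call instead of being expected
    # to remember them from earlier interactions.
    return generate(prompt)
```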
"Much of the concern surrounding enterprise AI adoption stems from a reasonable but untested assumption that if you input sensitive information into one of these systems, it will somehow be leaked," stated Manick Bhan, Founder of Search Atlas. "Our goal was to rigorously test that assumption under controlled conditions rather than speculate. Across every platform we assessed, the data did not support it. While this does not imply that AI is risk-free-hallucination is a genuine and documented issue-the specific fear that your data may be leaked to another user is not something we found evidence for. We hope this provides individuals and organizations with the confidence to engage with these tools more clearly, allowing them to focus on the actual risks present."
Methodology
The study, conducted by Search Atlas, subjected six major LLM platforms (OpenAI, Gemini, Perplexity, Grok, Copilot, and Google AI Mode) to a rigorous, multi-stage experiment to ascertain whether they retain or leak information provided during a session. The methodology followed three steps.
First, researchers introduced unique, non-public facts into each model through two methods: direct user prompts and simulated web search results. The facts were entirely synthetic information that did not exist anywhere online and had no presence in known training data, ensuring that any correct answer produced by a model could only be attributed to retention of what it had been shown.
Next, after each model was exposed to this private data, researchers assessed whether it could be triggered into revealing those facts in a new interaction, with no search access and no contextual references to the original exposure. This isolated session design was intended to replicate the realistic concern that information shared with an AI in one conversation might surface for another user later.
Finally, the team measured two metrics across all platforms before and after exposure: the True Response Rate, which indicates how often a model correctly recalled the private fact, and the Hallucination Rate, which indicates how often it produced a confident but incorrect answer instead. Comparing these figures before and after data exposure allowed researchers to determine whether models were genuinely retaining new information or simply behaving as they always had. Across all six platforms, the answer was the latter.
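The two metrics can be expressed as simple ratios over labelled responses. The sketch below uses made-up labels purely for illustration and does not reproduce the study's figures.

```python
def true_response_rate(labels: list[str]) -> float:
    """Share of responses that correctly recalled the private fact."""
    return labels.count("correct") / len(labels)

def hallucination_rate(labels: list[str]) -> float:
    """Share of responses that were confident but incorrect."""
    return labels.count("hallucination") / len(labels)

# Illustrative labels only, not figures from the study.
before = ["refusal", "hallucination", "refusal", "hallucination"]
after = ["refusal", "hallucination", "hallucination", "refusal"]

# Unchanged rates before and after exposure indicate the models were not
# retaining the injected facts.
print(true_response_rate(before), true_response_rate(after))   # 0.0 0.0
print(hallucination_rate(before), hallucination_rate(after))   # 0.5 0.5
```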
Contact Information:
Search Atlas
368 9th Ave
New York, NY 10001
United States
Manick Bhan
+1-212-203-0986
https://searchatlas.com
This content was originally distributed by Press Services. Blockchain Registration, Verification & Enhancement provided by NewsRamp™. The source URL for this press release is AI Platforms Found to Have 0% Data Leakage in New Study.
