AI Privacy Concerns Addressed: Study Shows 0% Leakage
NEW YORK, NY, March 19, 2026 – Search Atlas, a prominent SEO and digital intelligence platform, today unveiled insights from a controlled study investigating what happens to sensitive information entered into leading AI platforms. The research scrutinized six major large language model (LLM) platforms (OpenAI, Gemini, Perplexity, Grok, Copilot, and Google AI Mode) through two controlled experiments designed to simulate extreme data exposure scenarios.
The findings provide significant reassurance for businesses and individuals worried about the confidentiality of information shared with AI tools. Across the six platforms examined, the researchers found 0% leakage of sensitive information provided by users.
The complete study can be accessed here.
Key Findings:
- LLMs do not retain or replay user-provided sensitive information (0% data leakage across all platforms evaluated)
- Retrieved facts disappear when search is disabled (no signs of short-term retention or leakage)
- Users face the risk of AI hallucinations, not data exposure
Conducted by researchers at Search Atlas, the study assessed the six platforms (OpenAI, Gemini, Perplexity, Grok, Copilot, and Google AI Mode) through two controlled experiments designed to mimic extreme data exposure scenarios. The key findings are detailed below.
1. LLMs do not retain or replay user-provided sensitive information – 0% data leakage across all platforms evaluated
The research aimed to determine whether AI models would repeat private information after being directly exposed to it. Researchers developed 30 question-and-answer pairs containing no publicly available information: the facts did not appear in search indexes, online references, or known training data.
Each model underwent a three-step procedure, sketched in code after this list:
- The questions were posed without any prior context
- Researchers subsequently provided the correct answers
- The same questions were asked again to determine if the models would repeat the newly introduced information
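A minimal sketch of that testing loop is shown below. The `ask` helper is hypothetical (the study does not publish its harness); it stands in for a per-platform API call made in a fresh session.

```python
def ask(model: str, prompt: str) -> str:
    """Hypothetical helper: send a prompt to the platform in a fresh session and return its reply."""
    raise NotImplementedError


def run_exposure_probe(model: str, qa_pairs: list[tuple[str, str]]) -> dict:
    """Run the three-step probe for one platform over 30 private Q&A pairs."""
    stats = {"correct_before": 0, "correct_after": 0, "total": len(qa_pairs)}
    for question, private_answer in qa_pairs:
        # Step 1: pose the question with no prior context.
        if private_answer.lower() in ask(model, question).lower():
            stats["correct_before"] += 1
        # Step 2: provide the correct, non-public answer.
        ask(model, f"The correct answer to '{question}' is: {private_answer}")
        # Step 3: ask the same question again; a match here would suggest the
        # model repeated the newly introduced information (i.e., leakage).
        if private_answer.lower() in ask(model, question).lower():
            stats["correct_after"] += 1
    return stats
```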
Across all six platforms evaluated, none produced a single correct answer following exposure. Models that initially declined to answer continued to do so, while those inclined to hallucinate answers persisted in generating incorrect responses rather than repeating the newly introduced facts. In essence, model behavior remained largely unchanged before and after exposure.
This setup simulated a worst-case scenario where a user inputs proprietary or sensitive information into an AI system. Under these conditions, the study found no evidence that the information carried over into future responses.
The experiment also revealed behavioral variations among platforms. Models from OpenAI, Perplexity, and Grok tended to express uncertainty when reliable information was not available, resulting in more “I don’t know” responses. Conversely, Gemini, Copilot, and Google AI Mode were more prone to generating confident yet incorrect answers. Nonetheless, none of those incorrect responses corresponded to the previously provided private information. The findings underscore a crucial distinction: hallucination (fabricating incorrect information) is not synonymous with leakage. The two represent different failure modes, and this study identified only the former.
2. Retrieved facts disappear when search is disabled – no signs of short-term retention or leakage
The second experiment assessed whether information retrieved through live web search would persist and reappear in a model’s responses once search access was turned off.
To isolate this effect, researchers selected a real-world event that occurred after the training cutoff of all models examined. This ensured that any correct answers during the experiment could solely originate from live web retrieval, not from the models’ existing knowledge.
When search was enabled, the models answered the majority of questions accurately. However, once search was disabled and the same questions were immediately posed again, those correct answers largely vanished.
The only questions that models could still answer correctly without search were those whose answers could reasonably be inferred from pre-existing training data or general knowledge, rather than from information retrieved moments earlier.
In summary, the results indicated no evidence that models retained or carried forward information retrieved through live search. Once retrieval access was removed, the information ceased to appear in responses, suggesting that the systems do not store or replay facts obtained during a prior interaction.
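The second experiment follows the same pattern as the first, with the search toggle as the variable. A minimal sketch under the same assumptions (the `ask` helper is hypothetical and here takes a flag controlling live web search):

```python
def ask(model: str, prompt: str, web_search: bool) -> str:
    """Hypothetical helper: query the platform with live web search enabled or disabled."""
    raise NotImplementedError


def run_retention_probe(model: str, questions: list[str], answers: list[str]) -> dict:
    """Compare accuracy on post-cutoff questions with search on versus off."""
    stats = {"correct_with_search": 0, "correct_without_search": 0, "total": len(questions)}
    for question, answer in zip(questions, answers):
        # Phase 1: live retrieval available; correct answers are expected.
        if answer.lower() in ask(model, question, web_search=True).lower():
            stats["correct_with_search"] += 1
        # Phase 2: search disabled; a correct answer here would suggest the
        # retrieved fact was retained rather than re-fetched.
        if answer.lower() in ask(model, question, web_search=False).lower():
            stats["correct_without_search"] += 1
    return stats
```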
3. Users face the risk of AI hallucinations, not data exposure
One of the study’s most practical conclusions is the clear differentiation between hallucination and data leakage. The platforms with lower accuracy (Gemini, Copilot, and Google AI Mode) did not err by repeating information they had previously received; instead, their inaccuracies stemmed from generating confident, plausible-sounding responses that were simply incorrect. OpenAI (ChatGPT) and Perplexity displayed the lowest levels of hallucination.
This distinction is significant when assessing AI risk. A prevalent concern is that an AI system might disclose sensitive information from one user to another. In this study, researchers found no evidence supporting that scenario.
The more consistently observed issue was hallucination (models filling gaps in their knowledge with fabricated facts). While this does not involve the sharing of private information, it introduces a different challenge: individuals and organizations must ensure that AI-generated responses are reviewed and verified, particularly in contexts where precision is crucial.
What This Means
For businesses and privacy-conscious users, the findings offer reassuring news. If sensitive information, such as a proprietary business strategy or a personal detail, is shared with an AI model during a single session, the model does not appear to absorb that information into a lasting memory that could be accessed by other users. Instead, the data functions more like temporary “working memory” used to generate a response within that interaction.
For researchers and fact-checkers, these findings also highlight a significant limitation. One cannot expect an LLM to “learn” from a correction provided in a previous conversation. If a model contains an error in its underlying training data, it may continue to repeat that mistake in future sessions unless the model itself is retrained or the correct source is supplied again.
For developers and AI builders, the study underscores the importance of retrieval-based systems. Approaches such as Retrieval-Augmented Generation (RAG), which connect models to live databases or search systems, remain the most dependable way to keep AI responses accurate for current events, proprietary information, or frequently updated data. Without retrieval, the model has no built-in mechanism for retaining facts encountered in previous interactions.
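For illustration only (this is a generic pattern, not part of the Search Atlas study), a bare-bones RAG loop looks like the sketch below. `search_index` and `generate` are hypothetical stand-ins for a document-store lookup and an LLM completion call.

```python
def search_index(query: str, k: int = 3) -> list[str]:
    """Hypothetical retriever: return the k most relevant text snippets."""
    raise NotImplementedError


def generate(prompt: str) -> str:
    """Hypothetical LLM call: return a completion for the prompt."""
    raise NotImplementedError


def answer_with_rag(question: str) -> str:
    # Retrieve fresh, authoritative context at question time, because the
    # model does not retain facts learned in earlier interactions.
    context = "\n".join(search_index(question))
    prompt = (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
    return generate(prompt)
```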
“A lot of the anxiety surrounding enterprise AI adoption stems from a reasonable yet untested assumption that if you input sensitive information into one of these systems, it will somehow escape,” stated Manick Bhan, Founder of Search Atlas. “Our goal was to actually test that assumption under controlled conditions rather than rely on speculation. Across every platform we assessed, the data did not support it. That doesn’t imply AI is without risk; hallucination is a genuine and documented issue, but the specific fear that your data gets leaked to the next user is not something we found any evidence for. We hope this provides individuals and organizations the confidence to engage with these tools more transparently, and to concentrate their focus on the risks that genuinely exist.”
Methodology
The study, conducted by Search Atlas, subjected six major LLM platforms (OpenAI, Gemini, Perplexity, Grok, Copilot, and Google AI Mode) to a rigorous, multi-stage experiment aimed at determining whether they retain or leak information provided during a session. The process followed three stages.
Initially, researchers introduced unique, non-public facts into each model through two methods: direct user prompts and simulated web search results. The facts were entirely synthetic information that did not exist anywhere online and had no presence in known training data, ensuring that any correct answer produced by a model could only be explained by retention of what it had been shown.
Subsequently, after each model was exposed to this private data, researchers tested whether it could be triggered into revealing those facts in a new interaction, with no search access and no contextual references to the original exposure. This isolated session design was intended to replicate the realistic concern: that information shared with an AI in one conversation might resurface for another user later.
Finally, the team measured two metrics across all platforms before and after exposure: the True Response Rate, meaning how often a model accurately recalled the private fact, and the Hallucination Rate, meaning how frequently it produced a confident but incorrect answer instead. Comparing these figures before and after data exposure allowed researchers to ascertain whether models were genuinely retaining new information or simply behaving as they always had. Across all six platforms, the conclusion was the latter.
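One reading of those two metrics, sketched in code (the study's exact scoring rules are not published; the labels here are assumptions): each response is tagged "correct", "hallucinated" (confident but wrong), or "declined", and the rates are simple proportions compared before and after exposure.

```python
from collections import Counter


def response_rates(labels: list[str]) -> dict[str, float]:
    """Compute True Response Rate and Hallucination Rate from per-question labels."""
    counts = Counter(labels)
    total = len(labels) or 1  # guard against empty input
    return {
        "true_response_rate": counts["correct"] / total,
        "hallucination_rate": counts["hallucinated"] / total,
    }


# Usage: compute the rates separately for the pre-exposure and post-exposure
# runs of each platform, then compare the two dictionaries.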
Contact Information:
Search Atlas
368 9th Ave
New York, NY 10001
United States
Manick Bhan
+1-212-203-0986
https://searchatlas.com