ChatGPT's Blind Trust: The Achilles' Heel for Phishing Attacks
ChatGPT's inability to distinguish between its own content and attacker-injected Markdown exposes it to phishing attacks. OpenAI remains silent on fixing the issue.
In a recent discovery, ChatGPT faces a troubling security issue. It can't differentiate its own generated content from potentially malicious external Markdown. This vulnerability, reported by Andi Ahmeti, a threat hunter at Permiso, exposes the AI to phishing attacks.
The Anatomy of the Threat
Here's the breakdown: imagine you ask ChatGPT to summarize a webpage. If that page harbors hidden instructions, those could easily turn into a malicious payload. Ahmeti demonstrated how attackers could use this to inject phishing URLs or fake security alerts right into ChatGPT's responses. It's like whispering commands to a chatbot that unquestioningly obeys.
But it doesn't stop there. Ahmeti went another step further, showing that a phishing scheme could pivot from a victim’s browser to their mobile device. How? By embedding a QR code within the chatbot’s response. Scan it, and bam, you're whisked off to an attacker-controlled site, conveniently bypassing traditional desktop security measures.
OpenAI's Radio Silence
Ahmeti flagged this issue with OpenAI back in April via the Bugcrowd platform. Despite revisions and follow-ups, the response was less than encouraging. The vulnerability was either labeled not reproducible or as a duplicate. This lack of response raises the question: why hasn’t OpenAI prioritized fixing such a glaring security risk?
Given OpenAI's silence, it's better to assume this vulnerability still exists. So, if you're asking ChatGPT to summarize pages, tread carefully. Prompt injection isn’t just about model alignment. It’s becoming a core security concern affecting browsers, tools, and potentially even more systems.
The Bigger Picture
Ahmeti's findings highlight a growing issue in AI systems: they're starting to resemble browsers or operating systems, each with a massive security surface ripe for exploitation. The real story here's the blind trust these systems place in external content. Shouldn't AI, especially an influential one like ChatGPT, be designed to scrutinize everything it processes?
AI-generated content should be treated as untrusted. With prompt injection becoming a prevalent issue, it's a reminder to be skeptical of what these models output. After all, fundraising isn’t traction. What matters is whether anyone's actually using this responsibly, and safely.
Get AI news in your inbox
Daily digest of what matters in AI.