A Single Poisoned Document Could Leak ‘Secret’ Data Via ChatGPT

by Bella Baker


The latest generative AI models are not just stand-alone text-generating chatbots—instead, they can easily be hooked up to your data to give personalized answers to your questions. OpenAI’s ChatGPT can be linked to your Gmail inbox, allowed to inspect your GitHub code, or find appointments in your Microsoft calendar. But these connections have the potential to be abused—and researchers have shown it can take just a single “poisoned” document to do so.

New findings from security researchers Michael Bargury and Tamir Ishay Sharbat, revealed at the Black Hat hacker conference in Las Vegas today, show how a weakness in OpenAI’s Connectors allowed sensitive information to be extracted from a Google Drive account using an indirect prompt injection attack. In a demonstration of the attack, dubbed AgentFlayer, Bargury shows how it was possible to extract developer secrets, in the form of API keys, that were stored in a demonstration Drive account.

The vulnerability highlights how connecting AI models to external systems and sharing more data across them increases the potential attack surface for malicious hackers and potentially multiplies the ways where vulnerabilities may be introduced.

“There is nothing the user needs to do to be compromised, and there is nothing the user needs to do for the data to go out,” Bargury, the CTO at security firm Zenity, tells WIRED. “We’ve shown this is completely zero-click; we just need your email, we share the document with you, and that’s it. So yes, this is very, very bad,” Bargury says.

OpenAI did not immediately respond to WIRED’s request for comment about the vulnerability in Connectors. The company introduced Connectors for ChatGPT as a beta feature earlier this year, and its website lists at least 17 different services that can be linked up with its accounts. It says the system allows you to “bring your tools and data into ChatGPT” and “search files, pull live data, and reference content right in the chat.”

Bargury says he reported the findings to OpenAI earlier this year and that the company quickly introduced mitigations to prevent the technique he used to extract data via Connectors. The way the attack works means only a limited amount of data could be extracted at once—full documents could not be removed as part of the attack.

“While this issue isn’t specific to Google, it illustrates why developing robust protections against prompt injection attacks is important,” says Andy Wen, senior director of security product management at Google Workspace, pointing to the company’s recently enhanced AI security measures.



Source link

Related Posts

Leave a Comment