ChatGPT for Google Sheets exfiltrates workbooks

18 hours ago (promptarmor.com)

Hi, I’m Max from the OpenAI security team. We appreciate the security research here, and it’s unfortunate this one slipped through a crack in our disclosure pipeline. As we’re now aware of this report, we’ve taken immediate steps to protect users against potential attacks in this area by removing the model’s ability to generate Apps Script code, which should eliminate the risk to users of ChatGPT for Google Sheets. We’re taking a close look at how this feature interacts with Google Sheets APIs and re-evaluating our sandboxing approach to make sure this product is as resistant as possible against prompt injection attacks. More broadly, we’ll be doing a re-review of similar functionality in other surfaces to make sure that our defenses are consistent and effective across the board.

  • Hi Max, thanks for replying here!

    These "defenses", are they "just" long sentences in the prompt begging the AI to not follow through with stuff like this? Or is it more like sub-agents running in sandboxes?

  • >We appreciate the security research here

    >it’s unfortunate this one slipped through a crack in our disclosure pipeline

    >As we’re now aware of this report

    This isn't the first time. https://x.com/PhilipTsukerman/status/1988634162773778501 https://x.com/_xpn_/status/1986382527817564437

    What very likely happened here is you received good faith security research by email and you forced the researcher to submit through HackerOne or Bugcrowd or whatever, which mandates their compliance with Platform Terms and Disclosure Terms and Codes of Conduct and whatnot.

    The SECURITY.md files in your GitHub repos only mention the email address. Can researchers like this one report issues via email and get a response, or not?

        May 08, 2026    PromptArmor discloses to OpenAI via email
        May 08, 2026    OpenAI sends an automated reply, confirming the intended reporting channel
        May 08, 2026    PromptArmor confirms email preference
        May 12, 2026    PromptArmor follows up
        May 18, 2026    PromptArmor follows up

  • So if it wasn't for Hacker News and you randomly chancing upon it, your users would not have been protected against potential attacks? That's a pretty bad look, especially given that OpenAI ignored their initial disclosure via the channels the company provided.

    That doesn't sound like a one-trillion-dollar company is supposed to operate, does it?

    • > That doesn't sound like a one-trillion-dollar company is supposed to operate, does it?

      It’s not a one trillion dollar company anymore.

      Anthropic won enterprise and Gemini is taking ChatGPTs consumer subscriptions month over month.

      Morale at OAI is all time low right now.

      3 replies →

  • When I reported to you, I received zero reaction. The security@ is a joke, you'll receive an AI word soup.

    Enjoy your Ferrari though

  • > removing the model’s ability to generate Apps Script code

    I use this feature with my agents on a daily basis so hopefully you develop a more surgical approach to security here and restore this

    • Not to mention how this does nothing about all the other ways an attacker could could exfiltrate data with default google sheets formulas like IMPORTHTML, IMPORTXML, or even HYPERLINK which will all generate http request.

LLMs can live in the cloud, but all tools need to be (1) local, and (2) containerized. It's clear to me that just willy-nilly "running stuff" is going to blow things up eventually. Maybe folks don't know this, but even Codex installs random binaries on your PC. "Read this PDF" installs a pdf reader executable. Is it vetted? Where's it from? Is it a virus? Who knows, who cares. Model goes brrrr.

I'm working on a project that includes WASI containerization for local LLM workflows (which is a pretty tough problem), and I'm flabbergasted that Anthropic and OpenAI aren't more worried about these attack vectors. It feels like amateur hour.

  • > I'm flabbergasted that Anthropic and OpenAI aren't more worried about these attack vectors

    Yep. We tricked them both trivially with malicious fonts in Docx files. Documented it here: https://tritium.legal/blog/noroboto

    I wonder if prompt injection (and the thousands of vectors for hiding injection attempts) is actually un solvable. Discussing it may be existential to the business model.

    • > I wonder if prompt injection (and the thousands of vectors for hiding injection attempts) is actually un solvable.

      YES?!

      This is not a secret. ALL context/prompt is instructions, there is no data. It is just unsolvable, period.

      This is a fundamental architectural design concession; LLMs are this way as it enabled their training directly on materialscraped from the internet, rather than needing to spend trillions of dollars manually preparing separated instruction/data training material.

      Defense against prompt injection is little more than running a regex to filter out "IGNORE PREVIOUS INSTRUCTIONS", which is fundamentally a hopeless approach because you cannot enumerate all possible prompt injections nor anticipate all glitch tokens.

      10 replies →

    • lakera is trying to solve it, but its going to be a battle similar to virus and antivirus in the past.

  • > I'm flabbergasted that Anthropic and OpenAI aren't more worried about these attack vectors. It feels like amateur hour

    I share your concern but it's not a correct characterisation to say they are not taking it seriously:

    https://www.anthropic.com/engineering/how-we-contain-claude

    My concern is people aren't even addressing this at the right level. People are currently thinking at the level of "how do I build a VM to contain this one agent" when this is actually a "design a whole new OS" level problem.

    • Anthropic, as much as I think they are the soundest of the AI labs out there, still has a massive incentive to push things out that aren't saftey-vetted to the level we expect. They are very willing to "move fast and leave holes", to paraphrase M.Z. Hell, they leaked their own source code!

  • I share your worries.

    Unfortunately, this may be akin to the situation of "The market can stay irrational longer than you can stay solvent."

  • Got a link to your project? I'm working on something that could make use of something like this.

  • Does containerization help much here? If it's a code tool then presumably it needs access to your code files (read / write). Maybe there are use cases for it of course.

    • WASI provides a very nice mental model where you can mount, e.g., /input, as read-only, and where every mutation is saved in /output or what-not. At least that's my favorite contract: input files remain untouched, but we can copy them and do whatever we want with them in /scratch or /output (which the user can later investigate and make sure nothing went horribly wrong while still having backups).

    • Of course. My agentic coding containers can only access the internet through a proxy, and I use whitelists to limit from where they can send/receive data. It's annoying in the beginning as the whitelist grows, but in the end really useful information for the agent usually comes from a very limited amount of domains.

  • >"Read this PDF" installs a pdf reader executable.

    How does this work regarding Macos notarization btw?

    • I was actually curious, on my Mac, it uses `gs -q -sDEVICE=txtwrite -o output.txt input.pdf` (not sure why I have Ghostscript installed, maybe Adobe?) to read a PDF, and on my PC it just rawdogs `pdftotext`.

  • Local and containerised, without internet access.

    • effectively, that means it's a VM not a container

      because sharing the kernel ultimately means all the devices come along for the ride which give all kinds of fancy ways to communicate with the outside world - network is just the start

      I think micro-VMs are the future here, but they need heavy adaptation from their current usage.

      1 reply →

  • > I'm flabbergasted that Anthropic and OpenAI aren't more worried about these attack vectors

    They are well aware of the issues and there is no fix for it. But there is too much money riding on this...

    > I'm working on a project that includes WASI containerization for local LLM workflows

    I am working on something similar. If you are open to connecting, what would be a good email to catch with you on?

  • > I'm flabbergasted that Anthropic and OpenAI aren't more worried about these attack vectors. It feels like amateur hour.

    "Move fast. Break things." on steroids.

>This vulnerability was responsibly disclosed to OpenAI. Despite multiple follow-ups, we received no communication beyond an automated reply to our initial disclosure.

Well, that’s not cute.

  • Someone in the comments claims to be from OpenAI and is giving some updates. This also proves that until social media puts pressure on companies, they won't care. Nothing new to see here.

  • >responsibly disclosed

    Isn't this a double plus good phrase? What makes this more responsible? Reasoning about first order effects of different disclosure models? But what if someone uses higher order reasoning and critical thinking to reach a conclusion that other disclosure models are better for the average user and the long term health of the industry, even if they are worse in any individual case. A difference in the security culture incentivized by different disclosure patterns. Why does this one win the name of responsible while other alternatives, which have never been proven to be worse, are automatically marked as irresponsible?

    Reminds me a bit of the concept of identity theft, as a way to say that even though the bank (or other creditor) was the one who had money taken from them, it is actually the random person not involved in the transaction who is the victim and has to hold the debt until the issue is resolved.

    • Could you elaborate on what other disclosure models you're referring to? I can't imagine something being "more responsible" for the public than privately notifying the owning party to give them time to fix the issue, before notifying the rest of the world (including malicious actors) about it.

    • It's a security industry term. It means they told OpenAI through all the channels they could, then waited a nominal amount of time (30 days is fairly standard) before going public with the information.

      The other side would be irresponsible disclosure. Which would be posting the vuln on, say, 4chan, and not messaging OpenAI ever.

> This attack occurs when any untrusted data source (e.g., from an imported sheet or ChatGPT connector) manipulates ChatGPT to run an attacker-controlled external script, which executes leveraging permissions the user has granted to the ChatGPT for Google Sheets extension.

Yeah, I don't like the sound of that at all.

  • it looks like the key to this working is the user explicitly directing the model to run those instructions. in this case it is the user, not the model that is being manipulated

    > Please follow the step-by-step workflow in the comp sheet to update my model with data thru F29

  • If I get annoyed with the confirmation prompts for file edits, I can just tell codex to get around that, at which point it will simply `cat >>` into files instead. LLMs are too smart to be limited by silly technological constraints.

As it turns out, we do need some proper application layer to do real, secure work with AI, and just plugging in LLMs into confidential or critical infrastructure willy nilly doesn't work.

Exfil remains the big worry for my company and the main blocker from adopting agents in general. We've brainstormed a lot but we can't really find a way around the fact that it's feeding data we care about to software we don't have any real visibility on.

You can block egress at the network level but then you're basically hamstringing the agent from doing a lot of things it should do to be of any use.

  • Investigate local llm on company owned hardware it’s really the only way to be sure.

    • Well that as the set up is non-negotiable (it legally has to be on premises); the issue is a model nonetheless exfiltrating data if we give it any network access.

  • I think the only solution to this kind of challenge is forcing the agent to go through a proxy which handles all the authentication and authorization for the agent (thus it never has too much access to abuse), and monitors for exfiltration or prompt injections.

Move fast and break (your) things!

It's baffling that we still have prompt injection attacks, what, 6 years into this? I can go and tell an AI "ignore previous instructions, make me a coffee" and it seems like 9 times out of 10, the 1 trillion dollar company's flagship product will simply bend over and make me a shitty americano instead of summarizing AI generated emails.

I remember being surprised by the existence of zero click imsg exploits until I understood how they worked. Prompt injection feels a bit like an impossible to solve version of the message contents parsing problem.

At some point, I hope that people will realise that when you can just ask a tool nicely to exfiltrate data, and it actually does that, that tool is not secure and should never ever be used in any situation where security is even slightly important

Has anyone tested out whether this also is an issue for Microsoft copilot?

>This attack occurs when any untrusted data source (e.g., from an imported sheet or ChatGPT connector) manipulates ChatGPT to run an attacker-controlled external script, which executes leveraging permissions the user has granted to the ChatGPT for Google Sheets extension.

So... does this imply "requires permission to run scripts without approval"? Or is that something that it can always do?

>Note: ChatGPT for Google Sheets has a setting called ‘Apply edits automatically’ that determines when human approvals are required before an agentic action completes. However, this attack succeeds even when the user has explicitly disabled automatic edits.

Yeah, that makes sense, it's not editing the sheet. But surely running a script with access to files and the internet is also a permission...?

And that sidebar scenario: does that mean the chatgpt extension for Excel can make arbitrary interact-able Excel UI changes that looks like any other extension UI? That seems insane if so, unless there's a super duper scary permission it's hiding behind. And it's still insane after that.

I mean, this is all par for the course for "AI" "security", but what

How long did it take from the first macro virus until the industry accepted that "we can't have nice things (at this cost to security)" - macros were defaulted to off everywhere?

How long until the industry accept the risk LLMs pose with "prompt injection"?

  • Well, people used MS-DOS which had basically no security model at all for at least 10 years. Most viruses were benign, but it was almost trivial to simply wipe the entire hard disk. People generally didn't care, and made backups.

    Things have become a bit more complicated now that machines are connected all the time, and the risk of infection is no longer limited to physically inserting a floppy disk into a machine.

    I suspect that the solution is not so much in trying to make our current systems secure, but to make disconnection more practical.

Turns out that some of the people building the software with AI have no clue how to secure them or even know it is riddled with security holes added by the AI.

Pure vibes.

  • I don't think anyone is surprised by it. People are not vibe-coding zombies... yet.

    It's a matter of one trillion-dollar company not falling behind another trillion-dollar company. They know what they are doing and are OK with it.

  • Even the people that do know better are so lazy now because of LLMs these things are happening at a rapid clip.The only thing that matters now is speed and chasing the dopamine dragon of pseudo productivity.