← Back to context

Comment by mhher

4 days ago

The current hype around agentic workflows completely glosses over the fundamental security flaw in their architecture: unconstrained execution boundaries. Tools that eagerly load context and grant monolithic LLMs unrestricted shell access are trivial to compromise via indirect prompt injection.

If an agent is curling untrusted data while holding access to sensitive data or already has sensitive data loaded into its context window, arbitrary code execution isn't a theoretical risk; it's an inevitability.

As recent research on context pollution has shown, stuffing the context window with monolithic system prompts and tool schemas actively degrades the model's baseline reasoning capabilities, making it exponentially more vulnerable to these exact exploits.

I think this is basically obvious to anyone using one of these but they're just they like the utility trade off like sure it may leak and exfiltrate everything somewhere but the utility of these tools is enough where they just deal with that risk.

  • While I understand the premise I think this is a highly flawed way to operate these tools. I wouldn't want to have someone with my personal data (whichever part) that might give it to anyone who just asks nicely because the context window has reached a tipoff point for the models intelligence. The major issue is a prompt attack may have taken place and you will likely never find out.

  • It feels to me there are plenty of people running these because "just trust the AI bro" who are one hallucination away from having their entire bank account emptied.

    • Exactly, I've seen people who bought a Mac Mini and ended up running claw against a claude subscription. Completely misunderstand the point of local models. Plus, there was more hype about running claw way cheaper on Raspberry Pi which cost the stock price of Raspberry maker to skyrocket.

      Some of the comments here show that technical people set these things up for non-technical people, which is just one step away from a misstep. Time will show whether this is similar in behavior to the "I can run it" mindset that people had with local models before. A small dopamine hit to see "it can be done" in order to end up a cloud service in the long run.

Information Flow Control is highly idealistic unless there are global protocol changes across any sort of integration channel to deem trusted vs untrusted.