Comment by zbentley
2 hours ago
That's not even slightly the same thing.
A buffer overflow has nothing to do with differentiating a command from data; it has to do with mishandling commands or data. An overflow-equivalent LLM misbehavior would be something more like ... I don't know, losing the context, providing answers to a different/unrelated prompt, or (very charitably/guessing here) leaking the system prompt, I guess?
Also, buffer overflows are programmatic issues (once you fix a buffer overflow, it's gone forever if the system doesn't change), not an operational characteristics (if you make an LLM really good at telling commands apart from data, it can still fail--just like if you make an AC distributed system really good at partition tolerance, it can still fail).
A better example would be SQL injection--a classical failure to separate commands from data. But that, too, is a programmatic issue and not an operational characteristic. "Human programmers make this mistake all the time" does not make something an operational characteristic of the software those programmers create; it just makes it a common mistake.
You are arguing semantics that don't address the underlying issue of data vs. command.
While I agree that SQL injection might be the technically better analogy, not looking at LLMs as a coding platform is a mistake. That is exactly how many people use them. Literally every product with "agentic" in the title is using the LLM as a coding platform where the command layer is ambiguous.
Focusing on the precise definition of a buffer overflow feels like picking nits when the reality is that we are mixing instruction and data in the same context window.
To make the analogy concrete: We are currently running LLMs in a way that mimics a machine where code and data share the same memory (context).
What we need is the equivalent of an nx bit for the context window. We need a structural way to mark a section of tokens as "read only". Until we have that architectural separation, treating this as a simple bug to be patched is underestimating the problem.
> the reality is that we are mixing instruction and data in the same context window.
Absolutely.
But the history of code/data confusion attacks that you alluded to in GP isn’t an apples-to-apples comparison to the code/data confusion risks that LLMs are susceptible to.
Historical issues related to code/data confusion were almost entirely programmatic errors, not operational characteristics. Those need to be considered as qualitatively different problems in order to address them. The nitpicking around buffer overflows was meant to highlight that point.
Programmatic errors can be prevented by proactive prevention (e.g. sanitizers, programmer discipline), and addressing an error can resolve it permanently. Operational characteristics cannot be proactively prevented and require a different approach to de-risk.
Put another way: you can fully prevent a buffer overflow by using bounds checking on the buffer. You can fully prevent a SQL injection by using query parameters. You cannot prevent system crashes due to external power loss or hardware failure. You can reduce the chance of those things happening, but when it comes to building a system to deal with them you have to think in terms of mitigation in the event of an inevitable failure, not prevention or permanent remediation of a given failure mode. Power loss risk is thus an operational characteristic to be worked around, not a class of programmatic error which can be resolved or prevented.
LLMs’ code/data confusion, given current model architecture, is in the latter category.