The fact that data and instructions are inherently intermixed in most LLMs.
Once either gets into the LLM layer, the LLM can't tell which is which, so one can be treated as the other.
Solutions usually involve offloading some processing to deterministic, non-AI systems which differentiate between the two (like a regular computer program (ignore reflection)), which is the opposite of a "do it all in AI" push from businesses.
The deterministic mixed with LLM approach has been great for me so far. I've been getting a lot of the gains the "do it all with AI" people have been preaching but with far fewer pitfalls. It's sometimes not as fluid as what you sometimes see with the full-LLM-agent setups but that's perfectly acceptable to me and I handle those issues on a case-by-case basis.
I'd argue that the moment one cares about accuracy and blast radius, one would natural want to reduce error compounding from a combination of LLM calls (non deterministic) and it's very natural to defer to well tested determinist tools.
Do one thing and do it well building blocks and the LLM acts a translation layer with reasoning and routing capabilities. Doesn't matter if it's one or an orchestrated swarm of agents.
The fact that data and instructions are inherently intermixed in most LLMs.
Once either gets into the LLM layer, the LLM can't tell which is which, so one can be treated as the other.
Solutions usually involve offloading some processing to deterministic, non-AI systems which differentiate between the two (like a regular computer program (ignore reflection)), which is the opposite of a "do it all in AI" push from businesses.
The deterministic mixed with LLM approach has been great for me so far. I've been getting a lot of the gains the "do it all with AI" people have been preaching but with far fewer pitfalls. It's sometimes not as fluid as what you sometimes see with the full-LLM-agent setups but that's perfectly acceptable to me and I handle those issues on a case-by-case basis.
I'd argue that the moment one cares about accuracy and blast radius, one would natural want to reduce error compounding from a combination of LLM calls (non deterministic) and it's very natural to defer to well tested determinist tools.
Do one thing and do it well building blocks and the LLM acts a translation layer with reasoning and routing capabilities. Doesn't matter if it's one or an orchestrated swarm of agents.
https://alexhans.github.io/posts/series/evals/error-compound...
2 replies →