Comment by quantumHazer
4 days ago
Finally some serious writing about LLMs that doesn’t follow the hype and faces the reality of what can and can’t be useful with these tools.
Really interesting read, although I can’t stand the word “agent” for a for loop that calls an LLM recursively, but this industry is not famous for being sharp at naming things, so here we are.
edit: grammar
I saw a LinkedIn post (I know, I know) talking about how soon agents will replace apps...
Because of course, LLM calls in a for loop don’t count as applications anymore.
It seems like an excellent name, given that people understand it so readily, but what else would you suggest? LoopGPT?
RePT
I’m no better at naming things! Shall we propose LLM feedback loop systems? It’s more grounded in reality. Agent is like Retina Display to my ears, at least at this stage!
Agent is clear in that it acts on behalf of the user.
"LLM feedback loop systems" could be to do with training, customer service, etc.
> Agent is like Retina Display to my ears, at least at this stage!
Retina is a great name. People know what it means: high-quality screens.
A downward spiral
A state machine, or more specifically a Moore machine.
I agree with not liking the author’s definition of an Agent being … “a for loop which contains an LLM call”.
Instead it is an LLM calling tools/resources in a loop. The difference is subtle and a question of what is in charge.
Although implementation-wise it's not wrong to say it's just an LLM call in a loop. If the LLM responds with a tool call, you (the implementor) need to program the call to happen, then loop back and let the LLM continue.
The model/weights themselves do not execute tool calls; the tooling around them has to run the calls and close the loop.
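Roughly, the whole thing looks like this (hand-wavy Python sketch; call_llm and run_tool are hypothetical stand-ins for a real provider SDK and your own tool dispatch):

    # Minimal sketch of "an LLM calling tools in a loop".
    # call_llm() and run_tool() are hypothetical placeholders.
    def agent(task: str) -> str:
        messages = [{"role": "user", "content": task}]
        while True:
            reply = call_llm(messages)        # one model call
            messages.append(reply)
            if not reply.get("tool_calls"):   # no tool requested: done
                return reply["content"]
            for call in reply["tool_calls"]:  # the harness, not the
                result = run_tool(call)       # weights, executes tools
                messages.append({"role": "tool", "content": result})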
I liked the phrase “tools in a loop” for agents. I think Simon said that.
He was quoting someone else. Please take care not to attribute falsely, as it creates a falsehood likely to spread and become the new (un)truth.
You are right. During a “Prompting for Agents” workshop at an Anthropic developer conference, Hannah Moran described agents as “models using tools in a loop.”
I actually take some minor issue with OP's definition of an agent. IMO an agent isn't just an LLM in a loop.
IMO the defining feature of an agent is that the LLM's behavior is being constrained or steered by some other logical component. Some of these components are deterministic, while others are themselves ML-powered (including LLMs).
Which is to say, the LLM is being programmed in some way.
For example, prompting the LLM to build and run tests after code edits is a great way to get better performance out of it. But the idea is that you're designing a system where a deterministic layer (your tests) is nudging the LLM to do more useful things.
Likewise many "agentic reasoning" systems deliberately force the LLM to write out a plan before execution. Sometimes these plans can even be validated deterministically, and the LLM forced to regenerate if the plan is no good.
The idea that the LLM is feeding itself isn't inaccurate, but IMO it misses the defining way these systems are useful: they're being intentionally guided along the way by various other components that oversee the LLM's behavior.
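As a sketch of that plan-validation idea (hypothetical helpers, not any particular framework; generate_plan wraps an LLM call and validate_plan is plain deterministic code):

    # A deterministic layer steering the LLM: demand a plan, check it
    # with ordinary code, and regenerate until it passes.
    def plan_with_validation(task: str, max_tries: int = 3) -> list[str]:
        prompt = task
        for _ in range(max_tries):
            plan = generate_plan(prompt)    # probabilistic step
            problems = validate_plan(plan)  # deterministic step
            if not problems:
                return plan
            prompt = task + f"\nYour last plan was rejected: {problems}"
        raise RuntimeError("no valid plan after retries")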
Can you explain the interface between the LLM and the deterministic system? I’m not understanding how a probabilistic machine output can reliably map onto a strict input schema.
It's pretty early days for these kinds of systems, so there's no "one true" architecture that people have settled on. There are two broad variations that I see:
1 - The LLM is in charge and at the top of the stack. The deterministic bits are exposed to the LLM as tools, but you instruct the LLM specifically to use them in a particular way. For example: "Generate this code, and then run the build and tests. Do not proceed with more code generation until build and tests successfully pass. Fix any errors reported at the build and test step before continuing." This mostly works fine, but it is of course subject to the LLM not following instructions reliably (worse as context gets longer).
2 - A deterministic system is at the top, and uses LLMs in an otherwise-scripted program. This potentially works better when the problem the LLM is meant to solve is narrow and well understood. In this case the structure of the system is more like a traditional program, but one that calls out to LLMs as needed to fulfill certain tasks.
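A toy sketch of variation 2 (classify here is a hypothetical wrapper around one LLM call; everything else is ordinary deterministic code):

    # Variation 2: an ordinary script is in charge and uses the LLM
    # as a subroutine for the fuzzy part only.
    def triage_tickets(tickets: list[str]) -> dict[str, list[str]]:
        buckets: dict[str, list[str]] = {"bug": [], "feature": [], "other": []}
        for ticket in tickets:
            # classify() is a hypothetical single LLM call constrained
            # to one of the given labels.
            label = classify(ticket, choices=list(buckets))
            buckets.get(label, buckets["other"]).append(ticket)
        return buckets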
> "I’m not understanding how a probabilistic machine output can reliably map onto a strict input schema."
So there are two tricks to this:
1 - You can actually force the machine output into strict schemas. Basically all of the large model providers now support outputting in defined schemas - heck, Apple just announced their on-device LLM, which can do that as well. If you want the LLM to output in a specified schema with guarantees of correctness, this is trivial to do today (rough sketch at the end of this comment). This is fundamental to tool-calling.
2 - But often you don't actually want to force the LLM into strict schemas. For the coding tool example above where the LLM runs build/tests, it's often much more productive to directly expose stdout/stderr to the LLM. If the program crashed on a test, it's often very productive to just dump the stack trace as plaintext at the LLM, rather than try to coerce the data into a stronger structure and then show it to the LLM.
How much structure vs. freeform is very much domain-specific, but the important realization is that more structure isn't always good.
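As a rough sketch of point 1, validating client-side and retrying (call_llm is a hypothetical stand-in; this uses the third-party jsonschema package, and real providers can additionally constrain decoding server-side):

    import json
    import jsonschema

    PLAN_SCHEMA = {
        "type": "object",
        "properties": {
            "steps": {"type": "array", "items": {"type": "string"}},
        },
        "required": ["steps"],
    }

    def get_structured_plan(prompt: str, max_tries: int = 3) -> dict:
        # Ask for JSON, validate against the schema, retry on failure.
        for _ in range(max_tries):
            raw = call_llm(prompt + "\nReply with JSON only.")  # hypothetical
            try:
                data = json.loads(raw)
                jsonschema.validate(data, PLAN_SCHEMA)
                return data
            except (json.JSONDecodeError, jsonschema.ValidationError):
                continue  # malformed output: regenerate
        raise ValueError("model never produced schema-valid JSON")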
To make this concrete, the flow would be something like:
[LLM generates a bunch of code, in a structured format that your IDE understands and can convert into a diff]
[LLM issues the `build_and_test` tool call at your IDE. Your IDE executes the build and tests.]
[Build and tests (deterministic) complete, IDE returns the output to the LLM. This can be unstructured or structured.]
[LLM does the next thing]
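And the IDE side of that build_and_test tool can be as simple as running the commands and handing back raw logs (sketch only; substitute your project's actual build commands):

    import subprocess

    def build_and_test() -> str:
        # Run the build/tests and return raw stdout/stderr as plain
        # text for the LLM, with no coercion into a stricter structure.
        result = subprocess.run(
            ["make", "test"],  # whatever your project actually uses
            capture_output=True,
            text=True,
        )
        return result.stdout + result.stderr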
> prompting the LLM to build and run tests after code edits
Isn't that done by passing function definitions or "tools" to the LLM?
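For reference, such a tool definition looks roughly like this (illustrative only, OpenAI-style shape shown; other providers differ in detail):

    # An OpenAI-style tool definition the harness passes to the model
    # alongside the conversation. The model can then respond with a
    # call to this tool instead of plain text.
    BUILD_AND_TEST_TOOL = {
        "type": "function",
        "function": {
            "name": "build_and_test",
            "description": "Build the project and run its test suite.",
            "parameters": {
                "type": "object",
                "properties": {},  # this tool takes no arguments
            },
        },
    }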
Thanks for this comment, I totally agree. Not to say this article isn't good; it's great!