
Comment by dboreham

2 days ago

Seems odd that the LLM is so clever it can write programs to drive any API. But so dumb that it needs a new special purpose protocol proxy to access anything behind such an API...

It’s about resilience. LLMs are prone to hallucinations. Although they can be very intelligent, they don’t have 100% correct output unaided. The protocol helps increase the resilience of the output so that there’s more of a guarantee that the LLM will stay within the lines you’ve drawn around it.

  • That's really not true. Context is one strategy to keep a model's output constrained, and tool calling allows dynamic updates to context. MCP is a convenience layer around tool calls and the systems they integrate with.

> LLM is so clever it can write programs to drive any API

It is not. Name one piece of software that has an LLM generating code on the fly to call APIs. Why do people have this delusion?

  • Every runtime executing LLMs with support for tools does it, starting with the first update to the ChatGPT app/webapp that made use of the earliest version of "function calling". Even earlier, there were third-party runtimes/apps (including scripts people wrote for themselves) that used OpenAI models via the API with a prompt teaching the LLM a syntax it could use to "shell out", which the runtime would scan for.

    If you tell a model it can use some syntax, e.g. `:: foo(arg1, arg2) ::`, to cause the runtime to call an API, and then, based on the context of the conversation, the model outputs `:: get_current_weather("Poland/Warsaw")`, that is "generating code on the fly to call APIs". How `:: get_current_weather("Poland/Warsaw")` gets turned into a bunch of cURL invocations against e.g. the OpenWeather API is an implementation detail of the runtime.
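    The runtime-side scanning described above can be sketched in a few lines. This is a minimal illustration, not any real product's implementation; the `:: fn(args) ::` syntax, the handler table, and the `get_current_weather` stub are all hypothetical:

```python
import re

# Hypothetical handler the runtime exposes to the model.
# A real runtime would call out to e.g. a weather API here.
def get_current_weather(location: str) -> str:
    return f"weather for {location}: stub result"

HANDLERS = {"get_current_weather": get_current_weather}

# Scan model output for the agreed-upon ":: fn(args) ::" syntax
# (the closing "::" is treated as optional here).
CALL_RE = re.compile(r"::\s*(\w+)\((.*?)\)\s*(?:::)?")

def dispatch(model_output: str) -> list[str]:
    """Find every tool-call expression in the model's text and run it."""
    results = []
    for name, raw_args in CALL_RE.findall(model_output):
        if name in HANDLERS:
            args = [a.strip().strip('"') for a in raw_args.split(",") if a.strip()]
            results.append(HANDLERS[name](*args))
    return results

print(dispatch('Let me check: :: get_current_weather("Poland/Warsaw")'))
```

    Native "function calling" in current model APIs is the same loop, except the model emits structured JSON instead of an ad-hoc text syntax, so the runtime no longer needs to regex-scan free-form output.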

  • This is basically just function calling?

    • No, the person I replied to argued that tool calling and MCP are unnecessary: why not just have the LLM generate arbitrary code on the fly to do anything instead? They think there should be just one tool: eval.

      Surprisingly many people say this. I ask them whether they have seen a non-toy product that works that way, because as far as I know everything in production is tool calling.