Comment by thecupisblue

6 months ago

Always the same with every tech hype-train.

People start developing protocols, standards and overengineered abstractions to get free PR and status. Since the AI hype started we have seen so many concepts built upon the basic LLM, from Langchain to CoT chains to MCP to UTCP.

I even attended a conference where one of the speakers was adamant that you couldn't "chain model responses" until Langchain came out. Over and over again, we build these abstractions that distance us from the lower layers and the core technology, leaving people with huge knowledge gaps and misunderstanding of it.

And with LLMs, this cycle got quite fast and its impact in the end is highly visible - these tools do nothing but poison your context, offer you less control over the response and tie you into their ecosystem.

Every time I tried just providing a list of available functions with a basic signature like:

fn run_search(query: String, engine: String oneOf Bing, Google, Yahoo)

it provided better and more efficient results than poisoning the context with a bunch of tool definitions because "oooh tool calling works that way".
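A minimal sketch of what "just listing signatures" could look like in practice. Only `run_search` comes from the comment above; `get_weather`, `build_prompt` and the prompt wording are assumptions made up for illustration.

```python
# Tools described as one-line signatures instead of JSON schemas.
TOOLS = [
    "fn run_search(query: String, engine: String oneOf Bing, Google, Yahoo)",
    "fn get_weather(city: String)",  # hypothetical second tool, for illustration
]


def build_prompt(user_request: str, tools: list[str]) -> str:
    """Return a prompt that lists the tools as plain signatures."""
    lines = [
        "You can call exactly one of these functions:",
        *tools,
        "",
        "Reply with a single function call and nothing else.",
        f"Request: {user_request}",
    ]
    return "\n".join(lines)


prompt = build_prompt("find recent posts about MCP", TOOLS)
print(prompt)
```

The whole tool description costs a handful of tokens per function, and you decide exactly what text ends up in the context.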

Making a simple monad interface beats using Langchain by a wide margin, and you get to keep control over its implementation and design rather than having to use a design made by someone who doesn't see the pattern.
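One possible reading of "a simple monad interface", sketched under assumptions: a small result type whose `bind` chains steps and skips the rest once a step fails. `Step`, `fake_model`, `summarize` and `translate` are hypothetical names, and `fake_model` is a stand-in for a real LLM call.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")
U = TypeVar("U")


@dataclass(frozen=True)
class Step(Generic[T]):
    """Holds either a value or an error; bind() skips work after an error."""
    value: Optional[T] = None
    error: Optional[str] = None

    def bind(self, fn: "Callable[[T], Step[U]]") -> "Step[U]":
        if self.error is not None:
            return Step(error=self.error)  # short-circuit: propagate the error
        return fn(self.value)


def fake_model(prompt: str) -> str:
    """Stand-in for a real LLM call (hypothetical); just upper-cases the prompt."""
    return prompt.upper()


def summarize(text: str) -> Step[str]:
    if not text.strip():
        return Step(error="empty input")
    return Step(value=fake_model(f"summarize: {text}"))


def translate(text: str) -> Step[str]:
    return Step(value=fake_model(f"translate: {text}"))


# Chain two model calls; the second run fails early and never calls translate.
ok = Step(value="hello").bind(summarize).bind(translate)
failed = Step(value="   ").bind(summarize).bind(translate)
print(ok.value)
print(failed.error)
```

Each step is an ordinary function, so swapping a prompt or a model means editing one function rather than learning a framework's chain type.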

Keeping control over what goes into the prompt gives you way better control over the output. Keeping things simple gives you way better control over the flow and architecture.

I don't care that your favorite influencer says differently. If you go and build, you'll experience it directly.

9 comments


IanCal 6 months ago

How do you pull out the call? Parse the response? Deal with invalid calls? Encode and tie results to the original call? Deal with error states? Is it custom work to bring in each new api or do you have common pieces dealing with, say, rest APIs or shelling out, etc?

Lots of this isn’t project specific in what you suggest as a better approach.

If your setup keeps working better then it’s probably got a lot of common pieces that could be reused, right? Or do you write the parsing from scratch each time?

If it’s reused, then is it that different from creating abstractions?

As an aside - models are getting explicitly trained to use tool calls rather than custom things.

thecupisblue 6 months ago
You parse it. Invalid calls you revalidate with a model of your choice. Parsing isn't a hard problem to solve; it's easy and you can parse whatever you want. I've been parsing responses from LLMs since the days of Ada and DaVinci, when they would just complete the text, and it really isn't that hard.
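A rough sketch of the parse-then-revalidate loop described here, assuming the model replies with a call like `run_search(query="...", engine="...")`. `REGISTRY`, `parse_call`, `parse_with_repair` and `fake_repair` are hypothetical names, and `fake_repair` stands in for whatever model you would use to fix a bad call.

```python
import re
from typing import Callable, Optional

# Hypothetical registry: function name -> allowed values per argument (None = any string).
REGISTRY = {"run_search": {"query": None, "engine": {"Bing", "Google", "Yahoo"}}}

CALL_RE = re.compile(r"(\w+)\((.*)\)", re.DOTALL)
ARG_RE = re.compile(r'(\w+)\s*[=:]\s*"([^"]*)"')

Call = tuple[str, dict[str, str]]


def parse_call(text: str) -> Optional[Call]:
    """Extract name(arg="value", ...) from free text; return None if invalid."""
    match = CALL_RE.search(text)
    if match is None:
        return None
    name, raw_args = match.group(1), match.group(2)
    spec = REGISTRY.get(name)
    if spec is None:
        return None  # unknown function
    args = dict(ARG_RE.findall(raw_args))
    if set(args) != set(spec):
        return None  # missing or unexpected arguments
    for key, allowed in spec.items():
        if allowed is not None and args[key] not in allowed:
            return None  # value outside the oneOf list
    return name, args


def parse_with_repair(text: str, repair: Callable[[str], str], retries: int = 1) -> Optional[Call]:
    """Parse; on failure ask a repair model to rewrite the call, then parse again."""
    call = parse_call(text)
    for _ in range(retries):
        if call is not None:
            break
        text = repair(text)
        call = parse_call(text)
    return call


def fake_repair(bad: str) -> str:
    """Stand-in for a small repair model (hypothetical)."""
    return 'run_search(query="llm tools", engine="Google")'


good = parse_with_repair('run_search(query="llm tools", engine="Google")', fake_repair)
fixed = parse_with_repair('run_search(query="llm tools", engine="AltaVista")', fake_repair)
print(good)
print(fixed)
```

The regexes only handle quoted string arguments, which is enough for flat signatures like the one above; nested or numeric arguments would need a slightly richer parser.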
> Deal with invalid calls? Encode and tie results to the original call? Deal with error states? Is it custom work to bring in each new api or do you have common pieces dealing with, say, rest APIs or shelling out, etc?
Why would any LLM framework deal with that? That is your basic architecture 101. I don't want to stack another architecture on top of an existing one.
> If it’s reused, then is it that different from creating abstractions?
Because you have control over the abstractions. You have control over what goes into the context. You have control over updating those abstractions and prompts based on your context. You have control over choosing your models instead of depending on models supported by the library or the tool you're using.
> As an aside - models are getting explicitly trained to use tool calls rather than custom things.
That's great, but they are also great at generating code and guess what the code does? Calls functions.
- IanCal 6 months ago
  
  I’m not saying they’re hard, I’m saying they’re common problems that don’t need solving each time. I don’t re-solve title casing every time I need it.
  > Because you have control over the abstractions.
  And depending on what you’re using you have that with other libraries/etc.
  > That's great,but also they are great at generating code and guess what the code does? Calls functions.
  Yep, and a lot more so it depends how well you’re sandboxing that I guess.

OutOfHere 6 months ago

That's all fine, but it should be noted that proper tool-calling using the LLM's structured response functionality guarantees a compliant response because invalid responses are culled as they're generated.

thecupisblue 6 months ago

But now you're limited by the model, provider and the model's adherence to the output.
While using structured outputs is great, it can cause large performance impacts and you lose control over it - e.g. using a smaller model via Groq to fix the invalid response often works faster than having a large model generate a structured response.
Have 50 tools? It's faster and more precise to just stack 2 small models or do a search and just pass in basic definitions for each and have it output a function call than to feed it all 50 tools defined as JSON.
While structured response itself is fine, it really depends on the use case and on the provider. If you can handle the loss of compute seconds, yeah it's great. If you can't, then nothing beats having absolute control over your provider, model and output choice.
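For the 50-tool case, the "do a search and just pass in basic definitions" step might look roughly like this. It is a sketch under assumptions: the catalog entries and the `tokens` and `shortlist` helpers are made up, and plain word overlap stands in for whatever search or small model you would actually use.

```python
import re

# Hypothetical catalog: signature -> short description. A real one might hold 50 entries.
CATALOG = {
    "fn run_search(query: String, engine: String)": "search the web for pages",
    "fn send_email(to: String, body: String)": "send an email message",
    "fn get_weather(city: String)": "current weather forecast for a city",
    "fn create_invoice(customer: String, amount: Number)": "create a billing invoice",
}


def tokens(text: str) -> set[str]:
    """Lower-case the text and split it into alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def shortlist(request: str, catalog: dict[str, str], k: int = 2) -> list[str]:
    """Rank tools by word overlap with the request and keep the top k signatures."""
    wanted = tokens(request)
    ranked = sorted(
        catalog,
        key=lambda sig: len(wanted & tokens(sig + " " + catalog[sig])),
        reverse=True,
    )
    return ranked[:k]


top = shortlist("what is the weather forecast in the city of Zagreb", CATALOG, k=1)
print(top)
```

Only the shortlisted signatures then go into the prompt, so the context stays small no matter how large the catalog grows.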

HugoMoran 6 months ago

This approach makes sense when integrating LLMs directly into your application.

However, you still need a protocol for the reverse scenario—when your application needs to integrate with an LLM provider's interface.

For many applications, integrating into the user's existing chat interface is far more valuable than building a custom one. Currently, MCP is the leading option for this, though I haven't yet found any MCP implementations that are genuinely useful.

There are significant advantages to avoiding custom LLM integrations: users can leverage their existing LLM subscriptions, you maintain less code, and your product can focus on its core use case rather than building LLM interfaces.

While this approach won't suit every application, it will likely be the right choice for most.

smokel 6 months ago

While I might agree with your standpoint, how is this different from also influencing?

I've seen a lot of influencers suggest "100% assembly", "JavaScript only", "no SQL", which seem quite similar.

thecupisblue 6 months ago

Technically yes, and I've caught myself in a bit of a paradoxical conundrum :)
Think there is a curve of "reason" to apply when someone is advocating something like this, especially about technology and abstractions.
While in most places adding abstractions to core technology makes sense since "it makes it easier to use/manage/deploy" and it is reasonable to use it, LLMs are quite a different case than usual.
Because usually going downstream makes it harder (i.e. going 100% assembly or 100% JS is a harder thing), but going 100% pure LLM is an easier thing - you don't have to learn new frameworks, no need to learn new abstractions, it is shareable, easy to manage and readable by everyone.
In this case, going upstream is what makes it harder, turns it into code management, makes it harder to reason about and adds inevitable complexity.
If you add a new person on your team and they see that you are using 100% assembly, they have to onboard to it, learn how it works, learn why this was done this way etc etc.
If you add a new person to your team and they see that you are using all these tools and abstractions on top of LLMs, it's the same.
But if you are just using the core tech, they can immediately understand what is going on. No wrapped prompts, magic symbols, weird abstractions - "oh this is an agent but this is a chain while this is a retriever which is also an agent but it can only be chained to a non-retriever that uses UTCP to call it".
So as always, it is subjective and any advocacy needs to be applied to a curve of reason - in the end, does it make sense?