Comment by sharts
16 hours ago
One might argue that it’s not too different from higher level abstractions when using libraries. You get things done faster, write less code, and the library handles some internal state/memory management for you.
Would one be uneasy about calling a library to do stuff than manually messing around with pointers and malloc()? For some, yes. For others, it’s a bit freeing, as you can do more high-level architecture without getting mired in low-level nuances or context-switching into them.
I see this comparison made constantly and for me it misses the mark.
When you use abstractions you are still deterministically creating something you understand in depth with individual pieces you understand.
When you vibe something you understand only the prompt that started it and whether or not it spits out what you were expecting.
Hence feeling lost when you suddenly lose access to frontier models and take a look at your code for the first time.
I’m not saying that’s necessarily always bad, just that the abstraction argument is wrong.
I think it's more: when I don't have access to a compiler I am useless. It's better to go for a walk than learn assembly. AI agents turn our high-level language into code, with various hints, much like the compiler.
If my compiler "went down" I could still think through the problem I was trying to solve, maybe even work out the code on paper. I could reach a point where I would be fairly confident that I had the problem solved, even though I lacked the ability to actually implement the solution.
If my LLM goes down, I have nothing. I guess I could imagine prompts that might get it to do what I want, but there's no guarantee that those would work once it's available again. No amount of thought on my part will get me any closer to the solution, if I'm relying on the LLM as my "compiler".
If your compiler produced a working executable 20% of the time this would be an apt comparison.
Compilers are deterministic, LLMs are not. They are not "much like".
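A toy sketch of the distinction being drawn here, not a model of any real compiler or LLM: `compile_like`, `sample_token`, and the scores in `logits` are all made up for illustration. The first function is a pure transform, so the same input gives the same output every time. The second draws from a temperature-scaled distribution, so repeated calls can differ unless the temperature is 0 (greedy) or the random seed is fixed.

```python
import math
import random


def compile_like(source: str) -> str:
    """Stand-in for a deterministic tool: same input, same output."""
    return source.upper()


def sample_token(logits: dict[str, float], temperature: float, rng: random.Random) -> str:
    """Toy temperature sampling over hypothetical token scores."""
    if temperature == 0:
        # Greedy decoding: always pick the highest-scoring token.
        return max(logits, key=logits.get)
    # Softmax-style weights; random.choices normalizes them for us.
    weights = [math.exp(score / temperature) for score in logits.values()]
    return rng.choices(list(logits), weights=weights, k=1)[0]


# Hypothetical scores for three candidate tokens.
logits = {"foo": 2.0, "bar": 1.5, "baz": 0.5}

# The deterministic transform never varies between calls.
print(compile_like("int main") == compile_like("int main"))

# Unseeded sampling at temperature 1.0 usually yields several distinct tokens.
rng = random.Random()
print({sample_token(logits, 1.0, rng) for _ in range(200)})
```

Greedy decoding or a fixed seed makes the toy sampler repeatable, so this only illustrates sampling randomness. It says nothing about whether a hosted model is reproducible in practice.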
Still misses the mark. You aren’t useless in the same way because you are still in control of reasoning about the exact code even if you never actually write it.
The difference is that there is a company that can easily take your agents away from you.
Installed on your machine vs. cloud service that's struggling to maintain capacity is an unfair comparison...
> you are still deterministically creating something you understand in depth with individual pieces you understand
You’re overestimating determinism. In practice most of our code is written such that it works most of the time. This is why we have bugs in the best and most critical software.
I used to think that being able to write a deterministic hello world app translates to writing a deterministic larger system. It’s not true. Humans make mistakes. From an executive’s point of view you have humans who make mistakes and agents who make mistakes.
Self-driving cars don’t need to be perfect; they just need to make fewer mistakes.
Bugs are not non-determinism. There’s a huge difference between writing buggy code and having no idea what the code even looks like.
"When you use abstractions you are still deterministically creating something you understand in depth with individual pieces you understand."
I always thought the point of abstraction is that you can black-box it via an interface. Understanding it "in depth" is a distraction or obstacle to successful abstraction.
> When you use abstractions you are still deterministically creating something you understand in depth with individual pieces you understand
Hard disagree on that second part. Take something like using a library to make an HTTP call. I think there are plenty of engineers who have no more than a cursory understanding of what's actually going on under the hood.
It might just be social. When I use the open source HTTP library, much of the reason I use it is because someone has put in the work of making sure it actually works across a diverse set of software and hardware platforms, catching common dumb off-by-one errors, etc.
Sure, the LLM theoretically can write perfect code. Just like you could theoretically write perfect code. In real life though, maintenance is a huge issue.
Perhaps then, the better analogy is like being promoted at your company and having people under you doing the grunt work.
How closely you micromanage it is a factor as well, though.
This is how I’ve come to think of it. Delegation of the details.
It seems like some kind of technique is needed that maximizes information transfer between huge LLM generated codebases and a human trying to make sense of them. Something beyond just deep diving into the codebase with no documentation.
There's a false dichotomy here between 'deterministic creation' and 'vibing'.
I use Claude all day. It has written, under my close supervision¹, the majority of my new web app. As a result I estimate the process took 10x less time than had I not used Claude, and I estimate the code to be 5x better quality (as I am a frankly mediocre developer).
But I understand what the code does. It's just Astro and TypeScript. It's not magic. I understand the entire thing; not just 'the prompt that started it'.
¹I never fire-and-forget. I prompt-and-watch. Opus 4.7 still needs to be monitored.
In what world do developers “understand” pieces like React, Pandas, or CUDA? Developers only have a superficial understanding of the tools they are developing with.
Some developers do. I usually end up fixing bugs in the OSS I use.
A library is deterministic.
LLMs are not.
That we let a generation of software developers rot their brains on JS frameworks is finally coming back to bite us.
We can build infinite towers of abstraction on top of computers because they always give the same results.
LLMs by comparison will always give different results. I've seen it first hand when a $50,000 LLM generated (but human guided) code base just stops working and no one has any idea why or how to fix it.
Hope your business didn't depend on that.
Why would that necessarily happen? With an LLM you have perfect knowledge of the code. At any time you can understand any part of your code by simply asking the LLM to explain it. It is one of the super powers of the tools. They also accelerate debugging by allowing you to have comprehensive logging. With that logging the LLM can track down the source of problems. You should try it.
> With an LLM you have perfect knowledge of the code. At any time you can understand any part of your code by simply asking the LLM to explain it.
The LLM will give you an explanation but it may not be accurate. LLMs are less reliable at remembering what they did or why than human programmers (who are hardly 100% reliable).
Determinism is a smaller point than existence of a spec IMHO. A library has a specification one can rely on to understand what it does and how it will behave.
An LLM does not.
The thing is, it's possible to ask the LLM to add dynamic tracing, logging, metrics, a debug REPL, whatever you want to instrument your codebase with. You have to know to want that, and where it's appropriate to use. You still have to (with AI assistance) wire that all up so that it's visible, and you have to be able to interpret it.
If you didn't ask for traceability, if you didn't guide the actual creation and just glommed spaghetti on top of sauce until you got semi-functional results, that was $50k badly spent.
And if that had been done the $50k code base would be a $5,000,000 code base because the context would be 10 times as large and LLMs are quadratic.
If only we taught developers under 40 what x^2 meant instead of React.
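The arithmetic behind that figure, as a rough sketch: it takes the commenter's premise at face value, namely that cost grows with the square of context length (the pairwise term in self-attention) and nothing else. The function name `scaled_cost` and its parameters are hypothetical, and real pricing also involves linear terms, caching, and per-token billing, so treat this as the claim restated rather than a cost model.

```python
def scaled_cost(base_cost: float, context_multiplier: float, exponent: float = 2.0) -> float:
    """Scale a base cost by (context growth) ** exponent.

    exponent=2.0 encodes the assumed purely quadratic scaling;
    exponent=1.0 would be purely linear scaling.
    """
    return base_cost * context_multiplier ** exponent


# 10x the context under the quadratic assumption: 10^2 = 100x the cost.
print(scaled_cost(50_000, 10))       # 5000000.0
# The same growth under a linear assumption, for comparison.
print(scaled_cost(50_000, 10, 1.0))  # 500000.0
```

Under the quadratic assumption, $50,000 times 10^2 is the $5,000,000 quoted above. Under a linear one it would be $500,000.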
Libraries are not deterministic. CPUs aren’t deterministic. There are margins of error among all things.
The fact that people who claim to be software developers (let alone “engineers”) say this thing as if it is a fundamental truism is one of the most maladaptive examples of motivated reasoning I have ever had the misfortune of coming across.
I would argue it couldn't be more different. I can dive into the source code of any library, inspect it. I can assess how reliable a library is and how popular. Bugs aside, libraries are deterministic. I don't see why this parallel keeps getting made over and over again.
I can dive into the source code of LLM generated code too. Indeed it is better because you have tools to document it better than a library that you use.
> Would one be uneasy about calling a library to do stuff than manually messing around with pointers and malloc()?
The irony is that the never-ending stream of vulnerabilities in 3rd-party dependencies (and lately supply-chain attacks) increasingly shows that we should be uneasy.
We could never quite answer the question about who is responsible for 3rd-party code that's deployed inside an application: Not the 3rd-party developer, because they have no access to the application. But not the application developer either, because not having to review the library code is the whole point.
> because not having to review the library code is the whole point.
That’s just not true at bigger companies that actually care about security rather than pretending to care about security. At my current and last employer, someone needs to review the code before using third-party code. The review is probably not enough to catch subtle bugs like those in the Underhanded C Contest, but at least a general architecture of the library is understood. Oh, and it helps that the two companies were both founded in the twentieth century. Modern startups aren’t the same.
I feel like big / old companies thrive on process and are bogged down in bureaucracy.
Sure, there is a process to get a library approved, and that abstraction makes you feel better, but the guy whose job it is to approve is not going to spend an entire day reviewing a lib. The abstraction hides what is essentially an "LGTM"; it's just that it takes a week for someone to check it off their Outlook todos.
Maybe your experience is different.
I think it's not too different in that specific sense, but it's more than that. To put libraries on equal footing, imagine they were cloud-only and had usage limits.
I'm also somewhat addicted to this stuff, and so for me it's high priority to evaluate open models I can run on my own hardware.
I hate this comparison because you're comparing a well defined deterministic interface with LLM output, which is the exact opposite.
A library doesn't randomly drop out of existence because of "high load" or whatever and limit you to some number of function calls per day. With local models there's no issue, but this API shit is cancer personified. When you combine all the frontend bugs with the flaky backend, rate limits, and random bans, it's almost a literal lootbox where you might get a reply back or you might get told to fuck off.
Qwen has become a useful fallback but it's still not quite enough.