Comment by lionkor
9 hours ago
LLMs are not another abstraction. ALL OTHER LAYERS you named are fully deterministic, understood, debuggable, etc.
You cannot be serious.
A non-deterministic layer seems like exactly what would need a competent professional to ensure a good outcome, so it doesn't follow that LLM usage would depress wages more than high-level languages depressed wages by opening up programming to tens of millions of people who could never grok assembly.
> high-level languages depressed wages by opening up programming to tens of millions of people who could never grok assembly.
While I agree with your thesis, I don't think your example actually happened. From my experience teaching CS, the fundamental skill when it comes to programming a computer is algorithmic reasoning, that is, the ability to split a task into subtasks until you reach the base tools that exist in your toolbox. Whether those base tools are MOV or document.getElementById() is largely immaterial, IMO. Obviously it's quicker and easier for us not to have to drill all the way down to assembly all the time, but if you have a firm grasp of algorithmic reasoning, you are capable of it.
Algorithmic reasoning is a valuable skill across the economy, one of the most valuable, in fact, right after "risk evaluation" and "having a lot of money", but it's difficult to teach. For some people it seems to fit right into their mental model, and it's simply a "duh". For most, though, it either takes a ton of grinding their head against it, or more often they simply bounce off and decide to do something else. I think that is responsible for the relatively limited supply of developers, which, combined with software being wildly profitable because of the whole "copies are free to make" scaling, created a shortage, which drove up wages.
They drove up wages so much that approximately everyone who has any aptitude for algorithmic reasoning is now funneled into software development. I think that, more than anything, contributed to the explosive growth of the number of developers. Adults may be hesitant to jump industries, but subsequent generations flooded in.
Counter-point: most developers have no idea how to actually do that debugging, nor any eagerness to, so it doesn't really matter.
It DOES matter, because the claim that LLMs are a layer of abstraction implies they're somehow more than a random word generator. They do a great job at generating words in the right order, and often, given enough time, datacenter resources, money, and training, they can produce code that runs and does things as expected.
However, there is absolutely nothing stopping an LLM from "deciding" tomorrow that a fix it built a week ago is no longer real, because not only has that fix left its context, but also the bug was not obvious.
> However, there is absolutely nothing stopping an LLM from "deciding" tomorrow that a fix it built a week ago is no longer real, because not only has that fix left its context, but also the bug was not obvious.
Yeah, and we've never had deterministic tools like GCC suddenly fuck up commonly-relied-on undefined behavior between releases. Sure.
I get what you're saying, but again, to the vast majority of devs, none of that shit matters. Whether that's a good thing or a bad thing is a different discussion.
LLMs are one of the most general abstractions possible.
LLMs are also quite deterministic if you want them to be - generally, their final token selection is deliberately randomized (controlled by the model's "temperature"). But the word you're looking for here is probably not actually determinism; it's probably something closer to predictability.
In any case, it’s perfectly possible to ensure that the output of LLMs is fully deterministic, debuggable, understandable, and testable.
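A minimal sketch of the temperature point, with a toy three-token vocabulary (the logits and the `sample_token` function are illustrative, not any real inference API): at temperature 0 the decoder collapses to argmax, so the same logits always produce the same token; above 0, output is only repeatable if you seed the sampler yourself.

```python
import math
import random

def sample_token(logits, temperature=1.0, rng=None):
    # temperature == 0 -> greedy argmax: the same logits always
    # yield the same token index, so decoding is fully deterministic.
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    # temperature > 0 -> softmax sampling: stochastic unless you
    # pass in a seeded generator.
    rng = rng or random
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max before exp for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

toy_logits = [2.0, 0.5, -1.0]
print(sample_token(toy_logits, temperature=0))          # always index 0
print(sample_token(toy_logits, rng=random.Random(42)))  # repeatable with a fixed seed
```

Real serving stacks add other sources of variation (batching, floating-point reduction order), but the sampling step itself is a knob you control.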
> You cannot be serious.
I don’t think you’re thinking about this clearly.
> LLMs are also quite deterministic if you want them to be
In the shallow sense that any PRNG is deterministic if you set the seed and control the order of calls.
However, that's not usually the situation/scope people are talking about.
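That "shallow sense" is easy to make concrete: fix the seed and the call order, and the "random" stream is bit-identical every run.

```python
import random

rng_a = random.Random(1234)
rng_b = random.Random(1234)

# Same seed + same call sequence -> identical "random" streams.
stream_a = [rng_a.random() for _ in range(5)]
stream_b = [rng_b.random() for _ in range(5)]
assert stream_a == stream_b
```

Sampled LLM output is repeatable in the same way only if every other source of variation (batching, floating-point reduction order, model version) is also pinned, which is exactly the part that rarely holds in practice.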
I was just pointing out, in part, that the non-determinism is a choice, but I probably would have needed to go down a whole rabbit hole about exploration of search spaces etc.
My broader point is that it's not really the non-determinism that's an issue. What the other commenter seems to be looking for is something along the lines of repeatable correctness, where correctness is generally a requirement that the model doesn't have full access to. The non-determinism is an implementation detail here.
With a sufficiently complex prompt and a sufficiently complex codebase, LLMs consistently fail and make mistakes, "forget" parts of the prompt, etc.
There's no comparison to be made between this and, for example, a compiler. It's an incompetent comparison.
> I don’t think you’re thinking about this clearly.
My literal job is dealing with layers of abstraction. I'm thinking pretty clearly when I tell you that, not only are LLMs a super leaky, terrible abstraction, they are also not comparable to any other layers of abstraction. All other layers of abstraction we use are well understood, predictable (as you put it), and DEBUGGABLE.
When claude deletes a fix it did two weeks ago, while trying to fix some unrelated error, do you never stop and think "this is not quite the same as what GCC does"?
> With a sufficiently complex prompt and a sufficiently complex codebase ...
With a sufficiently complex specification of a failure mode, you can find problems with anything.
Humans, given sufficiently complex requirements and sufficiently complex codebases, also regularly fail. You're tacitly admitting that LLMs are approaching (if not exceeding) human levels of performance now. We somehow get non-deterministic humans to achieve useful work. In fact, staff provide managers with an abstraction over the work they're responsible for - managers don't know every detail of the systems they're responsible for.
There are effective ways to use LLMs. I recommend using those, not using overly complex prompts, and not letting LLMs freely make changes to large code bases. Just as compilers only compile one source file at a time, LLMs work best if you scope their attention. Same goes for humans, in fact.
> There's no comparison to be made between this and, for example, a compiler.
A simple comparison is that both can generate useful code. You need to be more precise about the issues you're trying to identify.
Anyway, the comparison to compilers isn't really the point. It's undeniable that LLMs are an abstraction themselves, and that they can generate new abstractions. Saying that they're "not another abstraction" is just definitionally wrong.
Sure, they're not the same kind of abstraction as a traditional compiler. They require new ways of working, but actually not that new, as the manager example I gave suggests.
> When claude deletes a fix it did two weeks ago, while trying to fix some unrelated error, do you never stop and think "this is not quite the same as what GCC does"?
I never made the mistake of thinking LLMs were the same as GCC in the first place.
And once again, I've seen human developers do exactly what you just described. That's why we review code. All the arguments you're making are essentially also arguments that humans shouldn't be involved in software development either.