Comment by hintymad

1 month ago

> I don't think that AIs have become more trustworthy, the errors are just more subtle.

Honest question: what about the counter-argument that humans make subtle mistakes all the time, so why do we treat AI any differently?

A difference to me is that when we manually write code, we reason about the code carefully with a purpose. Yes, we do make mistakes, but the mistakes are grounded in a certain range. In contrast, AI-generated code creates errors that do not follow common sense. That said, I don't feel this differentiation is strong enough, and I don't have data to back it up.

One answer, as another person pointed out, is that LLM mistakes are just different. They are less explicable, less predictable, and therefore harder to spot. I can easily anticipate how an inexperienced engineer is going to mess up their first pull request for my project. I have no idea what an LLM might do. Worse, I know it might ace the first fifty pull requests and then make an absolutely mind-boggling mistake in the 51st one.

But another answer is that human autonomy is coupled to responsibility. For most line employees, if they mess up badly enough, it's first and foremost their problem. They get a bad performance review, get fired, or end up in court or even in prison. Because you bear responsibility for your actions, your boss doesn't have to watch what you're up to 24x7. Their career is typically not on the line unless they're deeply complicit in your misbehavior.

LLMs have no meaningful responsibility, so whoever is operating them is ultimately on the hook for what they do. It's a different dynamic. It's probably why most software engineers are not gonna get replaced by robots - your director or VP doesn't want to be liable for an agent that goes haywire - but it's also why the "oh, I have an army of 50 YOLO agents do the work while I'm browsing Reddit" approach is probably not wise for line employees.

  • > I can easily anticipate how an inexperienced engineer is going to mess up their first pull request for my project.

    Isn’t this just because you have seen a lot of PRs from inexperienced engineers? People learn LLM behavior over time, too.

    • I'm pretty sure that I've seen more LLM mistakes than coworker mistakes at this point, and I'm no closer to enlightenment.

Humans can't make mistakes at the sheer scale that AI can.

Yes, as an engineer I make mistakes, but I could never make as many mistakes per day as an LLM can.

  • Obviously, the measure isn’t mistakes per day, it’s mistakes per LOC. And that’s not the whole story either - AI self-corrects in addition to being corrected by the operator. If the operator’s committed bugs/LOC rate is as low as the unaugmented programmer’s bugs/LOC, you always choose the AI operator. If it’s higher, it might still be viable to choose them if you care about velocity more than correctness. I’m a slow, methodical programmer myself, but it’s not clear to me that I have a moat.

This is like having a coworker who's as skilled as you if not more skilled, but also an alien.

Their mental model doesn't map cleanly enough to yours, and so where for a human you'd have some way to follow their thought patterns and identify mistakes, here the alien makes mistakes that don't add up.

Like the alien has encyclopedic knowledge of op codes in some esoteric Soviet MCU but sometimes forgets how to look for a function definition, and says "It looks like the read tool failed, that's ok, I can just make a mock implementation and comment out the test for now."

  • Some of my favorite peer engineers work exactly like that

    People used to like them and they used to be legends (even if not everyone liked them)

    Notch, Woz, Linus and Geohot come to mind

    The Metasploit creator Dean McNamee worked for me and he was just like that and a total monster at engineering hard tech products

    • No they don't because they have brains.

      I have no strong idea why people can't accept that intelligence formed separately of a human brain can truly be alien: not in the hyperbolic sense of "that person is so unique it's like they're a different species", but "that thing does not have a brain, so it can have intelligence that is not human-like".

      A human without a brain would die. An LLM doesn't have a brain and can do wondrous things.

      It just does them in ways that require first accepting that no Homo sapiens thinks like an LLM.

      We trained it on human language, so oftentimes it borrows our thought traces, so to speak, but effective agentic systems form when you first erase your preconceived notions of how intelligence works and actually study this non-human intelligence and find new ways to apply it.

      It's like the early days of agents when everyone thought if you just made an agent for each job role in a company and stuck them in a virtual office handing off work to each other it'd solve everything, but then Claude Code took off and showed that a simple brain-dead loop could outperform that.

      Now subagents almost always are task specific, not role specific.

      I feel like we could leap ahead a decade if people could divorce themselves from the idea that "we use language, and it uses language so it is like us", but I think there's just something really challenging about that because it's never been true.

      Nothing that wasn't a human had this level of mastery over human language before. And funnily enough, the first times we even came close (like Eliza) the same exact thing happened, so this seems like a persistent gap in how humans deal with non-humans using language.

  • Dealing with the alien coworkers has always been the job; that is what software is to most people.

    Software developers get paid big money because they can speak alien. The only thing that is changing is the dialect.

    • Nope, I tried my best to be really detailed and already knew these replies would come flooding in.

      I'm an engineer's engineer: I get that the job isn't LOC but being able to communicate and translate meatspace into composable and robust systems.

      So I mean an alien when I say an alien.

      Not human.

      Not in the cute "oh that guy just hears what everyone else hears and somehow interprets it entirely differently like he's from a different planet" alien way, but in the, "it is a different definition of intelligence derived from lacking wetware" alien way.

      Intelligence is such a multidimensional concept that all of humanity, as varied as we are, can fit in a part of the space that has no overlap with an LLM.

      -

      Now none of that is saying it can't be incredibly useful, but 99% of the misuse and misunderstanding of LLMs stems from humans refusing to internalize that a form of intelligence can exist that uses their language but doesn't occupy the same "space" of thinking that we all operate in, no matter how weird or unique we think we are.

> Honest question: what about the counter-argument that humans make subtle mistakes all the time, so why do we treat AI any differently?

We're investing in the human getting better rather than paying $100 to Anthropic and hoping that's enough that they don't make the product worse.