
Comment by mattmanser

3 days ago

You've missed the subtlety here.

LLMs don't have attention to detail.

This project had extremely comprehensive, easily verifiable, tests.

So the LLM could be as sloppy as it usually is; it just had to keep redoing its work until the code actually worked.

I missed the subtlety?

I linked the paper! I read the paper. Yeah, they wrote the tests, which is how this worked! How the heck do you think it was supposed to work?

The fact that they needed to write the tests was just the means of implementation. It didn't change the non-LLM labor economics of the problem.

  • No, I meant subtlety of definition: you've attributed the diligence to the LLM when in fact it's the tests that provide that.

    You've unfortunately committed the big sin of anthropomorphizing the LLM and calling it diligent.

    An LLM cannot be diligent; it's stochastic, so it's literally impossible for it to be diligent.

    Writing all those tests was diligent.

    • I didn’t attribute diligence to anything.

      I’m not worried about the personal character of diligence. I’m interested in what the technology unlocked and how things made with it are materially different in terms of labor configurations.

Who wrote the tests?

  • And how does the answer to your question bear on the claim I’m making?

    • If you're trying to automate all coding activity, writing tests is coding activity. Arguably, verifying an implementation is the greater share of the effort compared with writing it. If the only thing that makes your problem space tractable enough for the automation to replace the lesser half of coding activity is an authored test suite that your automation couldn't generate, then you really need to admit that.

      "Did you check?" is the most expensive question, and one of the most feared in my experience in tech circles. Spent quite a few years as a dedicated tester once I developed the knack for it. Everybody gangsta til it's time to prove the damn thing works.
