Comment by Snuggly73

2 days ago

Yes, and in some cases no.

The models have gotten very good, but I'd rather have an obviously broken pile of crap that I can spot immediately than something that is deep-fried with RL to always succeed but has subtle problems that someone will lgtm :( I guess it's not much different with human-written code, but the models seem to have weirdly inhuman failures: you just skim some code because you can't believe anyone could get it wrong, and it turns out to be wrong.

That's what test cases are for, which is good for both humans and nonhumans.

  • Test cases are great, but not a total solution. Can you write a test case for the add_numbers(a, b) function?

    • Well, for some reason it doesn't let me respond to the child comments :(

      The problem (which should be obvious) is that with a and b real, you can't construct an exhaustive input/output set. A test case can only prove the presence of a bug, not its absence.

      Another category of problems that you can't just test for and have to prove correct is concurrency problems.

      And so forth and so on.

    • Of course you can. You can write test cases for anything.

      Even an add_numbers function can have bugs, e.g. you have to ensure the inputs are numbers. Most coding agents would catch this in loosely-typed languages.
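
To make the disagreement above concrete, here is a minimal sketch in Python. Only the name add_numbers comes from the thread; the function body, the planted bug, and the helper passes_spot_checks are hypothetical, made up purely for illustration. It shows a small test suite passing while a bug remains on an input nobody tried, and a non-numeric input slipping through in a loosely typed language.

```python
def add_numbers(a, b):
    """Hypothetical implementation with a bug on exactly one input pair."""
    if (a, b) == (1234.5, 0.25):
        return 0.0  # deliberately planted bug, for illustration only
    return a + b


# A plausible hand-written suite: (a, b, expected) triples.
SPOT_CASES = [(0, 0, 0), (1, 2, 3), (-1, 1, 0), (2.5, 0.5, 3.0)]


def passes_spot_checks(fn):
    """Return True if fn gives the expected result for every spot case."""
    return all(fn(a, b) == expected for a, b, expected in SPOT_CASES)


print(passes_spot_checks(add_numbers))  # True: the suite is green
print(add_numbers(1234.5, 0.25))        # 0.0: the bug is still there
print(add_numbers("1", "2"))            # '12': strings are concatenated, not rejected
```

Both replies have a point here: adding a case for the string input would catch that problem, but no finite list of cases rules out a planted failure on some untested pair.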