Comment by Snuggly73

2 months ago

Yes, and for some cases no.

The models have gotten very good, but I'd rather have an obviously broken pile of crap that I can spot immediately than something that is deep-fried with RL to always succeed but has subtle problems that someone will lgtm :( I guess it's not much different with human-written code, but the models seem to have weirdly inhuman failures: you just skim some code because you can't believe anyone could get it wrong, and it turns out to be wrong.

4 comments

minimaxir 2 months ago

That's what test cases are for, and they're good for both humans and nonhumans.

Snuggly73 2 months ago
Test cases are great, but not a total solution. Can you write a test case for the add_numbers(a, b) function?
- Snuggly73 2 months ago
  
  Well, for some reason it doesn't let me respond to the child comments :(
  The problem (which should be obvious) is that with a and b real, you can't construct an exhaustive input/output set. A test case can only prove the presence of a bug, not its absence.
  Another category of problems that you can't just test for and have to prove correct is concurrency problems.
  And so forth and so on.
- minimaxir 2 months ago
  
  Of course you can. You can write test cases for anything.
  Even an add_numbers function can have bugs, e.g. you have to ensure the inputs are numbers. Most coding agents would catch this in loosely-typed languages.
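
For what it's worth, both points in this exchange can be shown with one small sketch. The `add_numbers` body below is a made-up implementation (nobody in the thread posted one), and the test cases are just a plausible hand-written suite. Assuming that implementation, the suite passes, yet inputs the suite never tried still misbehave: a large integer loses precision, and a numeric string is silently accepted.

```python
def add_numbers(a, b):
    # Hypothetical implementation, invented for illustration.
    # Coercing to float has two subtle flaws: it loses precision for
    # large ints, and it silently accepts numeric strings.
    return float(a) + float(b)


def run_basic_tests():
    # A plausible hand-written suite; every case here passes.
    cases = [((1, 2), 3), ((-4, 4), 0), ((0.5, 0.25), 0.75)]
    return all(add_numbers(*args) == expected for args, expected in cases)


basic_tests_pass = run_basic_tests()

# Inputs the suite never tried:
big_int_correct = add_numbers(2**53, 1) == 2**53 + 1  # precision is lost
string_result = add_numbers("1", "2")                 # no TypeError raised

print(basic_tests_pass, big_int_correct, string_result)
```

A type-check test of the kind minimaxir describes would catch the string case once someone thinks to write it, but the passing suite on its own says nothing about the inputs it skipped.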