Comment by mickael-kerjean
4 days ago
The benchmark is something you can optimize for, doesn't mean it generalize well. Yesterday I tried for 2 hours to get claude to create a program that would extract data from a weird adobe file. 10$ later, the best I had is a program that was doing something like:
switch(testFile) {
case "test1.ase": // run this because it's a particular case
case "test2.ase": // run this because it's a particular case
default: // run something that's not working but that's ok because the previous case should
// give the right output for all the test files ...
}
No comments yet
Contribute on Hacker News ↗