Comment by UncleEntity
1 day ago
Yeah, they're currently horrible at debugging -- there seem to be blind spots they just can't get past, so they end up running in circles.
A couple of days ago I was looking for something to do, so I gave Claude a paper ("A parsing machine for PEGs") to ask it some questions, and instead of answering me it spit out an almost complete implementation. Intrigued, I threw a couple more papers at it ("A Simple Graph-Based Intermediate Representation" && "A Text Pattern-Matching Tool based on Parsing Expression Grammars"), and it fleshed out the implementation and, well... color me impressed.
Now the struggle begins, as the thing has to be debugged. With the help of both Claude and Deepseek we got it compiling and passing 2 out of 3 tests, which is where they both got stuck. Round and round we went until I, the human who's supposed to be doing no work, figured out that Claude had hard-coded some values (instead of coding a general solution for all input), which they both missed. In applying ever more complicated solutions (to a well-solved problem in compiler design), Claude finally broke all the debugging output, and I don't understand the algorithms well enough to go in and debug it myself.
Of course, I didn't use any sort of source code management, so I couldn't revert to a previous version from before it was broken beyond all fixing...
Honestly, I don't even consider this a failure. I learned a lot more about what they are capable of, and now know that you have to give them problems in smaller sections where they don't have to figure out the complexities of how a few different algorithms interact with each other. With this new knowledge in hand, I started on what I originally intended to do before I got distracted by Claude's code solution to a simple question.
--edit--
Oh, the irony...
After typing this out and making an espresso, I figured out the problem Claude and Deepseek couldn't see. So much for the "superior" intelligence.