
Comment by CurrentB

6 days ago

I agree. I'm bullish on AI for coding generally, but I am curious how they'd get around this problem. Even if they can code at a superhuman level, you just get rarer, superhuman bugs. Or is another AI going to debug it? Unless this loop is basically foolproof, does the human's job just become debugging the hardest things to debug (or at least the AI's blind spots)?

This comment reminds me of the old idiom (I can't remember who it's credited to) that you should be careful not to use your full abilities when writing code, because you have to be more clever to debug code than you were to write it.

This type of issue is part of why I've never felt the appeal of LLMs. I want to understand my code because it came from my brain and my understanding, or because it came from a teammate I can then ask questions when I don't understand something.

I haven't seen enough mention of using these tools to generate formal verification specs for their output, like TLA+. Of course, you're stuck with the same problem of having to verify the specs, but you'll always be playing that game, and this seems like one of the best, most reassuring ways to do it.
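
To make that concrete, here's a toy sketch of the kind of spec I mean (a trivial counter with one invariant, written by hand as an illustration, not generated by anything), just to show the shape of what you'd be asking the model checker to verify:

    ---- MODULE Counter ----
    EXTENDS Naturals
    VARIABLE count

    Init   == count = 0               \* starting state
    Next   == count' = count + 1      \* the only allowed step
    TypeOK == count \in Nat           \* invariant TLC checks in every reachable state

    Spec == Init /\ [][Next]_count
    ====

The idea would be that checking a small, declarative spec like this is a more tractable job than reading the generated implementation itself.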

I'll have to look into this some more, but I'm very curious what the current state of the art is. I'm guessing it's not great, because so few people do this in the first place -- it's so tedious -- and there's probably not nearly enough training data for it to be practical to generate specs for a JavaScript GQL app or whatever these things are best at generating.

> become debugging the hardest things to debug

This is my current role, and it's one of the biggest reasons AI doesn't really help me day to day, agent or otherwise.

In my ideal world, AIs become so proficient at writing code that they eventually develop their own formally verifiable programming language, purpose-built to be verifiable, so that there's no room left for unknown unknowns.