Comment by xp84

2 months ago

For the verification experts: (and forgive me because I have almost zero of the math understanding of this stuff)

> This makes formal verification a prime target for AI-assisted programming. Given that we have a formal specification, we can just let the machine wander around for hours, days, even weeks.

Is this sentiment completely discounting that there can be many possible ways to write program that satisfies certain requirements that all have correct outputs? Won’t many of these be terrible in terms of performance, time complexity, etc? I know that in the most trivial case, AI doesn’t jump straight to O(n)^3 solutions or anything, but also there’s no guarantee it won’t have bugs that degrade performance as long as they don’t interfere with technical correctness.

Also, are we also pretending that having Claude spin for “even weeks” is free?

2 comments

xp84

hun3 2 months ago

Verified software should satisfy the liveness property; otherwise, an infinite loop that never returns would pass as "correct."

Verifying realtime software goes even further and enforces an upper bound on the maximum number of ticks it takes to complete the algorithm in all cases.

gaogao 2 months ago

Yup, I've already spent like $20k using Claude to verify things, so like there's probably some room for cost cutting.