Comment by sarchertech

15 hours ago

Agents require tests to keep from spinning out of control when writing more than a few thousand lines, but we know that tests are wildly insufficient to describe the state of the actual code.

You are essentially saying that we should develop other methods of capturing the state of the program to prevent unintended changes.

However, there’s no reason to believe that these other systems will be any easier to reason about than the code itself. If we had other methods of ensuring that observable behavior doesn’t change, and they were substantially easier than reasoning about the code directly, they would be very useful for human developers as well.

The fact that we haven’t developed something like this in 75 years of writing programs suggests it’s probably not as easy as you’re making it out to be.