Comment by simonw

2 days ago

That's a lot less true today than it was six weeks ago. The "reasoning" models are spookily good at answering questions about how code runs and at identifying the source of bugs.

They still make mistakes, and yeah, they're still (mostly) next-token-predicting machines under the hood, but if your mental model is "they can't actually predict how some code will execute", you may need to update that.