Comment by chrisjj

12 hours ago

> it breaks my code, tests start to fail and it instantly says “these are all pre existing failures” and moves on like nothing happened

Reminds us of the most important button the "AI" has that the similarly bad human employee lacks.

'X'

Until, of course, we pass responsibility for that button to an "AI".

The other day Codex on Mac gained the ability to control the UI. Will it close itself if instructed though? Maybe test that and make a benchmark. Closebench.