← Back to context

Comment by pron

2 hours ago

> A lot of that can be overcome by including the need to be able to put more floors on top as part of the spec

I wish it could, but in practice, today's agents just can't do that. About once a week I reach some architectural bifurcation where one path is stable and the other leads to an inevitable total-loss catastrophe from which the codebase will not recover. The agent's success rate (I mostly use Codex with gpt5.4) is about 50-50. No matter what you explain to them, they just make catastrophic mistakes far too often.