Comment by singularfutur
8 days ago
Every release, they claim it writes production code, but my team still spends hours fixing subtle bugs the model introduces. The demos are cherry-picked, and the real-world failure rate is way higher than anyone admits. Meanwhile, we keep feeding them our codebases as free training data.
How would that compare to subtle bugs introduced by developers? I have seen a massive number of bugs during my career, many of them introduced by me.
It compares... unfavorably for the AI.
Not from what I'm seeing. 5.3 codex xhigh is pretty amazing.