Comment by yurimo
15 hours ago
I think the telling part is in this line:
> Because the repository is entirely agent-generated, it’s optimized first for Codex’s legibility
I asked my question from the perspective of a human engineer: I will have to read the code, understand it, and fix it once it breaks. OpenAI's approach is the opposite. Even when something breaks, it is the agent that will be doing the fixing, so millions of lines and inelegant designs don't matter because human readability doesn't matter. Either way, you use more tokens, so you fork over more money.
I will say, however, that IMHO there is objectively bad and good code in terms of what it can do and how it performs. If I can do the same thing in 50 lines instead of 1000, that difference still matters for the model: smaller context usage, and a better approach that informs downstream generation.
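To put rough numbers on the 50 vs 1000 lines point, here is a back-of-the-envelope sketch. Everything in it is an assumption on my part: the ~4 characters per token rule of thumb (real tokenizers vary, especially on code), the 40-character average line, and the 128,000-token context window. The function names are made up for illustration.

```python
# Back-of-the-envelope estimate of how much context a file consumes.
# All constants are assumptions, not measurements.
CHARS_PER_TOKEN = 4        # common rule of thumb; real tokenizers differ
AVG_LINE_CHARS = 40        # assumed average line length, newline included
CONTEXT_WINDOW = 128_000   # hypothetical context window, in tokens


def estimate_tokens(num_lines: int) -> int:
    """Estimate tokens for a file with `num_lines` lines (rounded up)."""
    chars = num_lines * AVG_LINE_CHARS
    return -(-chars // CHARS_PER_TOKEN)  # ceiling division


def context_share(num_lines: int) -> float:
    """Fraction of the assumed context window the file would occupy."""
    return estimate_tokens(num_lines) / CONTEXT_WINDOW


short_tokens = estimate_tokens(50)
long_tokens = estimate_tokens(1000)

print(f"50 lines:   ~{short_tokens} tokens ({context_share(50):.2%} of window)")
print(f"1000 lines: ~{long_tokens} tokens ({context_share(1000):.2%} of window)")
print(f"ratio: {long_tokens / short_tokens:.0f}x")
```

Under these assumptions the longer version costs about 20x the tokens every time the model has to read it, and that cost repeats on each pass over the code.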
This is the part I think we will see become more relevant.
I created docs-cli (on PyPI) to manage the index of specs as source code. The framework that goes with it first creates tests for as much as it can, so reproducibility becomes the goal, not readability.
https://github.com/ArtRichards/docs-cli
https://artrichards.github.io/agent-playbook-suite/blog/