But you don't have to be restricted to one model either? Codex being open source means you can choose to use Claude models, or Gemini, or...
It's fair enough to decide you want to just stick with a single provider for both the tool and the models, but surely still better to have an easy change possible even if not expecting to use it.
Codex CLI with Opus, or Gemini CLI with 5.2-codex, because they're open sourced agents? Go ahead if you want but show me where it actually happens with practical values
This is a fun thought experiment. I believe that we are now at the $5 Uber (2014) phase of LLMs. Where will it go from here?
How much will a synthetic mid-level dev (Opus 4.5) cost in 2028, after the VC subsidies are gone? I would imagine as much as possible? Dynamic pricing?
Will the SOTA model labs even sell API keys to anyone other than partners/whales? Why even that? They are the personalized app devs and hosts!
Man, this is the golden age of building. Not everyone can do it yet, and every project you can imagine is greatly subsidized. How long will that last?
While I remember $5 Ubers fondly, I think this situation is significantly more complex:
- Models will get cheaper, maybe way cheaper
- Model harnesses will get more complex, maybe way more complex
- Local models may become competitive
- Capital-backed access to more tokens may become absurdly advantaged, or not
The only thing I think you can count on is that more money buys more tokens, so the more money you have, the more power you will have ... as always.
But whether some version of the current subsidy, which levels the playing field, will persist seems really hard to model.
All I can say is, the bad scenarios I can imagine are pretty bad indeed—much worse than that it's now cheaper for me to own a car, while it wasn't 10 years ago.
I can run Minimax-m2.1 on my m4 MacBook Pro at ~26 tokens/second. It’s not opus, but it can definitely do useful work when kept on a tight leash. If models improve at anything like the rate we have seen over the last 2 years I would imagine something as good as opus 4.5 will run on similarly specced new hardware by then.
But you don't have to be restricted to one model either? Codex being open source means you can choose to use Claude models, or Gemini, or...
It's fair enough to decide you want to just stick with a single provider for both the tool and the models, but surely still better to have an easy change possible even if not expecting to use it.
Codex CLI with Opus, or Gemini CLI with 5.2-codex, because they're open sourced agents? Go ahead if you want but show me where it actually happens with practical values
until Microsoft buys it and enshits it.
This is a fun thought experiment. I believe that we are now at the $5 Uber (2014) phase of LLMs. Where will it go from here?
How much will a synthetic mid-level dev (Opus 4.5) cost in 2028, after the VC subsidies are gone? I would imagine as much as possible? Dynamic pricing?
Will the SOTA model labs even sell API keys to anyone other than partners/whales? Why even that? They are the personalized app devs and hosts!
Man, this is the golden age of building. Not everyone can do it yet, and every project you can imagine is greatly subsidized. How long will that last?
While I remember $5 Ubers fondly, I think this situation is significantly more complex:
- Models will get cheaper, maybe way cheaper
- Model harnesses will get more complex, maybe way more complex
- Local models may become competitive
- Capital-backed access to more tokens may become absurdly advantaged, or not
The only thing I think you can count on is that more money buys more tokens, so the more money you have, the more power you will have ... as always.
But whether some version of the current subsidy, which levels the playing field, will persist seems really hard to model.
All I can say is, the bad scenarios I can imagine are pretty bad indeed—much worse than that it's now cheaper for me to own a car, while it wasn't 10 years ago.
1 reply →
The real question is how long it'll take for Z.ai to clone it at 80% quality and offer it at cost. The answer appears to be "like 3 months".
2 replies →
I can run Minimax-m2.1 on my m4 MacBook Pro at ~26 tokens/second. It’s not opus, but it can definitely do useful work when kept on a tight leash. If models improve at anything like the rate we have seen over the last 2 years I would imagine something as good as opus 4.5 will run on similarly specced new hardware by then.
4 replies →