Comment by usef-
18 hours ago
There is a reality that when they control the client, it can be significantly cheaper for them to run: the Claude Code creator has mentioned that the client was carefully designed to maximise prompt caching. If you use a different client, your usage patterns can differ, and it may cost them significantly more to serve you.
This isn't a sudden change, either: they were always up-front that subscriptions are for their own clients/apps, and that the API is for external clients. They don't document the internal client API/auth (people extracted it).
I think a more valid complaint, if you prefer alternative clients, might be "the API costs too much." But from what I hear, all providers are quite short on compute at the moment, and they're likely prioritising the usage they subsidise.
It reminds me of the net neutrality debate from a decade ago. I'm not American, but I remember the discord and online hate directed at Ajit Pai when the FCC was repealing it.
On one side you had the argument that repealing net neutrality would mean you can save money on your internet bill by only paying for access to what you use. On the other, you had the argument that it would just enable companies to milk you for even more profit and throttle your connection as they see fit.
IMO we need 'net neutrality' for LLM clients. I feel like AI companies are hypocrites for talking about safety all the time while wanting us to use their LLMs only in the ways they intend. They're saying we're all going to be replaced by AI in 12 months, and we have to use their tools to survive, right?
Yann LeCun recently warned that the AI coming out of China is trending towards being more open than the American alternative. If it continues like this, I can see programmers being pushed towards Chinese models. Is that what the US government wants?
Use of Chinese models: if I had not gotten a discount for signing up for a full year of Gemini AI Pro at something like $14/month, I might have started just using a Chinese chat model for things where privacy is not an issue.

Ironically, I am now paying for both Gemini AI Pro and $20/month for Ollama Cloud (as a super easy way to experiment with many open models). I am also paying Proton $10/month to use their handy Lumo+ private chat service built on Mistral models. I feel like I am spending too much money, but I don't want to feel locked into just a few vendors, and to be honest it is fun having alternatives. A year ago I used APIs for Chinese models (and Mistral in France), and the cost was really low.