Comment by kamranjon

10 hours ago

Consistency, new models don't behave the same on every task as their predecessors. So you end up building pipelines that rely on specific behavior, and then find that the new model performs worse on a task you depend on, or just behaves differently and needs prompt adjustments. They can also fundamentally change the default model settings in new releases; for example, Gemini 2.5 models behaved completely differently with regard to temperature settings than previous models. It creates a moving target that you constantly have to adjust and rework for, instead of a platform that you, and by extension your users, can rely on. Other providers have much longer deprecation windows, so they must at least understand this frustration.

> Consistency, new models don't behave the same on every task as their predecessors. So you end up building pipelines that rely on specific behavior

If this is a deal breaker, then self-hosting is the only solution. Because hardware comes at a premium, all models hosted by third parties will eventually be deprecated to make room for newer, better, and more efficient models.