Strangely enough, my first test with Sonnet 4.6 via the API for a relatively simple request was more expensive ($0.11) than my average request to Opus 4.6 (~$0.07), because it used far more tokens than I would consider necessary for the prompt.
This is an interesting trend with recent models. The smarter ones get away with far fewer thinking tokens, partially or fully negating the speed/price advantage of the smaller models.
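A quick sketch of the arithmetic behind this. The per-million-token prices and token counts below are made up for illustration (they are not Anthropic's actual pricing), but they show how a model with a lower per-token price can still produce a more expensive request once it emits enough extra thinking tokens:

```python
# Illustrative only: prices and token counts below are invented,
# not real Anthropic pricing.
def request_cost(input_tokens, output_tokens,
                 in_price_per_mtok, out_price_per_mtok):
    """Dollar cost of one API request, given per-million-token prices."""
    return (input_tokens * in_price_per_mtok
            + output_tokens * out_price_per_mtok) / 1_000_000

# A "cheaper" model that emits many more thinking tokens...
cheap = request_cost(2_000, 7_000, in_price_per_mtok=3.0, out_price_per_mtok=15.0)
# ...can cost more than a pricier model that answers tersely.
pricey = request_cost(2_000, 2_000, in_price_per_mtok=5.0, out_price_per_mtok=25.0)

print(f"cheap model: ${cheap:.2f}, pricey model: ${pricey:.2f}")
# → cheap model: $0.11, pricey model: $0.06
```

With these invented numbers the "cheap" model's request comes out almost twice as expensive, which matches the shape of the $0.11 vs ~$0.07 observation above.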
Opus is supposed to be the expensive-but-quality one, while Sonnet is the cheaper one.
So if you don't want to pay the significant premium for Opus, it seems like you can just wait a few weeks till Sonnet catches up
Okay, thanks. Hard to keep all these names apart.
I'm even surprised people pay more money for some models than others.
Because Opus 4.5 was released like a month ago and was state of the art, and now the significantly faster and cheaper version is already comparable.
"Faster" is also a good point. I'm using different models via GitHub Copilot and find the better, more accurate models way too slow.
Opus 4.5 was November, but your point stands.
Fair. Feels like a month!
It means the price has dropped to roughly a third in a few months.
Because Opus 4.5 inference is/was more expensive.