Comment by kristianp

6 months ago

Open weights models such as GPT-OSS, Kimi K2.x are trained with 4 bit layers. So it wouldn't come as a surprise if the closed models do similar things. If I compare Kimi K2.5 and Opus 4.5 on openrouter, output tokens are about 8x more expensive for Opus, which might indicate Opus is much larger and doesn't quantize, but the claude subscription plans muddy the waters on price comparison a lot.

0 comments