Comment by 9cb14c1ec0
19 hours ago
I don't think so. There are other knobs they can tweak to reduce load that affect quality less than quantizing. Like trimming the conversation length without telling you, reducing reasoning effort, etc.
We never do anything that reduces model intelligence like that.