Comment by witx

3 days ago

It's by default so you use all those tasty tokens.

Kinda wish there was a deterministic, mostly terse, language to interact with computers

It's called C. With all the undefined behavior it's mostly deterministic!

A lot of users are subsidized (if you're in doubt, consider the wealth of free users).

It's a shotgun approach to answering questions. A terse answer might mention only 1 of the 10 facts it could provide, and that might not be the one you're looking for. So they just say a fuck ton of words and are more likely to meet the needs of everyone asking your question. If they miss, you'll prompt again and they have to run a second pass of inference, which costs them more money.

Kinda. More output tokens usually correlate with better benchmark scores. Ideally LLMs would keep that in their thinking section, then draft a response (what they write currently), then output something short. It'd consume even more tokens, but we wouldn't see that text.

  • Most modern LLMs (especially frontier ones) are large token hogs because they draft, check, and re-draft the content (whether an output message or a code diff), sometimes multiple times, in the thinking block.

    When you see a thinking summary like "Now writing the function...", the model is actually writing the function out in its raw thinking. Occasionally the summariser misses and you get to see the raw text from models like Opus.

    You can also try an open weight LLM like Qwen3.6 and see something that probably resembles the shape of frontier model thinking in some loose way.

If such a language existed, it would surely take a human years of study to become proficient at it.