Comment by Aurornis
4 hours ago
Cool visualization, but most of the token generation in my sessions doesn't go to output code or even the text I see. Reasoning tokens make up most of the output. That can only occur after processing the input files and context.
For non-trivial work I go through hundreds of thousands of tokens (prefill plus token generation combined, of course) before even getting to any useful text output.
I mostly use LLMs for exploration and studies, rarely code generation. Prefill matters heavily for this. Even at prefill rates in the high hundreds or low thousands of tokens per second, I spend a lot of time waiting on the LLM (doing other things, not twiddling my thumbs).
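To put rough numbers on that, here is a minimal back-of-the-envelope sketch. The function name and every figure in it are hypothetical, chosen only to match the orders of magnitude above; they are not measurements from any real session. It also assumes prefill and generation run strictly one after the other with no caching.

```python
def estimate_wait_seconds(prefill_tokens, gen_tokens, prefill_rate, gen_rate):
    """Estimate total wait time for one session.

    prefill_tokens / gen_tokens: token counts (hypothetical inputs).
    prefill_rate / gen_rate: throughput in tokens per second.
    Assumes prefill and generation are sequential and uncached.
    """
    prefill_time = prefill_tokens / prefill_rate
    gen_time = gen_tokens / gen_rate
    return prefill_time + gen_time


# Illustrative numbers only: 300k tokens of prefill at 1,000 tok/s,
# plus 20k generated (mostly reasoning) tokens at 50 tok/s.
total = estimate_wait_seconds(300_000, 20_000, 1_000, 50)
print(f"{total / 60:.1f} minutes of waiting")
```

With those made-up inputs, prefill alone accounts for 300 of the 700 seconds, so it is a large share of the wait even at a prefill rate in the low thousands.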