← Back to context

Comment by SXX

5 hours ago

I think your demo need more realistic thinking logs because thinking usually burns at least 2x to 3x of tokens of the code and for harder tasks much more.

Indeed, at 30tok/s make it pause for 20 seconds while "thinking" is streaming (and hidden); that's the real experience.