Comment by bboozzoo

3 hours ago

A seven-token input isn't very realistic, is it? For coding tasks it's normal for the input to run to thousands or tens of thousands of tokens. If it weren't for prefix caching it'd be one miserable experience, but even then, at the very best, the input is often in the hundreds of tokens each time. And don't even try to dump some logs into the prompt.

> A seven-token input isn't very realistic, is it?

The test prompt above was "Why is the sky blue?", so that's where the seven tokens come from. I meant to highlight that, because I'd expect a thousand-token input to be processed faster per token than the figure presented.
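To make that expectation concrete, here's a rough sketch of the arithmetic. It assumes a simple linear cost model: a fixed per-request overhead plus a constant per-token cost. The function name and both default numbers are made up for illustration and are not measurements from the test above; real prompt processing is batched and won't follow this model exactly.

```python
def per_token_ms(n_tokens, fixed_overhead_ms=50.0, per_token_cost_ms=0.5):
    """Average prompt-processing time per token under a linear cost model.

    fixed_overhead_ms and per_token_cost_ms are hypothetical values,
    chosen only to show the shape of the effect.
    """
    if n_tokens <= 0:
        raise ValueError("n_tokens must be positive")
    total_ms = fixed_overhead_ms + per_token_cost_ms * n_tokens
    return total_ms / n_tokens


if __name__ == "__main__":
    # Compare the seven-token test prompt with a thousand-token input.
    for n in (7, 1000):
        print(f"{n:>5} tokens: {per_token_ms(n):.3f} ms per token")
```

Under these assumptions the fixed overhead dominates the seven-token case and is mostly amortized away at a thousand tokens, so a per-token rate measured on the short prompt would understate what a longer prompt gets.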