← Back to context

Comment by fulafel

5 hours ago

So it's not measuring output tokens/s, just how long it takes to start generating tokens. Seems we'll have to wait for independent benchmarks to get useful numbers.

For many workflows involving real time human interaction, such as voice assistant, this is the most important metric. Very few tasks are as sensitive to quality, once a certain response quality threshold has been achieved, as is the software planning and writing tasks that most HN readers are likely familiar.