Comment by perfmode

14 hours ago

How’s the token throughput / response time?

Healthy!

  prefill: 30.91 t/s, generation: 29.58 t/s

From https://gist.github.com/simonw/31127f9025845c4c9b10c3e0d8612...
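
For a rough sense of what those two rates mean for response time, here is a minimal sketch. It assumes both rates stay constant over a request and ignores model load time and any other overhead, so treat the result as a ballpark figure only. The function name and the request size are hypothetical and do not come from the gist.

```python
# Rates quoted in the comment above (tokens per second).
PREFILL_TPS = 30.91
GENERATION_TPS = 29.58


def estimate_seconds(prompt_tokens: int, output_tokens: int) -> float:
    """Rough wall-clock estimate: prefill time plus generation time."""
    if prompt_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    prefill_s = prompt_tokens / PREFILL_TPS        # time to process the prompt
    generation_s = output_tokens / GENERATION_TPS  # time to emit the reply
    return prefill_s + generation_s


if __name__ == "__main__":
    # Hypothetical request size, chosen only for illustration.
    total = estimate_seconds(prompt_tokens=1000, output_tokens=500)
    print(f"estimated response time: {total:.1f} s")
```

At these rates a long prompt costs about as much time per token as the reply itself, so prompt length matters as much as output length for the overall wait.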