Slacker News Slacker News logo featuring a lazy sloth with a folded newspaper hat
  • top
  • new
  • show
  • ask
  • jobs
Library
← Back to context

Comment by storus

15 hours ago

4x faster is about token prefill, i.e. the time to first token. It should be on par with DGX Spark there while being slightly faster than M4 for token generation. I.e. when you have long context, you don't need to wait 15 minutes, only 4 minutes.

0 comments

storus

Reply

No comments yet

Contribute on Hacker News ↗

Slacker News

Product

  • API Reference
  • Hacker News RSS
  • Source on GitHub

Community

  • Support Ukraine
  • Equal Justice Initiative
  • GiveWell Charities