Comment by wazoox

7 hours ago

I've been running various models on a Mac Pro 2013 (8 cores, 32 GB RAM) at about 8 to 10 t/s for months. It's not fast, but it's more than enough for many real tasks, in particular background tasks. An iMac Pro will do just as well, I suppose.

What tasks do well at 8-10 t/s?

  • The sort of task you don't expect to finish immediately. If extracting data from a bunch of PDFs takes an hour or the whole night, that doesn't make much difference to me. It's not fast enough for autocompletion and slightly too slow for chat (but bearable IMO).

    • Running a local LLM at 10 t/s overnight to extract data from a few PDFs will burn more in electricity than paying a few cents for the hosted Kimi models.

      You can (sometimes) break even if you have a workstation GPU.
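For anyone who wants to check the electricity argument against their own setup, here is a rough sketch of the arithmetic. Every figure in it (200 W draw, $0.30/kWh, 8 hours) is a placeholder assumption, not a measurement from this thread, and the function names are made up for illustration. It only counts generated tokens and ignores prompt processing, idle power, and hardware cost.

```python
def local_cost_usd(tokens: int, tokens_per_s: float, watts: float,
                   usd_per_kwh: float) -> float:
    """Electricity cost of generating `tokens` on a local machine."""
    hours = tokens / tokens_per_s / 3600.0
    return hours * (watts / 1000.0) * usd_per_kwh


def hosted_cost_usd(tokens: int, usd_per_million_tokens: float) -> float:
    """Cost of the same number of output tokens on a hosted API."""
    return tokens / 1_000_000 * usd_per_million_tokens


def break_even_usd_per_million(tokens_per_s: float, watts: float,
                               usd_per_kwh: float) -> float:
    """Hosted price per million tokens at which both options cost the same."""
    return local_cost_usd(1_000_000, tokens_per_s, watts, usd_per_kwh)


if __name__ == "__main__":
    # Placeholder assumptions: plug in your own wall-power reading and tariff.
    tokens_per_s = 10.0
    watts = 200.0
    usd_per_kwh = 0.30
    overnight_tokens = int(tokens_per_s * 8 * 3600)  # 8 hours of generation

    print("tokens generated:", overnight_tokens)
    print("local electricity (USD):",
          round(local_cost_usd(overnight_tokens, tokens_per_s, watts, usd_per_kwh), 2))
    print("break-even hosted price (USD per 1M tokens):",
          round(break_even_usd_per_million(tokens_per_s, watts, usd_per_kwh), 2))
```

If the hosted price per million tokens is below the break-even figure, the hosted model is cheaper on electricity alone. A faster machine at the same wattage lowers the break-even price, which is where the workstation GPU remark comes from.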