The sort of task you don't expect to end immediately. If extracting data from a bunch of PDFs takes 1 hour or the whole night, that doesn't make much difference to me.
It's not fast enough for auto completion and slightly too slow for chat (but bearable IMO).
The sort of task you don't expect to end immediately. If extracting data from a bunch of PDFs takes 1 hour or the whole night, that doesn't make much difference to me. It's not fast enough for auto completion and slightly too slow for chat (but bearable IMO).
Running a local llm at 10 t/s overnight to extract data from a few PDFs will burn more in electricity than paying cents for the hosted kimi models.
You can (sometimes) break even if you have a workstation GPU.
Sometimes data privacy is paramount.