Comment by adrian_b
6 hours ago
While most people would not be able to run Kimi K2.6 fast enough for interactive chat, low speed matters much less when it is used as a coding assistant, especially since many tasks can be batched so that they all make progress during a single pass over the weights.
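A rough back-of-the-envelope sketch of why that helps (all numbers here are illustrative assumptions, not measured figures for any particular model or machine): decoding is typically memory-bandwidth-bound, so each decode step has to stream the active weights through memory once, and that same pass can emit one token for every sequence in the batch.

    # Toy model: aggregate decode throughput if each step must stream the
    # active weights from memory once. Parameter sizes and bandwidth below
    # are assumptions for illustration only.
    def decode_tokens_per_second(active_params_gb: float,
                                 mem_bandwidth_gbs: float,
                                 batch_size: int) -> float:
        """Upper bound on total tokens/s across the batch."""
        seconds_per_weight_pass = active_params_gb / mem_bandwidth_gbs
        return batch_size / seconds_per_weight_pass

    # e.g. ~32 GB of active expert weights, 400 GB/s of memory bandwidth
    for b in (1, 4, 16):
        print(b, round(decode_tokens_per_second(32, 400, b), 1), "tok/s total")

So a machine that is painfully slow for a single interactive session can still push a useful aggregate rate across a batch of background coding tasks.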
If you run it on your own hardware, you can run it 24/7 without worrying about token prices or hitting subscription limits, so you are likely to get more work done even on much slower hardware. Customizing an open-source harness can also be far more efficient than something like Claude Code.
For any serious application, you might be more limited by your ability to review the code than by hardware speed.
DeepSeek V4 Pro is far more effective at batching multiple tasks together, since its KV cache is much lighter: a maximum of ~10 GB at the full 1M context, scaling linearly with context length according to the DeepSeek V4 release paper. That is extremely impressive; it unlocks batching, agent swarms, etc. even on severely memory-constrained platforms, especially at smaller maximum context lengths.
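To make the linear-scaling claim concrete: ~10 GB at 1M tokens works out to roughly 10 KB of KV cache per token of context, which directly bounds how many concurrent agents fit in a given memory budget. The budget and per-agent context below are hypothetical examples, not figures from the paper.

    # ~10 GB at 1M tokens, scaling linearly, implies roughly 10.7 KB per token.
    KV_BYTES_PER_TOKEN = 10 * 1024**3 / 1_000_000

    def max_concurrent_agents(kv_budget_gb: float, ctx_tokens_per_agent: int) -> int:
        """How many agents fit if each holds ctx_tokens_per_agent of context."""
        per_agent = KV_BYTES_PER_TOKEN * ctx_tokens_per_agent
        return int(kv_budget_gb * 1024**3 // per_agent)

    # e.g. 24 GB reserved for KV cache, 64k context per agent
    print(max_concurrent_agents(24, 64_000))  # ~37 agents

Cutting the per-agent max context is the easy lever: halving it roughly doubles how many agents you can keep resident under the same KV budget.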