Comment by zozbot234

11 hours ago

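Subagent swarms are actually great for the local inference scenario because they can share a whole lot of KV cache (the common prompt prefix only has to be stored once). Single-stream decode is memory-bandwidth bound, so batching the subagents raises the arithmetic intensity of decode (i.e. the aggregate tok/s) essentially for free.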
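
To put rough numbers on that, here is a back-of-the-envelope roofline sketch. Everything in it is an assumption for illustration rather than a measurement: the `Setup` values are hypothetical (an 8B-parameter model at 8-bit weights, 400 GB/s of memory bandwidth, 50 TFLOP/s peak), FLOPs per token are approximated as 2 x parameters, and the shared-prefix KV is assumed to be read once per decode step, which only holds if the attention kernel is prefix-aware.

```python
from dataclasses import dataclass


@dataclass
class Setup:
    # All values are hypothetical placeholders, not measurements.
    weight_bytes: float = 8e9       # model weights read once per decode step
    shared_kv_bytes: float = 1e9    # KV cache for the prompt prefix shared by all subagents
    private_kv_bytes: float = 5e7   # KV cache unique to each subagent
    flops_per_token: float = 16e9   # roughly 2 * parameter count
    mem_bw: float = 400e9           # memory bandwidth in bytes/s
    peak_flops: float = 50e12       # peak compute in FLOP/s


def step_bytes(n: int, s: Setup) -> float:
    """Bytes moved for one decode step across n concurrent subagents."""
    # Assumption: weights and the shared prefix KV are read once per step,
    # while each subagent's private KV is read separately.
    return s.weight_bytes + s.shared_kv_bytes + n * s.private_kv_bytes


def arithmetic_intensity(n: int, s: Setup) -> float:
    """FLOPs performed per byte moved in one decode step."""
    return n * s.flops_per_token / step_bytes(n, s)


def aggregate_tok_per_s(n: int, s: Setup) -> float:
    """Roofline estimate: the step takes as long as the slower of memory and compute."""
    memory_time = step_bytes(n, s) / s.mem_bw
    compute_time = n * s.flops_per_token / s.peak_flops
    return n / max(memory_time, compute_time)


if __name__ == "__main__":
    s = Setup()
    for n in (1, 2, 4, 8, 16):
        print(f"subagents={n:2d}  "
              f"intensity={arithmetic_intensity(n, s):6.2f} FLOP/byte  "
              f"aggregate={aggregate_tok_per_s(n, s):7.1f} tok/s")
```

Under these assumptions the step time barely moves as subagents are added, because the weight and shared-prefix reads dominate, so aggregate tok/s scales almost linearly until the compute roof or the private KV traffic catches up.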
