Comment by Tepix

15 hours ago

Kimi K2.6 does not run well on 256GB.

2 comments

Tepix

Have you tried it? It would be slow for sure, but the main limitation AIUI would actually be storing the context in RAM - models like Kimi and GLM have high demands there which limit your ability to get meaningful aggregate throughput via large batches.
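To put rough numbers on the context-in-RAM point, here is a minimal back-of-the-envelope sketch. It assumes a standard multi-head or grouped-query attention layout, where each token stores one key and one value vector per layer. The layer count, head count, head size, context length, and free-memory figure are hypothetical placeholders, not the real configuration of Kimi or GLM, and architectures that compress the cache would need a different formula.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim,
                   context_tokens, batch_size, bytes_per_value=2):
    """Estimate KV-cache size for standard MHA/GQA attention.

    The factor 2 accounts for one key and one value vector
    per token, per layer, per KV head.
    """
    per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_value
    return per_token * context_tokens * batch_size


def max_batch_size(free_bytes, num_layers, num_kv_heads, head_dim,
                   context_tokens, bytes_per_value=2):
    """How many full-length sequences fit in the memory left after weights."""
    per_sequence = kv_cache_bytes(num_layers, num_kv_heads, head_dim,
                                  context_tokens, 1, bytes_per_value)
    return free_bytes // per_sequence


if __name__ == "__main__":
    GIB = 1024 ** 3
    # Hypothetical model shape and memory budget, for illustration only.
    per_seq = kv_cache_bytes(num_layers=60, num_kv_heads=8, head_dim=128,
                             context_tokens=32768, batch_size=1)
    print(f"KV cache per sequence: {per_seq / GIB:.1f} GiB")
    print("Sequences that fit in 32 GiB:",
          max_batch_size(32 * GIB, 60, 8, 128, 32768))
```

With these made-up numbers, each 32k-token sequence needs about 7.5 GiB of cache, so only four fit in 32 GiB of leftover RAM. That is the sense in which context storage, rather than the weights alone, caps batch size and aggregate throughput.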

girvo 12 hours ago

True, I might be thinking of some of the community’s four-Spark clusters for it; it’s already int4, right?