DeepSeek v4 Flash, various quantised versions of Kimi K2.6, MiniMax 2.7, Qwen 3.5 “full sized, with a dual spark setup you can fit some decent setups on here
My single spark has me running Qwen 3.6 27B and antirez’s specially quantised DeepSeek v4 Flash (which is shockingly impressive)
what models are you using on that? My experiences with apple hardware have convinced me that it is not really good enough for coding locally.
DeepSeek v4 Flash, various quantised versions of Kimi K2.6, MiniMax 2.7, Qwen 3.5 “full sized, with a dual spark setup you can fit some decent setups on here
My single spark has me running Qwen 3.6 27B and antirez’s specially quantised DeepSeek v4 Flash (which is shockingly impressive)
Kimi K2.6 does not run well on 256GB.
2 replies →
It isn’t the models, it’s the closed api and the tooling associated with it. It’s driving me crazy how not-talked-about this is.
You can point both Codex and Claude Code at a local model and they'll work just fine. Codex even explicitly supports that as a feature! [1]
With a nice UI on top, for the desktop app too: [2]
[1]: https://developers.openai.com/codex/config-advanced#custom-m...
[2]: https://docs.ollama.com/integrations/codex-app
As in the coding harnesses?
1 reply →