Comment by cyanydeez
5 hours ago
the QWEN-3.5-CODER-NEXT fits in half the 128GB and the rest for context. with the right plugins, particularly context pruning, ive got it running over night by writing plans then implementing.
i do not know if theres a smaller model with same capability, but model size and context window at 128 seems like a sweet spot.
token speed really isnt a bother because im either just multitasking or working on the filling in the missing details.
regardless, i think comparing first VRAM sizes w/target model then speed for your cost efficiency. plus, a healthy skepticism of mac hardware costs.
No comments yet
Contribute on Hacker News ↗