Comment by thot_experiment
5 hours ago
I'm specifically talking about qwen3-30b-a3b, the MoE model (this also applies to the big one). It's very very fast and pretty good, and speed matters when you're replacing basic google searches and text manipulation.
I'm only superficially familiar with these, but curious. Your comment above mentioned the VL model. Isn't that a different model or is there an a3b with vision? Would it be better to have both if I'd like vision or does the vision model have the same abilities as the text models?
This has been my question also: I spend a lot of time experimenting with local models and almost all of my use cases involve text data, but having image processing and understanding would be useful.
How much do I give up (in performance, and in running on my 32GB M2 Pro Mac) by using the VL version of a model? For MoE models, hopefully not much.
Looks like it: https://ollama.com/library/qwen3-vl:30b-a3b
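For anyone wanting to try it, a minimal sketch of pulling and querying that tag with Ollama's CLI (assumes Ollama is installed and running locally; `./photo.jpg` is a placeholder path, and the image-in-prompt syntax is the standard Ollama convention for multimodal models):

```shell
# Download the MoE vision model (~20GB, so plan accordingly)
ollama pull qwen3-vl:30b-a3b

# Text-only query works like any other model
ollama run qwen3-vl:30b-a3b "Summarize the difference between MoE and dense models."

# For vision, include an image path in the prompt
ollama run qwen3-vl:30b-a3b "What is in this image? ./photo.jpg"
```

Since it's the same a3b MoE architecture, text-only prompts should run at roughly the same speed as the non-VL variant; the vision encoder only adds cost when an image is actually attached.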