Comment by rlupi

20 days ago

I have an M3 Ultra with 96 GB; it works reasonably well with models like qwen/qwen3-vl-30b (fast), openai/gpt-oss-120b (slow-ish), or openai/gpt-oss-20b (fast, largest context). I keep the latter loaded and have a cronjob that regenerates my shell's MOTD every 15 minutes from information gathered from various sources.
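A minimal sketch of that kind of setup, assuming the model is served through an OpenAI-compatible local endpoint (e.g. LM Studio's default on `localhost:1234`). The script name, gathered info, prompt, and output path are all illustrative, not the commenter's actual config:

```shell
# Crontab entry (crontab -e): regenerate the MOTD every 15 minutes.
#   */15 * * * * /usr/local/bin/gen-motd.sh

#!/bin/sh
# gen-motd.sh -- hypothetical sketch: gather a bit of system info and ask
# the locally loaded model to turn it into a short MOTD.
INFO="$(date; uptime; df -h / | tail -1)"

# Build the JSON request with jq so the gathered text is escaped safely,
# then write the model's reply to a file the shell can print on login.
curl -s http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d "$(jq -n --arg info "$INFO" '{
        model: "openai/gpt-oss-20b",
        messages: [{role: "user",
                    content: ("Write a one-paragraph shell MOTD from: " + $info)}],
        max_tokens: 120
      }')" \
  | jq -r '.choices[0].message.content' > "$HOME/.motd"
```

The shell then just prints the cached file at startup (e.g. `cat ~/.motd` in `~/.zshrc`), so logins stay instant even when the model is slow.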