Comment by danuker
2 days ago
> I should be running my own LLM
I approve of this, but in your place I'd wait for hardware to get cheaper once the bubble blows over. I have an i9-10900; I bought an M.2 SSD and 64 GB of RAM for it in July, and I get useful results with Qwen3-30B-A3B (a 4-bit quant from unsloth, running on llama.cpp).
It's much slower than an online service (~5-10 t/s), and lower quality, but it still offers me value for my use cases (many small prototypes and tests).
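If it helps to see the setup concretely, here's a minimal sketch using the llama-cpp-python bindings rather than the llama.cpp CLI; the GGUF filename, context size, and thread/offload settings are placeholders to adjust for whichever quant you actually download:

    # Rough sketch with llama-cpp-python; the model filename and the
    # n_ctx / n_gpu_layers / n_threads values are placeholders, not a recipe.
    from llama_cpp import Llama

    llm = Llama(
        model_path="Qwen3-30B-A3B-Q4_K_M.gguf",  # hypothetical local path to an unsloth quant
        n_ctx=8192,        # context window; larger costs more RAM
        n_gpu_layers=0,    # 0 = CPU only; set >0 (or -1) to offload layers to a GPU
        n_threads=8,       # tune to your physical core count
    )

    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Write a tiny Python script that sums numbers from stdin."}],
        max_tokens=512,
    )
    print(out["choices"][0]["message"]["content"])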
In the meantime, check out LLM service prices on https://artificialanalysis.ai/ ; the open-source models are cheap! Further down the homepage there's a Cost Efficiency section with a Cost vs. Intelligence chart.
I have a 9070 XT (16 GB VRAM) and it's fast with deepseek-r1:14B, but I didn't know about that Qwen model. Most of the 'better' models crash for lack of RAM.
https://dev.to/composiodev/qwen-3-vs-deep-seek-r1-evaluation...
If it runs, it looks like I can get a bit more quality. Thanks for the suggestion.
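For a rough sense of whether it fits (back-of-envelope, assuming a Q4_K_M-class quant): ~30B parameters at roughly 4.5 bits/param works out to around 17-19 GB of weights, plus KV cache, so it won't sit entirely in 16 GB of VRAM. llama.cpp can keep the overflow in system RAM, and since only ~3B parameters are active per token (the A3B part), generation should stay usable rather than crawling.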