Comment by warwickmcintosh

10 hours ago

ROCm has improved, but the reality is that you're still fighting the driver stack more than the models. If you're actually doing local inference on AMD, you're spending your time on CUDA compatibility layers, not the AI part. Two lines of Python is marketing; the gap between a demo and a working AMD setup is still real.
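
For context, the "two lines of Python" being dismissed here is presumably something like the Hugging Face transformers pipeline; the model name below is just an illustration, not anything the comment names:

    # The sort of minimal demo the marketing shows: load a model and
    # generate text. Whether it actually runs on a given AMD GPU is
    # exactly the gap being described.
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")
    print(generator("ROCm is")[0]["generated_text"])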

Ollama works very well on Linux with my AMD hardware. I have a 6800 XT, which wasn't even originally fully supported by the ROCm stack, and it "just works" for a ton of very nice models, especially if I seek out quantized versions of them.
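
For anyone who wants to try the same setup, here is a minimal sketch using Ollama's Python client (pip install ollama), assuming the Ollama server is already running locally; the model tag is just one example of a quantized build:

    import ollama

    # q4_K_M is a 4-bit quantized build; quantized tags like this are
    # what keep VRAM usage manageable on consumer cards.
    response = ollama.chat(
        model="llama3.1:8b-instruct-q4_K_M",
        messages=[{"role": "user", "content": "Hello from a 6800 XT"}],
    )
    print(response["message"]["content"])

If I remember correctly, Ollama's GPU docs also describe an HSA_OVERRIDE_GFX_VERSION environment variable for nudging ROCm to treat cards outside the official support list as a nearby supported target.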