Comment by trash_cat

6 months ago

I use Ollama because I am a casual user and can't be bothered to read the docs on how to set up llama.cpp. I just want to run a simple LLM locally.

Why would I care about Vulkan?

With Vulkan it runs much, much faster on consumer hardware, especially on iGPUs like Intel's or AMD's.
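
To give a rough sense of the difference, here's a small timing harness (a sketch of mine, not anything shipped by Ollama or llama.cpp; the binary paths and model file are placeholders, and it assumes two local llama.cpp builds, one compiled with -DGGML_VULKAN=ON and one CPU-only):

    # Time an end-to-end 128-token run (note: includes model load time) on a
    # CPU-only and a Vulkan build of llama-cli. Binary paths and the GGUF
    # model file below are placeholder assumptions.
    import subprocess, time

    BUILDS = {
        "cpu":    "./build-cpu/bin/llama-cli",
        "vulkan": "./build-vulkan/bin/llama-cli",  # built with -DGGML_VULKAN=ON
    }
    MODEL = "model-q4_k_m.gguf"

    for name, binary in BUILDS.items():
        start = time.perf_counter()
        # -ngl 99 offloads all layers to the GPU; the CPU build just ignores it.
        subprocess.run(
            [binary, "-m", MODEL, "-p", "Hello", "-n", "128", "-ngl", "99"],
            check=True, capture_output=True,
        )
        print(f"{name}: {time.perf_counter() - start:.1f}s for 128 tokens")

For cleaner numbers that exclude load time, llama.cpp's own llama-bench tool is the better instrument.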

  • Well, it definitely runs faster on external dGPUs. With iGPUs, and possibly future NPUs, the pre-processing/"thinking" phase is much faster, because that phase is compute-bound. Text generation, however, tends to be faster on the CPU, which makes better use of the available memory bandwidth, and that is the relevant constraint there (see the back-of-the-envelope sketch at the end of this thread). iGPUs and NPUs will still be a win for energy use, however.

  • For Intel, OpenVINO should be the preferred route. I don't follow AMD, but Vulkan is just the common denominator here.

    • If you support Vulkan, you support almost every GPU out there in the consumer market across all hardware vendors. It's an amazing fallback option.

      I agree they should also support OpenVINO, but compared to Vulkan OpenVINO is a tiny market.
  • How is the performance of Vulkan vs ROCm on AMD iGPUs? Ollama can be persuaded to run on iGPUs with ROCm.
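
On the Ollama-on-iGPU point: the usual community workaround is to override the gfx version that ROCm reports before starting the server. This is unsupported, and the version string depends on the APU generation (the Vega-class value shown here is an assumption), so treat it as a sketch:

    # Start the Ollama server with ROCm's gfx-version override in place.
    # HSA_OVERRIDE_GFX_VERSION is a standard ROCm environment variable;
    # "9.0.0" is an example value for Vega-class APUs. Pick the value
    # matching your own iGPU generation.
    import os, subprocess

    env = dict(os.environ, HSA_OVERRIDE_GFX_VERSION="9.0.0")
    subprocess.run(["ollama", "serve"], env=env, check=True)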
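
And a back-of-the-envelope sketch of the bandwidth argument made earlier in the thread, using illustrative numbers of my own rather than measurements: every generated token has to stream roughly the full set of weights from DRAM, so decode speed is capped at bandwidth divided by model size, no matter how much compute the iGPU adds.

    # Decode-speed ceiling: each generated token reads ~all weights from
    # memory, so tokens/s <= memory bandwidth / model size.
    weights_gb = 4.0   # assumption: ~7B parameters at 4-bit quantization
    dram_gbps = 50.0   # assumption: dual-channel DDR5, shared by CPU and iGPU
    print(f"decode ceiling: {dram_gbps / weights_gb:.1f} tokens/s")
    # ~12.5 tokens/s regardless of compute, which is why decode on an iGPU
    # (same DRAM as the CPU) rarely wins, while compute-bound prefill does.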