Comment by pdimitar
3 days ago
Thank you for the validation. As much as I don't like NVIDIA's shenanigans on Linux, having a local LLM is very tempting and I might put my ideological problems to rest over it.
Though I have to ask: why two eGPUs? Is the LLM software smart enough to use any combination of GPUs you point it at?
Yes, Ollama is very plug-and-play when it comes to multi-GPU setups.
llama.cpp probably is too, but I haven't tried it with a bigger model yet.
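For what it's worth, llama.cpp does expose explicit knobs for splitting a model across GPUs. Here's a minimal sketch using the llama-cpp-python bindings (assuming a CUDA build with two visible devices; the model path and split ratios are just placeholders):

    from llama_cpp import Llama

    # Offload all layers to GPU and split the weights roughly 50/50
    # between two devices. tensor_split is proportional, not absolute.
    llm = Llama(
        model_path="models/llama-3-70b-q4.gguf",  # placeholder path
        n_gpu_layers=-1,            # -1 = offload every layer
        tensor_split=[0.5, 0.5],    # one entry per GPU
    )

    out = llm("Q: Why two eGPUs? A:", max_tokens=64)
    print(out["choices"][0]["text"])

With Ollama the equivalent mostly happens automatically: it detects the available GPUs and spreads the layers across them without extra configuration.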
Just today, progress was released on parallelizing WAN video generation across multiple GPUs. LLMs are way easier to split up.