Comment by yjftsjthsd-h
10 hours ago
> I'm aware that 8GB Vram are not enough for most such workloads. But no support at all? That's ridiculous. Let me use the card and fall back to system memory for all I care.
> Nvidia, as much as I hate their usually awfully insufficient linux support, has no such restrictions for any of their modern cards, as far as I'm aware.
In fact, I regularly run llamafile (and sometimes ollama) on an NVIDIA dGPU in a laptop with 4GB of VRAM, and it works fine-ish: I usually offload some layers to the GPU and keep the rest on the CPU, which is still faster than pure CPU, so whatever.
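The split the comment describes (some layers on GPU, the rest on CPU) corresponds to llama.cpp's `-ngl` / `--n-gpu-layers` option, which llamafile also accepts. A minimal sketch; the model filename and layer count are illustrative, not from the comment:

```shell
# Offload 16 transformer layers to the GPU and run the rest on the CPU.
# On a 4GB card, raise or lower -ngl until the model stops overflowing VRAM.
# (model path and layer count here are hypothetical examples)
./llamafile -m mistral-7b-instruct-v0.2.Q4_K_M.gguf \
    -ngl 16 \
    -p "Hello"
```

With `-ngl 0` everything stays on the CPU; with a value larger than the model's layer count, everything that fits goes to the GPU.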