Comment by iforgotpassword

3 days ago

What I don't get is why they don't at least assign a dev or two to make the poster child of this work: llama.cpp

It's the first thing anyone tries when dabbling in AI or GPU compute, yet it's a clusterfuck to get working. A few blessed cards work, with the proper drivers and kernel; others just crash, perform horribly slowly, or output GGGGGGGGGGGGGG to every input (I'm not making this up!). Then you LOL, dump it, go buy Nvidia, et voilà: stuff works on the first try.

It does work; I have it running on my Radeon VII Pro.

  • It sometimes works.

    • How so? It's rock solid for me. I use ollama, but it's based on llama.cpp.

      It's also quite fast, probably because that card has fast HBM2 memory (it has the same memory bandwidth as a 4090). And it was really cheap, as it was on deep sale as an outgoing model.
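      For anyone wanting to try this setup, a minimal sketch of building llama.cpp with ROCm/HIP support; the exact CMake flag names follow llama.cpp's build docs at the time of writing, and the `gfx906` target (the Radeon VII's architecture) is an assumption you should check against your own card with `rocminfo`:

      ```shell
      # Sketch only: assumes ROCm is already installed and working (check with rocminfo).
      git clone https://github.com/ggerganov/llama.cpp
      cd llama.cpp

      # gfx906 = Radeon VII family; substitute your GPU's target from rocminfo.
      cmake -B build -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx906
      cmake --build build --config Release

      # Offload all layers to the GPU with -ngl; model path is a placeholder.
      ./build/bin/llama-cli -m /path/to/model.gguf -ngl 99 -p "Hello"
      ```

      If the build flags have changed in your checkout, the HIP section of llama.cpp's own build documentation is the authoritative reference.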
