Comment by jaimeyap
1 month ago
I just got the ROCm branch compiled and running. Starting from one of the common Strix Halo ROCm toolboxes, I only needed to install a few more dependencies to get the repo to build. So far I've only tried the q2-imatrix model, and I'm seeing ~7.32 tok/s with a locally bound Claude Code session. That's pretty much unusably slow for agentic coding like this, with each round of thinking taking tens of minutes, but it does seem to be working. Suspiciously, amdgpu_top is only showing ~16GB of memory in use. Not sure if it's somehow misreading that.
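For a rough sense of scale, here is a minimal sketch of how decode speed translates into wait time. It assumes generation speed is the only cost (prompt processing is ignored), and the token counts in the loop are hypothetical round numbers, not measurements from this run.

```python
# Back-of-the-envelope: minutes spent generating a given number of tokens.
# Only the 7.32 tok/s figure comes from the comment; everything else is illustrative.

def minutes_for_tokens(tokens: int, tok_per_s: float = 7.32) -> float:
    """Return the minutes needed to generate `tokens` at `tok_per_s`."""
    if tok_per_s <= 0:
        raise ValueError("tok_per_s must be positive")
    return tokens / tok_per_s / 60.0


if __name__ == "__main__":
    # Hypothetical thinking-round sizes, chosen only to show the scaling.
    for tokens in (1_000, 5_000, 10_000):
        print(f"{tokens:>6} tokens: {minutes_for_tokens(tokens):.1f} min")
```

At this rate, a round of thinking has to run to several thousand tokens before it reaches the ten-minute mark, which fits the "tens of minutes per round" experience for long agentic turns.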