Comment by car

4 days ago

Thank you for the follow-up! Big fan of your models here, thanks for everything you are doing!

Works fine on macOS now (chat only).

On Ubuntu 24.04 with two GPUs (3090 + 3070), it appears that llama.cpp sometimes uses the CPU and not the GPU. This is judging from the tok/s and CPU load for identical models run with US-studio vs. just llama.cpp (bleeding edge).
