Comment by car

4 days ago

Thank you for the follow up! Big fan of your models here, thanks for everything you are doing!

Works fine on macOS now (chat only).

On Ubuntu 24.04 with two GPUs (3090 + 3070), it appears that llama.cpp sometimes runs on the CPU instead of the GPU. I'm judging this from the tokens/s and CPU load for identical models run with US-studio vs. just llama.cpp (bleeding edge).