Comment by drillsteps5

1 day ago

A decent gaming machine perfectly doubles as your friendly local inference server. Just start llama-server with the model of your choosing and start chatting with it through its Web interface or connect any chat completion-compatible client (agentic or not) which will use REST to send requests and receive responses. From any device on your network. Voila.

0 comments