← Back to context

Comment by misterbrian

1 day ago

I'm working on inference.club, a distributed inference network for consumer hardware. Sign up with GitHub, get an API key, run an agent on your home network that registers your local inference resources with inference.club, set permissions for who can use your services, try out models in the playground and use the API. So far it supports the following models:

- LLMs (any OpenAI compatible API, vLLM, LM Studio, etc.) - image gen + image edit (flux klein) - text to speech (magpie, dia with voice cloning) - speech to text (OpenAI audio transcriptions + riva compatible) - image to textured 3d model (trellis2) - image+text to video (ltx2.3-gguf) - text to music (acestep)

currently it is just me and Claude vibing. While using Fable 5 moved all of my local inference services to k3s across 3 RTX 4090 PCs and my DGX Spark, now I can just tell Claude/Hermes/etc. to start and stop services.

inference.club is built with Tailscale's tsnet library. It is sort of like an OpenRouter built for different types of local AI models. inference.club also lets you showcase and share generated content. For example here is 90 seconds disco funk track generated by acestep: https://inference.club/s/Vxm6ozW24oBs_JGbPcq7tA

I was inspired by AI Horde, and wanted to see if I could build something that could support all of the model modalities that I use for generating short-form AI slop content on local hardware. This is also similar to Hugging Face Spaces, but running on consumer hardware with a common API. I've been watching the quality of local AI inference making massive improvements in quality and performance, and I want to make it easier for people to try "local AI" even if they don't have a GPU.