← Back to context

Comment by GaggiX

1 day ago

You can do it on one single GPU but you would need to use gradient accumulation and the training would probably last 1-2 months on a consumer GPU.

Accepting the 1-2 month estimate at face value we're firmly in hobby territory now. Any adult with a job and a GPU can train their own models for an investement roughly equivelent to a high end gaming machine. Let's run some very hand wavy numbers:

RTX 4090 ($3000) + CPU/Motherboard/SSD/etc ($1600) + two months at full power ($300) is only on the order of $5000 initial investment for the first model. After that you can train 6 models per year to your exact specifications for an extra $150 per month in power usage. This cost will go down.

I'm expecting an explosion of micro-AI models specifically tailored for very narrow use cases. I mean Hugging face already has thousands of models, but they're mostly reusing the aligned big corporate stuff. What's coming is an avalanche of infinately creative micro-AI models, both good and bad. There are no moats.

It's going to be kind of like when teenagers got their hands on personal computers. Oh wait....