
Comment by dabockster

3 hours ago

The title is extremely misleading - you have to rent time on an H100 cluster to get it to work. It is not on-device, and thus not truly $100.

I was really excited, too, until I looked through the readme files and the code.

I feel the same. The title made it sound like I could have on-device ChatGPT for $100, forever. I didn't imagine it was about training the model myself.

  • Since the resulting model is only ~561M parameters you could run it on a Raspberry Pi that costs less than $100.

What's misleading about that? You rent $100 of time on an H100 to train the model.