Comment by growingswe

2 months ago

Great stuff! I wrote an interactive blogpost that walks through the code and visualizes it: https://growingswe.com/blog/microgpt

> By the end of training, the model produces names like "kamon", "karai", "anna", and "anton". None of them are copies from the dataset.

All 4 are in the dataset, btw

  • This is likely because the blog is AI-generated and keys off this line from Karpathy: "As a preview, by the end of the script our model will generate (“hallucinate”!) new, plausible-sounding names." The LLM just repackaged that into a claim that is obviously wrong, which is kind of ironic.

This is awesome! Normally I'm pretty critical of LLM-assisted blogging, but this one's a real winner.

That’s beautifully done, thanks for posting. It's as helpful to an ML novice like me as Karpathy’s original.