Microgpt 3 days ago (karpathy.github.io) 3 comments WithinReason
mips_avatar 1 day ago This is so cool. Andrej has been single-handedly making the world of LLMs more interesting and democratized.
wrxd 2 days ago This post does a great job of showing that the core idea behind GPT is relatively simple. To do something useful it needs tons of data, and then everything becomes more and more complex to deal with.
WithinReason 3 days ago You can even strip out the autodiff and write an explicit backward pass; training time went down from 40s to 5s for me.
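For anyone wondering what "an explicit backward pass" looks like in practice, here is a minimal sketch. It is not the commenter's code and not taken from the post. It assumes a toy model (a single linear layer followed by softmax cross-entropy), and the names `forward`, `backward`, and `numerical_grad` are made up for illustration. The gradient is written out by hand instead of being recorded by an autodiff graph, then checked against finite differences.

```python
import math

def forward(W, x, target):
    """Linear layer followed by softmax cross-entropy; returns (loss, probs)."""
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in W]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return -math.log(probs[target]), probs

def backward(x, probs, target):
    """Hand-derived gradient dL/dW = outer(probs - onehot(target), x)."""
    dlogits = [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]
    return [[d * xi for xi in x] for d in dlogits]

def numerical_grad(W, x, target, eps=1e-6):
    """Central finite differences, used only to check the manual gradient."""
    grad = [[0.0] * len(x) for _ in W]
    for i in range(len(W)):
        for j in range(len(x)):
            orig = W[i][j]
            W[i][j] = orig + eps
            loss_plus, _ = forward(W, x, target)
            W[i][j] = orig - eps
            loss_minus, _ = forward(W, x, target)
            W[i][j] = orig  # restore the weight
            grad[i][j] = (loss_plus - loss_minus) / (2 * eps)
    return grad

# Toy weights and input (assumed values, chosen only for the check).
W = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.1]]
x = [1.0, 2.0, -1.0]
target = 1

loss, probs = forward(W, x, target)
manual = backward(x, probs, target)
numeric = numerical_grad(W, x, target)
max_err = max(abs(a - b) for ra, rb in zip(manual, numeric) for a, b in zip(ra, rb))
print("max abs difference between manual and numerical gradient:", max_err)
```

A plausible reason for a speedup like the one reported is that a hand-written backward pass avoids building and traversing a graph of per-scalar nodes, but that is a guess about the cause, not something measured here.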