Comment by simonw

4 months ago

Yeah, if your goal is "build the tightest 8,000-line implementation of training an LLM from scratch, with a focus on both conciseness and educational value", I don't think it's particularly surprising that Claude/Codex weren't much help.

3 comments

fragmede 4 months ago

Now to wait for Sonnet 5 and GPT-6, and ask them to build that, and see what they come up with.

Tepix 4 months ago

Why would you expect an improvement?

bjord 4 months ago

because they'll be trained on karpathy's implementation