Comment by thomasfl

9 hours ago

Is there some documentation for this? The code is probably the simplest (Not So) Large Language Model implementation possible, but it is not straightforward to understand for developers unfamiliar with multi-head attention, ReLU FFN, LayerNorm and learned positional embeddings.

This project shares similarities with Minix. Minix is still used at universities as an educational tool for teaching operating system design. Minix is the operating system that taught Linus Torvalds how to design (monolithic) operating systems. Similarly, having students add capabilities to GuppyLM is a good way to learn LLM design.

Give the code to an LLM and have a discussion about it.

  • does this work? there is no more need for writing high level docs?

    • > does this work?

      Absolutely. If you loaded this into an agentic coding harness with a decent model, I can practically guarantee it would be able to help you figure out what's going on.

      > there is no more need for writing high level docs?

      Absolutely not. That would be like exploring a cave without a flashlight, knowing that you could just feel your way around in the dark instead.

      Code is not always self-documenting, and can often tell you how it was written, but not why.


    • There are so many blogs and tutorials about this stuff in particular that I wouldn't worry about it being outside the training data distribution for modern LLMs. If you have a scarce topic in some obscure language, I'd be more careful when learning from LLMs.

    • LLMs can tell you what the code does but not why the developer chose to do it that way.

      Also, large codebases are harder to understand. But projects like these are simple to discuss with an LLM.
