Comment by megadragon9
4 hours ago
Interesting project. Do you think manual memory management helps you understand the computational graph lifecycle better, or does it distract from backprop itself?
btw, I went down the micrograd path with NumPy primitives, all the way to building a PyTorch clone that can pre-train and post-train LLMs (https://github.com/workofart/ml-by-hand). My learning focus was on the math/calculus <-> high-level APIs, rather than efficiency. I'm glad to see more people tackling this problem from different angles.
ngl, it distracts from backprop itself a little, but it teaches a lot about memory management. I did it this way because I wanted to get better at C in parallel, but if your aim is purely to work on ML fundamentals, it’s probably better to do it in Python.