Comment by megadragon9

3 hours ago

Interesting project. Do you think manual memory management helps you understand the computational graph lifecycle better, or does it distract from backprop itself?

btw, I went down the micrograd path with NumPy primitives, all the way to building a PyTorch clone that can pre-train and post-train LLMs (https://github.com/workofart/ml-by-hand). My learning focus was on connecting the math/calculus to the high-level APIs, rather than on efficiency. I'm glad to see more people tackling this problem from different angles.

ngl, it distracts from backprop itself a little, but it teaches a lot about memory management. I did it this way because I also wanted to get better at C in parallel, but if your aim is purely to work on ML fundamentals, it’s probably better to do it in Python.