Comment by megadragon9
2 months ago
I'm continuing to expand my own deep learning library [1] (a PyTorch clone built with Python and NumPy) to support LLM post-training techniques like supervised fine-tuning (SFT) [2] and reinforcement learning with GRPO [3]. Working without all the high-level abstractions is a good learning experience: first "build a wheel", then "use that wheel to build a car". Post-training results are still cooking, since training on my MacBook Pro is quite slow with "unoptimized PyTorch" :)
1. https://github.com/workofart/ml-by-hand
2. https://github.com/workofart/ml-by-hand/blob/main/examples/s...
3. https://github.com/workofart/ml-by-hand/blob/main/examples/g...
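
For readers who haven't met GRPO, here is a minimal NumPy sketch of its core step: several completions are sampled per prompt, and each one's reward is normalized against the mean and standard deviation of its own group. This is an illustration only, not code from the linked repository; the function name `group_relative_advantages` and the `eps` parameter are assumptions made for this example.

```python
import numpy as np

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within each group of sampled completions.

    rewards: array-like of shape (num_prompts, group_size), one scalar
             reward per sampled completion (hypothetical layout).
    eps:     small constant to avoid dividing by zero when every
             completion in a group gets the same reward.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    mean = rewards.mean(axis=1, keepdims=True)  # per-group mean
    std = rewards.std(axis=1, keepdims=True)    # per-group std
    return (rewards - mean) / (std + eps)

# Two prompts, four sampled completions each.
adv = group_relative_advantages([[1.0, 0.0, 0.0, 1.0],
                                 [2.0, 2.0, 2.0, 2.0]])
print(adv)
```

Completions that beat their group's average get a positive advantage and the rest get a negative one; a group where every reward is identical contributes zero advantage, so it gives no learning signal.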