Comment by megadragon9

2 months ago

I'm continuing to expand my own deep learning library [1] (a PyTorch clone built with Python and NumPy) to support LLM post-training techniques like supervised fine-tuning (SFT) [2] and reinforcement learning with GRPO [3]. Working without the high-level abstractions is a good learning experience: first "build a wheel", then "use that wheel to build a car". Post-training results are still cooking, since training on my MacBook Pro is quite slow with "unoptimized PyTorch" :)

1. https://github.com/workofart/ml-by-hand

2. https://github.com/workofart/ml-by-hand/blob/main/examples/s...

3. https://github.com/workofart/ml-by-hand/blob/main/examples/g...
