Comment by danielhanchen
1 year ago
Yeah, it was indeed very gruelling, but very fun!! I used torch.dist everywhere, read all the implementations side by side to compare them, and had to manually inspect the losses, plot them, etc. It's a bit hard to automate, sadly, since new archs cause new issues.
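To give a rough idea of what the torch.dist check looks like, here's a minimal sketch. It's not the actual code I used: the two RMS norm functions, their names, and the tensor shapes are all made up for illustration. It just runs the same input through a reference version and a rewritten version and prints the L2 distance between the outputs.

```python
import torch

def reference_rms_norm(x, weight, eps=1e-6):
    # Hypothetical "known good" version: mean of squares, computed in float32.
    variance = x.float().pow(2).mean(dim=-1, keepdim=True)
    return (x.float() * torch.rsqrt(variance + eps)).to(x.dtype) * weight

def candidate_rms_norm(x, weight, eps=1e-6):
    # Hypothetical rewrite under test: same math, expressed via the vector norm.
    rms = x.float().norm(dim=-1, keepdim=True) / (x.shape[-1] ** 0.5)
    return (x.float() / torch.sqrt(rms.pow(2) + eps)).to(x.dtype) * weight

torch.manual_seed(0)
x = torch.randn(4, 16, 64)   # assumed (batch, sequence, hidden) shape
weight = torch.randn(64)

# torch.dist returns the p-norm of (a - b); p=2 by default.
gap = torch.dist(reference_rms_norm(x, weight), candidate_rms_norm(x, weight)).item()
print(f"L2 distance between outputs: {gap:.3e}")
```

A gap near floating-point noise suggests the two versions agree on that input; a large gap tells you which layer to go read side by side. You still have to decide per layer and per dtype what counts as "close enough", which is part of why it's hard to automate.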