Comment by verma7
2 months ago
I wrote a C++ translation of it: https://github.com/verma7/microgpt/blob/main/microgpt.cc
2x the number of lines of code (~400 lines), 10x the speed.
The hard part was figuring out how to represent the Value class in C++ (I ended up using shared_ptrs).
I made an explicit reverse pass (no autodiff); it was 8x faster in Python.
I made an explicit double-reverse pass (no code!), it was 80x faster in my head!
"I've got an ipod -- In My Mind"
https://theonion.com/i-have-an-ipod-in-my-mind-1819584018/
code here, it's just not interesting to look at:
https://news.ycombinator.com/item?id=47220542
Tradeoff worth naming: you avoid the overhead of building and walking the autodiff graph (hence the speedup), but any architecture change means re-deriving and rewriting every gradient by hand. Fine for a pedagogical project, but that's exactly why autodiff exists.
Can you share a link?
https://www.ideone.com/VAz4Nn
It doesn't run inside Ideone because of the external download link, but you can copy and paste the code over.
24x speedup (vs. the 10x already reported) and a similar loss profile, for a C++ version optimized by Claude: https://gist.github.com/freakynit/3982eab8413a89941bd0018e63......
This is amazing! Thanks for optimizing the code using Claude!