Comment by mrandish
5 days ago
A good chunk of that was one-time gains from shifting GPU and memory architectures to better match what LLMs need at scale, plus some algorithmic improvements. Most of the low-hanging architectural optimization has already been harvested. We'll certainly see more algorithmic gains, but the consensus is they'll generally be smaller and less frequent.
There's always a chance we'll see dramatic gains far larger than DeepSeek's optimizations a year ago, but nothing at even that scale has happened since. It would be nice, but I certainly wouldn't count on it.