Comment by bigyabai 4 hours ago

You won't be RAM caching much of anything with experts that are 220b parameters worth of layers.
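For a rough sense of scale, here is a back-of-the-envelope sketch of how much memory 220b parameters of weights would take. The bytes-per-parameter figures are assumptions about common storage precisions, not something stated in the comment, and the estimate ignores activations, KV cache, and runtime overhead.

```python
# Rough weight-memory estimate. The precisions below are assumptions,
# not details taken from the comment.
PARAMS = 220e9  # parameter count quoted in the comment

# Assumed storage cost per parameter, in bytes.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_gb(params: float, bytes_per_param: float) -> float:
    """Return weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


footprints = {name: weight_gb(PARAMS, b) for name, b in BYTES_PER_PARAM.items()}

for name, gb in footprints.items():
    print(f"{name}: {gb:.0f} GB")
```

Under these assumptions the weights alone come to about 440 GB at 16-bit, 220 GB at 8-bit, and 110 GB at 4-bit, which is more than typical desktop RAM even at the lowest precision.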