Comment by ai-christianson 10 days ago

This was trained on 6T tokens. Neat to see so many tokens used for such a small model.