Comment by jasonjmcghee
13 days ago
I suppose the question is, are they also training a 288B x 128 expert (16T) model?
Llama 4 Colossus when?