Comment by jasonjmcghee
8 months ago
I suppose the question is, are they also training a 288B x 128 expert (16T) model?
Llama 4 Colossus when?