unleaded 10 days ago
ITT nobody remembers GPT-2 anymore and that makes me sad.

GaggiX 10 days ago
This model was trained on 6T tokens and has 256k embeddings, quite different from a GPT-2 model of comparable size.
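A minimal sketch, assuming the Hugging Face transformers library is installed, of how one can pull up GPT-2's config for comparison (GPT-2's BPE vocabulary is 50,257 entries; the 256k-vocabulary model discussed above is not named here, so no placeholder checkpoint is loaded for it):

    from transformers import AutoConfig

    # GPT-2's published config: 50,257-token vocabulary, 768-dim embeddings (124M model)
    gpt2_cfg = AutoConfig.from_pretrained("gpt2")
    print(gpt2_cfg.vocab_size)  # 50257
    print(gpt2_cfg.n_embd)      # 768
    # A 256k-entry vocabulary makes the embedding matrix roughly 5x larger per
    # embedding dimension, which is a large share of total parameters at this size.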