Comment by philkuz
2 years ago
Caveat buried in the abstract is that this beats BERT and non-pretrained Transformers. Looks like GPT-style models should still be better, but naturally at a higher computation cost.
Gzipping every query together with all the training data can get expensive too, since inference cost scales with the size of the training set.
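For context, here's a minimal sketch of the kind of compressor-based classification the paper describes, as I understand it (the function names and toy data are mine, not from the paper): each query is compressed concatenated with every training sample to get a normalized compression distance, then classified by nearest neighbor. This is why cost grows with training-set size.

```python
import gzip

def clen(s: str) -> int:
    # Length of the gzip-compressed string, used as a proxy for Kolmogorov complexity
    return len(gzip.compress(s.encode()))

def ncd(a: str, b: str) -> float:
    # Normalized compression distance between two strings
    ca, cb = clen(a), clen(b)
    return (clen(a + " " + b) - min(ca, cb)) / max(ca, cb)

def classify(query: str, train: list) -> str:
    # 1-nearest-neighbor by NCD: one full pass over the training data per query
    return min(train, key=lambda t: ncd(query, t[0]))[1]

train = [
    ("the cat sat on the mat", "animals"),
    ("stocks rallied on earnings news", "finance"),
    ("the dog chased the ball", "animals"),
]
print(classify("a dog and a cat played", train))
```

Every call to `classify` runs gzip once per training example, so with n training samples and m queries you pay O(n*m) compressions, versus a single forward pass per query for a trained model.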