Comment by maaaaattttt
5 months ago
It’s the most likely explanation, I believe. I have no idea about the content distribution of the training data, but I would have assumed Twitter and Reddit content would completely dwarf the literary content. It’s somewhat good if that’s indeed not the case!