Comment by sitkack
6 months ago
All of the most capable models I use have clearly been trained on the entirety of libgen/z-lib. You know it was the first thing they did; it's only about 100 TB.
Some of the models are even coy about it.
6 months ago
> All of the most capable models I use have clearly been trained on the entirety of libgen/z-lib. You know it was the first thing they did; it's only about 100 TB.
> Some of the models are even coy about it.
The models are not self-aware of their training data. They only know what the internet has said about previous models' training data.
I am not straight-up asking them. We all know the pithy statement about that word.