Comment by Reimersholme

6 years ago

> And that all ignores one of the major issues which Yann entirely skips, but which Timnit covers in some of her work: training on data, even "representative data" encodes the biases that are present in the world today.

Isn't that too obvious to need stating? Any model learns from the data it's given, and if the data is biased the model will be biased, just as Yann says in the tweet you link:

"People are biased. Data is biased, in part because people are biased. Algorithms trained on biased data are biased."

If he presented 'representative data' somewhere as a solution to all problems, then please link to it.