Comment by vlthr
6 years ago
I agree with all of your points about the diffusion of responsibility that is common in ML, though I think you may not be sensitive enough to the harmful framing being created by the "anti-bias" side.
The original locus of the debate was how the recent face-depixelation paper (PULSE) turned out to depixelate pictures of black faces into ones with white features. That discovery is an interesting and useful showcase of how ML can exhibit unexpected racial bias, and it deserves to be talked about.
As often happens, the nuances of what exactly this discovery means and what we can learn from it quickly got simplified away. Just hours later, the paper was being showcased as a prime example of unethical and racist research. When LeCun originally commented on this, I took his point to be pretty simple: for an algorithm trained to depixelate faces, it's no surprise that it fills in the blanks with white features, because that's just what the Flickr-Faces-HQ (FFHQ) dataset looks like. Had it been trained on a majority-black dataset, we would expect the inverse.
That in no way dismisses all of the real concerns people have (and should have!) about bias in ML. But many of the paper's critics seem far too willing to catastrophize about how irresponsible and unethical it is. LeCun's original point was (as I understand it) that this criticism goes overboard, given that the training dataset is an obvious culprit for the observed behavior.
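To make the dataset-prior argument concrete, here is a toy sketch in Python/NumPy. This is my own illustration, not the actual algorithm from the paper, and the 90/10 split and feature values are invented: the point is only that when a pixelated input is too coarse to distinguish groups, a reconstruction trained to minimize squared error falls back on the training mean, which sits near whichever group the dataset over-represents.

    import numpy as np

    rng = np.random.default_rng(0)

    # 1-D stand-in for a facial feature: 90% of "training faces" near 0.2
    # (over-represented group), 10% near 0.8 (under-represented group).
    # These numbers are invented purely for illustration.
    train = np.concatenate([rng.normal(0.2, 0.05, 9000),
                            rng.normal(0.8, 0.05, 1000)])

    # Suppose pixelation is so coarse that both groups produce the same
    # observation. The reconstruction that minimizes expected squared
    # error is then the training mean: the dataset prior picks the output.
    print(train.mean())    # ~0.26, pulled toward the over-represented mode

    # Flip the mix and the same ambiguous input reconstructs near the
    # other mode instead: the "inverse" behavior described above.
    flipped = np.concatenate([rng.normal(0.8, 0.05, 9000),
                              rng.normal(0.2, 0.05, 1000)])
    print(flipped.mean())  # ~0.74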
Following his original comment, he has been met with some extremely uncharitable responses. The most circulated example is this tweet (https://twitter.com/timnitGebru/status/1274809417653866496?s...) where a bias-in-ML researcher calls him out without so much as a mention of why he is wrong, or even what he is wrong about. LeCun responds with a 17-tweet thread clarifying his stance, and her response is to claim that educating him is not worth her time (https://twitter.com/timnitGebru/status/1275191341455048704?s...).
The overwhelming attitude there and elsewhere is in support of the attacker. Not of the attacker's arguments - they were never presented - but of the symbolic identity she takes on as the anti-racist fighting the racist old elite.
I apologize if my frustration with their behavior shines through, but it really pains me to see this identity-driven mob mentality take hold in our community. Fixing problems requires talking about them and understanding them, and this really isn't it.
I think this is relevant: https://twitter.com/AnimaAnandkumar/status/12711371765294161...
An Nvidia AI researcher calling out OpenAI's GPT-2 as horrible because it's trained on Reddit (more precisely, on the contents of pages linked from Reddit submissions, and I'm not sure whether that's the only data source).
Reddit is supposedly not a good source of data for training NLP models because it's... racist? sexist? It's not as if Reddit even leans right in general...
Anyway, the table looks horrific. Why would they include these results? Oh, it turns out the paper was about bias: https://arxiv.org/pdf/1909.01326.pdf
Anyway, one can toy with GPT-2 Large (the paper uses GPT-2 Medium, so results might differ) at talktotransformer.com.
"The woman worked as a ": 2x receptionist, teacher's aide, waitress. Man: waiter, fitness instructor, spot worker, (construction?) engineer. Black man: farm hand, carpenter, carpet installer(?), technician. White man: assistant architect, [carpenter but became a shoemaker], general in the army, blacksmith.
I admit I didn't read the paper, so maybe I'm missing something here. But these tweets read like a call for whoever is responsible to be fired.
Very well articulated, thank you!