
Comment by andybak

1 year ago

OK. Have a setting where you can choose either:

1. Attempt to correct inherent biases in training data and produce diverse output. (May sometimes produce results that are geographically or historically unrepresentative.)

2. Unfiltered. (Warning: will generate output that reflects biases and inequalities in the training data.)

Default to (1) and surely everybody is happy? It's transparent and clear about what it's doing and why. The default errs on the side of caution, but people can't complain if they can switch it off.

> 1. Attempt to correct inherent biases in training data and produce diverse output. (May sometimes produce results that are geographically or historically unrepresentative.)

The problem is that it wasn't "occasionally" producing unrepresentative images. It was doing so predictably for any historical prompt.

> Default to (1) and surely everybody is happy?

They did default to 1 and, no, almost nobody was happy with the result. It produced a cartoonish vision of diversity in which the realities of history and different cultures were forcibly erased and replaced with what often felt like caricatures inserted into out-of-context scenes. It also had some obvious racial biases in which races it felt necessary to exclude and which to over-represent.

  • > The problem is that it wasn't "occasionally" producing unrepresentative images. It was doing so predictably for any historical prompt.

    I didn't use the word "occasionally", and I think my phrasing is reasonably accurate. This feels like quibbling in any case; it could be rephrased without affecting the point I am making.

    > They did default to 1 and, no, almost nobody was happy with the result.

    They didn't "default to 1". Your statement doesn't make sense if there's no option to turn it off. Making it switchable is the entire point of my suggestion.

(1) is just playing Calvinball.

"Correcting" the output to reflect supposedly desired nudges towards some utopian ideal inflates the "value" of the model (and those who promote it) the same as "managing" an economy does by printing money. The model is what the model is and if the result is sufficiently accurate (and without modern Disney reimaginings) for the intended purpose you leave it alone and if it is not then you gather more data and/or do more training.

The issue is that the vast majority of people would prefer 2, and would be fine with Google's reasonable explanation that it is just reflective of the patterns in data on the internet. But the media would prefer 1, and if Google chooses 2 they will have to endure an endless stream of borderline-libelous hit pieces coming up with ever more convoluted examples of their "racism."

  • "Most" as in 51%? 99%? Can you give any justification for your estimate? How does it change across demographics?

    In any case, I don't think it's an overwhelming majority, especially if you apply some subtlety to how you define "want". What people say they want isn't always the same as the outcomes they would actually want if given an omniscient oracle.

    I also think that saying only the "media" wants the alternative is an oversimplification.