Comment by WillPostForFood

1 year ago

The problem you’re describing is that AI models have no reliable connection to objective reality.

That is a problem, but not the problem here. The problem here is that the humans at Google are overriding training data that would otherwise produce a reasonable result. Google is probably doing something similar to OpenAI. This is from the leaked OpenAI prompt:

Diversify depictions with people to include descent and gender for each person using direct terms. Adjust only human descriptions.

Your choices should be grounded in reality. For example, all of a given occupation should not be the same gender or race. Additionally, focus on creating diverse, inclusive, and exploratory scenes via the properties you choose during rewrites. Make choices that may be insightful or unique sometimes.

Use all possible different descents with equal probability. Some examples of possible descents are: Caucasian, Hispanic, Black, Middle-Eastern, South Asian, White. They should all have equal probability.

That is an example of adjusting generative output to mitigate bias in the training data.
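
For intuition, here is roughly what such a rewrite layer could look like. To be clear, this is a guess at the shape of the thing, not leaked code; every name below is invented for illustration:

    # Hypothetical sketch of the prompt-rewriting layer described above.
    # Every name here is made up; this is not anyone's actual code.

    DIVERSITY_INSTRUCTIONS = (
        "Diversify depictions with people to include descent and gender "
        "for each person using direct terms. Adjust only human descriptions."
    )

    def rewrite_request(user_request: str, complete) -> str:
        # `complete` stands in for whatever internal LLM call the vendor
        # actually makes; the user never sees this intermediate step.
        return complete(
            system=DIVERSITY_INSTRUCTIONS,
            user="Rewrite this image request: " + user_request,
        )

    # Toy stand-in so the sketch runs without a real model:
    def fake_complete(system: str, user: str) -> str:
        return "[" + system[:20] + "... applied to] " + user

    print(rewrite_request("the U.S. founding fathers", fake_complete))

The image model only ever sees the rewritten text, which is why the injected instructions win over the training data.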

To you and me, it is obviously stupid to apply that prompt to a request for an image of the U.S. founding fathers, because we already know what they looked like.

But generative AI systems only work one way. And they don’t know anything. They generate, which is not the same thing as knowing.

One could update the quoted prompt to include “except when requested to produce an image of the U.S. founding fathers.” But I hope you can appreciate the scaling problem with that approach.
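
Concretely, that approach means maintaining a carve-out list with no natural end. A hypothetical sketch (again, all names mine):

    # Why per-request carve-outs don't scale: the exception list
    # has no natural end.

    HISTORICAL_EXCEPTIONS = [
        "u.s. founding fathers",
        "roman emperors",
        "medieval english kings",
        # ...one entry per historical subject, added after each
        # complaint, forever. The model still doesn't know any of
        # this; it only pattern-matches strings.
    ]

    def should_diversify(request: str) -> bool:
        r = request.lower()
        return not any(exc in r for exc in HISTORICAL_EXCEPTIONS)

    print(should_diversify("portrait of a software engineer"))        # True
    print(should_diversify("painting of the U.S. founding fathers"))  # False

Every historically specific subject needs its own entry, and you only discover the missing ones after the model gets them wrong.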

  • What you're suggesting is certainly possible - and no doubt what Google would claim. But companies like Google could trivially obtain massive, representative training samples covering basically every sort of endeavor and classification of humanity throughout all of modern history on this entire planet.

    To me, this feels much more like Google intentionally trying to bias what was probably an otherwise representative sample, and hilarity ensuing. But it's actually quite sad too. Because these companies are really butchering what could be amazing tools for visually exploring our history - "our" being literally any person alive today.