Comment by Aurornis

1 year ago

> If you ask generative AI for a picture of a "nurse", it will produce a picture of a white woman 100% of the time, without some additional prompting or fine tuning that encourages it to do something else.

> If you ask a generative AI for a picture of a "software engineer", it will produce a picture of a white guy 100% of the time, without some additional prompting or fine tuning that encourages it to do something else.

Neither of these statements is true, and you can verify it by prompting any of the major generative AI platforms more than a couple times.

I think your comment is representative of the root problem: The imagined severity of the problem has been exaggerated to such extremes that companies are blindly going to the opposite extreme in order to cancel out what they imagine to be the problem. The result is the kind of absurdity we’re seeing in these generated images.

Note:

> without some additional prompting or fine tuning that encourages it to do something else.

That tuning has been done for all major current models, I think? Certainly, early image generation models _did_ have issues in this direction.

EDIT: If you think about it, it's clear that this is necessary; a model which only ever produces the average/most likely thing based on its training dataset will produce extremely boring and misleading output (and the problem will compound as its output gets fed into other models...).

  • Why is it necessary? There are 1.4 billion Chinese, 1.4 billion Indians, 1.2 billion Africans, 0.6 billion Latinos, and 1 billion white people. Those numbers don't have to be perfect, nor do they divide cleanly into white/non-white, but taken as is they suggest there should be roughly 5 non-white nurses for every 1 white nurse. Maybe it's less, maybe more, but there's no way "white" should be the default.

    • But that depends on context. If I asked for "a picture of a Nigerian nurse", the result should overwhelmingly be a Black person. If I asked for "a picture of a Finnish nurse", it should almost always be a white person.

      That probably can be done and may work well already, not sure.

      But the harder problem is that since I'm from a country where at least 99% of nurses are white, it's really natural for me to expect a picture of a nurse to show a white person by default.

      But a person from China probably expects a picture of a nurse to show a Chinese person!

      But of course the model has no idea who I am.

      So, yeah, this seems like a pretty intractable problem to just DWIM. Then again, the whole AI thingie was an intractable problem three years ago, so...


    • If the training data was a photo of every nurse in the world, then that’s what you’d expect, yeah. The training set isn’t a photo of every nurse in the world, though; it has a bias.

    • Honest, if controversial, question: beyond virtue signaling what problem is debate around this topic intended to solve? What are we fixing here?

    • If the prompt is in English it should presume an American/British/Canadian/Australian nurse, and represent the diversity of those populations. If the prompt is in Chinese, the nurses should demonstrate the diversity of the Chinese speaking people, with their many ethnicities and subcultures.


> Neither of these statements is true, and you can verify it by prompting any of the major generative AI platforms more than a couple times.

Platforms that modify prompts to insert modifiers like "an Asian woman" or platforms that use your prompt unmodified? You should be more specific. DALL-E 3 edits prompts, for example, to be more diverse.
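The rewriting being described happens before the image model ever sees the prompt. A minimal sketch of how such a pipeline could work is below; the trigger words, modifier list, and function names are all hypothetical, invented for illustration, and are not DALL-E 3's actual implementation.

```python
import random

# Hypothetical sketch: a platform-side layer that appends a randomly
# chosen demographic modifier to prompts that appear to describe a person.
# Trigger words and modifiers are invented for this illustration.
PEOPLE_WORDS = {"nurse", "doctor", "engineer", "teacher", "person"}
MODIFIERS = [
    "an Asian woman",
    "a Black man",
    "a Hispanic woman",
    "a white man",
    "a Middle Eastern woman",
]

def rewrite_prompt(prompt: str, rng: random.Random) -> str:
    """Append a demographic modifier if the prompt mentions a person."""
    words = {w.strip(".,").lower() for w in prompt.split()}
    if words & PEOPLE_WORDS:
        return f"{prompt}, depicted as {rng.choice(MODIFIERS)}"
    return prompt  # non-person prompts pass through unchanged

rng = random.Random(0)
print(rewrite_prompt("a photo of a nurse", rng))
print(rewrite_prompt("a photo of a mountain", rng))
```

The point of the sketch is that the user never sees the modified prompt, which is exactly why "just prompt the platform a few times" can't distinguish an unbiased base model from a biased one wrapped in a rewriting layer.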

> Neither of these statements is true, and you can verify it by prompting any of the major generative AI platforms more than a couple times.

Were the statements true at one point? Have the outputs changed? (Due to either changes in training, algorithm, or guardrails?)

A newer problem is that neither the software versions nor the guardrails are transparent.

Try something that may not have guardrails yet: try to get an output of a "Jamaican man" who isn't black. Even if you add blonde hair, the output will still be a black man.

Edit: similarly, try asking ChatGPT for a "Canadian" and see if you get anything other than a white person.