Comment by epivosism
1 year ago
Dall-E 3 at least exposes the adjusted prompt. Here's an example; you can get it by hitting the API directly and looking at the revised_prompt field in the response.
https://twitter.com/eb_french/status/1760763534127010074
At least they show it to us, and you can try to convince the GPT that interprets your prompt not to do it quite as much (although the example above is one where I failed; it seems to be on to me, because the violation of what I'm asking for is so egregious).
Yup, I run an IRC bot with a !dall-e trigger (with protections so people don't run up my OpenAI bill!), and when I get the response back, my bot gives the revised prompt in addition to the image result URL.
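For anyone wanting to do the same, here is a minimal sketch of the reply-formatting step, not the actual bot code. It assumes the image response has already been parsed into a dict shaped like {"data": [{"url": ..., "revised_prompt": ...}]}; the function name format_bot_reply and the sample values are made up for illustration, and the API call, IRC plumbing, and billing protections are left out.

```python
from typing import Any


def format_bot_reply(response: dict[str, Any]) -> str:
    """Build a one-line chat reply from a parsed image-generation response.

    Assumes the shape {"data": [{"url": ..., "revised_prompt": ...}]}.
    Hypothetical helper for illustration only.
    """
    item = response["data"][0]
    url = item.get("url", "(no URL returned)")
    revised = item.get("revised_prompt")
    if revised:
        # Show the rewritten prompt next to the image link.
        return f"{url} | revised prompt: {revised}"
    return f"{url} | (no revised prompt in response)"


if __name__ == "__main__":
    # Placeholder data, not real API output.
    sample = {
        "data": [
            {
                "url": "https://example.com/image.png",
                "revised_prompt": "A placeholder revised prompt.",
            }
        ]
    }
    print(format_bot_reply(sample))
```

Falling back to a placeholder when revised_prompt is missing keeps the bot from crashing if a response comes back without it.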
Lots of added diversity in the prompts.
Note that I call it added diversity, not forced diversity, because if I ask for a specific race, it will give it to me, and does not override or refuse the requests like Gemini does. If I ask for a crowd of people, I don't mind it changing it to be a racially diverse crowd.
Semi-related note: those revised prompts are also nice because if you write a very non-specific prompt and get something you didn't expect, they give you insight into why you got what you got. You can see which details were added to your prompt.
Yeah, expanding prompts and making them specific via LLMs is a good idea. I would like to be able to see and fight better against the outer prompt in some situations, though. Dall-E 3 can be really persistent in certain cases.
The whole alleged theoretical reason for this doesn't work. No one has proposed a way to even implement a globally fair representation plan. So it just feels hacky that grievance groups very specific to the 21st-century USA show up in all images worldwide, from the USA to India to Rome to the Mongolian steppe.