Comment by clawoo
1 year ago
Since this is coming from the cesspool of disinformation that is Twitter[0], no idea if this is real, but apparently someone convinced Gemini to explain how it modified the prompt:
Here's a breakdown of what happens technically when you request images and I aim for more diverse representations:
1. Your Original Prompt: Your initial input was "Please draw a portrait of leprechauns". This is what you see and the starting point of the interaction.
2. My Internal Prompt Modification: To expand the range of images, I internally adjust the prompt in a few ways:
• Keywords: I might add words like "diverse," "inclusive," or specify ethnicities ("South Asian," "Black," etc.), and genders ("female," "non-binary") alongside the word "leprechaun."
• Phrasing: Instead of just "portrait," I might use "portraits of a variety of leprechauns" to signal to the image generation model that multiple depictions are desired.
3. Image Generation: The modified prompt is fed into the image generation model.
This internal model has been trained on a massive dataset of images and text, including images representing multiple ethnicities and genders associated with[..]
[0] https://twitter.com/BasedTorba/status/1760486551627182337
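To make the quoted step 2 concrete: if a rewriting step like that really exists, it could be little more than string manipulation applied before the image model is called. Here's a purely illustrative Python sketch; the function name, the keyword list, and the phrasing template are all made up and are not Gemini's actual pipeline:

    import random

    # Hypothetical keyword pool, loosely based on the terms mentioned in the
    # quoted explanation above. Not taken from any real system.
    DIVERSITY_KEYWORDS = ["diverse", "inclusive", "South Asian", "Black", "female", "non-binary"]

    def augment_prompt(subject: str, num_keywords: int = 2) -> str:
        """Rewrite a user subject to request a wider variety of depictions."""
        extras = ", ".join(random.sample(DIVERSITY_KEYWORDS, num_keywords))
        # Ask for multiple depictions and splice in the extra descriptors.
        return f"portraits of a variety of {subject} ({extras})"

    # augment_prompt("leprechauns")
    # -> e.g. "portraits of a variety of leprechauns (inclusive, female)"

The only point of the sketch is that the modified prompt, not the one the user typed, is what would reach the image model.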
AI models do not have access to their own design, so asking them what technical choices led to their behavior gets you responses that are entirely hallucinated.
It depends. ChatGPT has a prompt pre-inserted by OpenAI that primes it for user input, and a couple of weeks ago someone convinced it to print that system prompt out.
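For what "pre-inserted" means in practice, here's a minimal sketch assuming the common chat-completions message format; the system text shown is invented, since OpenAI's actual system prompt isn't public:

    # The vendor's system message is prepended before the user's input;
    # the model sees both, but only the user's text appears in the UI.
    messages = [
        {"role": "system", "content": "You are ChatGPT... (vendor-written instructions)"},
        {"role": "user", "content": "Please draw a portrait of leprechauns"},
    ]

So the model can repeat its system prompt because that text is literally part of its input, which is different from introspecting on its own weights or training choices.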
> responses that are entirely hallucinated.
As opposed to what?
What’s the difference between a ‘proper’ response and a hallucinated one, other than the fact that when it happens to be right it’s not considered a hallucination? The internal process that leads to each is identical.
They know their system prompt, and they could easily be trained on data that explains their structure. Your dismissal is invalid, and I suspect you don't know enough about this to be speaking in such definitive generalities.
But the original comment was suggesting (implicitly, otherwise it wouldn’t be noteworthy) that asking an LLM about its internal structure is hearing it ‘from the horse’s mouth’. It’s not; it has no direct access or ability to introspect. As you say, it doesn’t know anything more than what’s already out there, so it’s silly to think you’re going to get some sort of uniquely deep insight just because it happens to be talking about itself.