Comment by dvt
12 hours ago
This is an amazing test, and it's kinda funny how terrible gpt-2-image is. I'd take "plagiarized" images (e.g. Google search and copy-paste) any day over the awful OpenAI result. It doesn't even seem like they have a sanity-checker/post-processing "did I follow the instructions correctly?" step, because the digit-style constraint violation should be easy to catch automatically. It's also expensive as shit for an image that's essentially unusable.
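The "sanity checker" step suggested above could look like a simple generate-then-verify loop. This is purely an illustrative sketch, not OpenAI's actual pipeline: `generate()` and `violates_constraint()` are made-up stand-ins for an image-model call and a post-processing check (e.g. OCR-ing the digits and comparing their style against the prompt's constraint).

```python
# Hypothetical generate-then-verify loop. Nothing here is a real
# OpenAI API; generate() and violates_constraint() are stand-ins.

def generate(prompt, attempt):
    # Stand-in for an image-model call; returns a fake "image" record.
    # In this toy, the model only satisfies the constraint on retry.
    return {"prompt": prompt, "digits_ok": attempt > 0}

def violates_constraint(image):
    # Stand-in for a checker, e.g. OCR the digits and verify the
    # requested style was actually used.
    return not image["digits_ok"]

def generate_with_sanity_check(prompt, max_retries=3):
    for attempt in range(max_retries):
        image = generate(prompt, attempt)
        if not violates_constraint(image):
            return image
    raise RuntimeError("constraint still violated after retries")

result = generate_with_sanity_check("chart with pixel-style digits")
```

The point isn't the retry loop itself but that a cheap post-hoc check could catch an easily detectable violation before the expensive result is handed back to the user.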
This is from Gemini - https://lens.usercontent.google.com/banana?agsi=CmdnbG9iYWw6...
Did it correctly follow the instructions? I don't know my Pokémon well enough to tell.
Essentially yes (the bottom got distorted), but Gemini uses Nano Banana Pro or Nano Banana 2, so it's not a surprising result. The image I linked uses the raw API.
That is interesting, because I feel gpt-image-1 did have that feature.
(source: https://chatgpt.com/share/69e83569-b334-8320-9fbf-01404d18df...)
You are comparing ChatGPT to a raw image model. These are two completely different things. ChatGPT takes your input, modifies the prompt, passes it to the image model, and then may read the resulting image and provide output. A raw image model accessed through the API just takes your prompt verbatim and generates an image.
Nano Banana Pro and ChatGPT Images 2.0 also tweak the prompt, because they can think.
I wouldn't say it's terrible, but I also wouldn't say it's a huge step forward in quality compared to what I've seen before from AI.