Comment by simonw

18 hours ago

I just got a much better version using this command instead, which uses the maximum image size according to https://github.com/openai/openai-cookbook/blob/main/examples...

  OPENAI_API_KEY="$(llm keys get openai)" \
    uv run 'https://raw.githubusercontent.com/simonw/tools/refs/heads/main/python/openai_image.py' \
    -m gpt-image-2 \
    "Do a where's Waldo style image but it's where is the raccoon holding a ham radio" \
    --quality high --size 3840x2160

https://gist.github.com/simonw/88eecc65698a725d8a9c1c918478a... - I found the raccoon!

I think that image cost 40 cents.

Fed into a clear Claude Code max effort session with: "Inspect waldo2.png, and give me the pixel location of a raccoon holding a ham radio." It sliced the image into small sections and gave:

"Found the raccoon holding a ham radio in waldo2.png (3840×2160).

  - Raccoon center: roughly (460, 1680)
  - Ham radio (walkie-talkie) center: roughly (505, 1650) — antenna tip around (510, 1585)
  - Bounding box (raccoon + radio): approx x: 370–540, y: 1550–1780

  It's in the lower-left area of the image, just right of the red-and-white striped souvenir umbrella, wearing a green vest."

Which is correct!
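
For anyone wanting to sanity-check coordinates like these, here is a rough sketch of the "slice into sections" idea. I don't know what tile size or method Claude Code actually used; the 960x540 tiles and the helper names `tiles` and `contains` are my own assumptions, and only the image size and the reported coordinates come from the output above.

```python
# Sketch only: the tile size and helper names are assumptions,
# not a record of what Claude Code actually did.
from typing import Iterator, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right/bottom exclusive


def tiles(width: int, height: int, tile_w: int, tile_h: int) -> Iterator[Box]:
    """Yield tile boxes row by row, clipping the last row/column to the image."""
    for top in range(0, height, tile_h):
        for left in range(0, width, tile_w):
            yield (left, top, min(left + tile_w, width), min(top + tile_h, height))


def contains(box: Box, point: Tuple[int, int]) -> bool:
    """True if the pixel lies inside the box."""
    left, top, right, bottom = box
    x, y = point
    return left <= x < right and top <= y < bottom


WIDTH, HEIGHT = 3840, 2160          # image size from the quoted output
raccoon_center = (460, 1680)        # reported values, copied from above
radio_center = (505, 1650)
antenna_tip = (510, 1585)
reported_box = (370, 1550, 540, 1780)

all_tiles = list(tiles(WIDTH, HEIGHT, 960, 540))  # assumed 4x4 grid
hit = next(t for t in all_tiles if contains(t, raccoon_center))
points_inside = all(
    contains(reported_box, p) for p in (raccoon_center, radio_center, antenna_tip)
)

print("tile containing raccoon:", hit)
print("reported points fall inside reported box:", points_inside)
```

If you have Pillow installed, `Image.open("waldo2.png").crop(hit)` should give you that one section to eyeball, since `crop` takes the same (left, upper, right, lower) order.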

  • We would need a larger sample size than just myself, but the raccoon was in the very first spot I looked. Found it literally immediately, as if that's where my eyes naturally gravitated to first. Hopefully that's just luck and not an indictment of the image-creating ability, as if there is some element missing from this "Where's Waldo" image, that would normally make Waldo hard to find.

    • There seemed to be more space around the raccoon than most other subjects. Zoomed out it appears as almost a “halo” highlighting the raccoon.

Funny how it can look convincing from far away but once you zoom in you find out most characters have a mix of leprosy and skin cancer.

A startling number of people either have no arms, one arm, a half of an arm, or a shrunken arm; how odd!

  • There is a leg that sprouts into part of a bush; perhaps that's where people's legs are disappearing to.

  • Finding the raccoon was instant. Finding all the weird AI artifacts is more fun. It's quite fascinating really. As usual it looks impressive at a glance but completely falls apart on closer inspection. I also didn't find any jokes, unless maybe the bridge to nowhere or finger posts pointing both ways counts?

The faces...that's nice that it turned a kid's book into an abomination

  • By image generation standards this is a ridiculously good result. No surprise that people instantly find the new limits, but they are new limits.

  • It's interesting that the raccoon is well defined because it was part of the request, but none of the other fauna are.

Can it generate a non-Halloween version though?

This lower-is-better danse macabre, nightmare-inducing ratio feels like an interesting proxy for model capability.

>I think that image cost 40 cents.

Kinda made me sad assuming the author didn't license anything to OpenAI.

I recognize it could revert (99% of?) progress if all the labs moved to consent-based training sets exclusively, but I can't think of any other fair way.

$.40 does not represent the appropriate value to me considering the desirability of the IP and its earning potential in print and elsewhere. If the world has to wait until it’s fair, what of value will be lost? (I suppose this is where the big wrinkle of foreign open weight models comes in.)

  • License what? The concept of a hidden object search? The only stylistic similarity here is the viewing angle. Where’s Waldo comics are flat, brightly colored line drawings that look nothing like this at all.

    • Well, I recognized the style from even the new physical books on sale today, but I don’t know art well enough to use a term like flat.

      I am not an art expert, but I’m perhaps a reasonable consumer, and there is a possibility of confusion if someone sells AI Where’s Waldo knockoff books at the dollar store, at least until I take a closer look.