Comment by dymk

15 hours ago

Is there? The moment you look closely at the puzzle (which is... the whole point of Where's Waldo), you notice all the deformities and errors.

Yes, it’s not there yet. But nothing unsolvable. First thing that comes to mind would be generating smaller portion at the same resolution, then expand through tiling (although one might need to use another service & model for this), like we used to do with Stable Diffusion years ago.

Another option would be generating these large images, splitting them into grids, and using inpainting on each "tile" to improve the details. Basically the reverse of the first one.

Both significantly increase costs, but for the second one having what Images 2.0 can produce as an input could help significantly improve the overall coherence.