Comment by simonw
1 day ago
This thing's ability to produce entire infographics from a short prompt is really impressive, especially since it can run extra Google searches first.
I tried this prompt:
Infographic explaining how the Datasette open source project works
Here's the result: https://simonwillison.net/2025/Nov/20/nano-banana-pro/#creat...
This is legitimately game changing for a feature in my SaaS where customers can generate event flyers. Until now I had Nano Banana generate just a decorative border, and the actual text was rendered via Pillow, controlled by an LLM. The result worked, but didn't look good.
That said, I wonder if text is only good in small chunks (less than a sentence) or if it can properly render full sentences.
It can render full sentences.
It didn’t do so well at finding middle C on a piano keyboard:
https://gemini.google.com/share/c9af8de05628
I did manage to get one image of a piano keyboard where the black keys were correct, but not consistently.
I've tried similar stuff, such as: "Show a piano with an outstretched hand playing an Emaj triad on the E, G#, and B keys".
https://imgur.com/ogPnHcO
Even generating a standard piano with 7 full octaves that are consistent is pretty hard. If you ask it to invert the colors of the naturals and sharps/flats you'll completely break them.
The reflection seems slightly wrong as well.
Fooled me because it was locally correct!
It even worked really well at creating an infographic for one of my quirkier projects which doesn't have that much information online (other than its repo).
"An infographic explaining how player.html works (from the player.html project on Github). https://github.com/pseudosavant/player.html"
And then it made one formatted for social: "Change it to be an infographic formatted to fit on Instagram as a 1:1 square image."
Is the infographic accurate in terms of the way Datasette works?
Almost entirely. I called out the one discrepancy in my post:
> “Data Ingestion (Read-Only)” is a bit off.
It’s subtly incorrect. Read/write permissions, for example, are described incorrectly on some nodes.
Then the question becomes: can it incorporate targeted feedback, or is it a one-shot-or-bust affair?
My experience is that ChatGPT is very good at iterating on text (prose, code) but fairly bad at iterating on images. It struggles to integrate small changes, choosing instead to start over from scratch, with wildly different results. Thinking especially here of architectural stuff, where it does a great job laying out furniture in a room, but when I ask it to keep everything the same but change the colour of one piece, it goes completely off the rails.
None of it was accurate.
But boy was it beautiful.
Funny thing to say considering the author of Datasette himself says it's accurate.
I’ve been really excited about infographic generation. Previous models from Google and OpenAI had very low detail/resolution for these things.
I’ve found in general that the first generation may not be accurate but a few rolls of the dice and you should have enough to pick a style and format that works, which you can iterate on.
Game changer for architecture diagrams.
I'm finding it bad at instruction following for architectural specs (physical not software), where you tell it what goes where, and it ignores you and does some average-ish thing it's seen before. It looks visually appealing though.
Did you check if SynthID still works when you edit the photos with filters like grayscale?
It would be great if Google could make SynthID openly available so OpenAI etc could also implement it. Then websites like Facebook, or even local browsers, could implement an "AI warning".
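The kind of edit being asked about is a one-liner with Pillow. There is no public SynthID detector to test against, so this only produces the edited image one would feed to such a detector (the image here is a stand-in, not a real generated photo):

```python
from PIL import Image, ImageOps

# Stand-in for a model-generated photo.
img = Image.new("RGB", (64, 64), (120, 30, 200))

# Apply a grayscale filter, one of the edits a watermark must survive.
gray = ImageOps.grayscale(img)   # single-channel "L" image

# Convert back to RGB in case a detector expects three channels.
gray_rgb = gray.convert("RGB")
```

Whether SynthID survives this transform can only be verified with Google's own (non-public) detection tooling, which is part of why the open-availability point above matters.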
[dead]