Comment by mkagenius

3 months ago

> Nano Banana is still bad at rendering text perfectly/without typos as most image generation models.

I figured out that if you write the text in Google Docs and share a screenshot of it with Nano Banana, it will not make any spelling mistakes.

So a prompt like "Can you write my name on this Wimbledon trophy? Both images are attached; use them." will work.

2 comments

minimaxir 3 months ago

Google's example documentation for Nano Banana does demo that pipeline: https://ai.google.dev/gemini-api/docs/image-generation#pytho...

That's on my list of blog-post-worthy things to test, namely rendering the text to an image directly in Python and passing both input images to the model for compositing.

mkagenius 3 months ago

Yeah, close.
But it is still generating the logo from a prompt:
> Logo: "A simple, modern logo with the letters 'G' and 'A' in a white circle."
My idea was to do it manually so that there are no probabilities involved.
Though your idea of using Python is the same.