Comment by peetle

8 days ago

In my own experience, nano banana still has the tendency to:

- make massive, seemingly random edits to images
- adjust image scale
- make very fine-grained but pervasive detail changes that are obvious in an image diff

For instance, I have found that nano-banana will sporadically add a (convincing) fireplace to a room or a new garage behind a house. This happens even with explicit "ALL CAPS" instructions not to do so, and even when the temperature is set to zero, which makes it impossible to build a reliable app.

Has anyone had a better experience?
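
For anyone who wants to quantify the kind of drift described above, here is a minimal sketch of an image-diff check. It assumes both images are already decoded into equal-sized 2D lists of (r, g, b) tuples; the function name `changed_fraction` and the `tolerance` parameter are made up for illustration and are not part of any model's API. A size mismatch is reported as an error, since that would correspond to the "adjust image scale" case.

```python
def changed_fraction(before, after, tolerance=0):
    """Return the fraction of pixels that differ by more than `tolerance`.

    `before` and `after` are 2D lists (rows) of (r, g, b) tuples.
    This is a hypothetical helper, not part of any image model's API.
    """
    same_height = len(before) == len(after)
    same_widths = all(len(a) == len(b) for a, b in zip(before, after))
    if not (same_height and same_widths):
        # A dimension mismatch means the edit rescaled or cropped the image.
        raise ValueError("image dimensions differ (possible rescale)")

    total = 0
    changed = 0
    for row_before, row_after in zip(before, after):
        for p, q in zip(row_before, row_after):
            total += 1
            # Compare the largest per-channel difference against the tolerance.
            if max(abs(x - y) for x, y in zip(p, q)) > tolerance:
                changed += 1
    return changed / total if total else 0.0


# Usage: a 4x4 black image where the edit touched exactly one pixel.
original = [[(0, 0, 0)] * 4 for _ in range(4)]
edited = [row[:] for row in original]
edited[0][0] = (255, 0, 0)
fraction = changed_fraction(original, edited)
```

A small `tolerance` helps separate re-encoding noise from real edits, but the threshold that counts as "pervasive" is a judgment call for each app.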

5 comments


andblac 8 days ago

The "ALL CAPS" part of your comment got me thinking. I imagine most LLMs pick up the subtle meanings of upper-case text depending on context. But, as I understand it, ALL CAPS text will tokenize differently than lower-case text. Is that right? In that case, won't the upper case be harder for most models to understand and follow, since it's less common in datasets?

minimaxir 8 days ago
There's more than enough ALL CAPS text in the corpus of the entire internet, and enough semantic context associated with it, for a model to read it as the imperative voice.
- miohtama 8 days ago
  
  Shouldn't all caps be normalised to the same tokens as lowercase? Llama has no separate tokens for all caps and lowercase, or at least it did not in the past.
  
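
The tokenization question in this sub-thread can be made concrete with a small sketch. It only looks at raw UTF-8 bytes, which is what a tokenizer sees if it works on bytes and does no case folding; it says nothing about how Llama or any other specific model's vocabulary actually behaves. The helper name `byte_ids` is hypothetical.

```python
def byte_ids(text):
    """Return the UTF-8 byte values of `text` (hypothetical helper)."""
    return list(text.encode("utf-8"))


lower = byte_ids("do not add a fireplace")
upper = byte_ids("DO NOT ADD A FIREPLACE")

# Positions where the two prompts share a byte: only the spaces match,
# because every ASCII letter has a different byte value in upper case.
shared = sum(a == b for a, b in zip(lower, upper))

# With explicit case folding (an assumed normalisation step), the two
# prompts become identical before any tokenization happens.
folded_equal = byte_ids("DO NOT ADD A FIREPLACE".lower()) == lower
```

So whether caps and lowercase end up as the same tokens depends on whether the tokenizer normalises case first, which has to be checked per model rather than assumed.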

symisc_devel 8 days ago

I work on the PixLab prompt based photo editor (https://editor.pixlab.io), and it follows exactly what you type with explicit CAPS.