← Back to context

Comment by peetle

8 days ago

In my own experience, nano banana still has the tendency to:

- make massive, seemingly random edits to images - adjust image scale - make very fine grained but pervasive detail changes obvious in an image diff

For instance, I have found that nano-banana will sporadically add a (convincing) fireplace to a room or new garage behind a house. This happens even with explicit "ALL CAPS" instructions not to do so. This happens sporadically, even when the temperature is set to zero, and makes it impossible to build a reliable app.

Has anyone had a better experience?

The "ALL CAPS" part of your comment got me thinking. I imagine most llms understand subtle meanings of upper case text use depending on context. But, as I understand it, ALL CAPS text will tokenize differently than lower case text. Is that right? In that case, won't the upper case be harder to understand and follow for most models since it's less common in datasets?

  • There's more than enough ALL CAPS text in the corpus of the entire internet, and enough semantic context associated with it for it to be intended to be in the imperative voice.

    • Shouldn't all caps normalised to tokens like low caps? There are no separate tokens for all caps and low caps in Llama, or at least not in the past.

      1 reply →