Comment by theshrike79

6 hours ago

It's just a variant of the wine glass - something that doesn't exist in the source material as-is. I have a few of my own I don't share publicly.

Basically in my niche I _know_ there are no original pictures of specific situations and my prompts test whether the LLM is "creative" enough to combine multiple sources into one that matches my prompt.

I think of it like this: there are three things I want in the picture (more actually, but for the example assume 3). All three are really far from each other in relevance, each sitting at one corner of an equilateral triangle (in the vector space of the LLM's "brain"). What I'm asking for sits in the middle of all three.

Every model so far tends to veer towards one or two of the points more than the others because it can't figure out how to combine them all into one properly.
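
To make the triangle picture concrete, here's a toy sketch. It assumes plain 2D points stand in for the three concepts and that an output is just a weighted blend of them; the names (`blend`, `distances`) and the skewed weights are made up for illustration, and real embedding spaces are nothing this simple.

```python
import math

# Hypothetical 2D stand-ins for the three concepts, placed at the
# corners of an equilateral triangle with side length 1.
A = (0.0, 0.0)
B = (1.0, 0.0)
C = (0.5, math.sqrt(3) / 2)


def blend(weights):
    """Weighted average of the three corners (weights should sum to 1)."""
    wa, wb, wc = weights
    x = wa * A[0] + wb * B[0] + wc * C[0]
    y = wa * A[1] + wb * B[1] + wc * C[1]
    return (x, y)


def distances(point):
    """Euclidean distance from `point` to each of the three corners."""
    return [math.dist(point, corner) for corner in (A, B, C)]


# What the prompt asks for: equal weight on all three concepts.
target = blend((1 / 3, 1 / 3, 1 / 3))

# What tends to come back: one concept dominates (weights are invented).
skewed = blend((0.6, 0.3, 0.1))

print("target:", [round(d, 3) for d in distances(target)])
print("skewed:", [round(d, 3) for d in distances(skewed)])
```

The target is equally far from every corner, while the skewed blend ends up close to one corner and far from another, which is the "veering" I mean.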