Comment by vunderba
7 days ago
Thanks, that makes sense. I'll have to give the "red bounding box overlay" a shot when there are a lot of similar objects in the existing image.
I also have a custom pipeline/software that takes in a given prompt, rewrites it using an LLM into multiple variations, sends it to multiple GenAI models, and then uses a VLM to evaluate them for accuracy. It runs in an automated REPL style, so I can be relatively hands-off, though I do have a "max loop limiter" since I'd rather not spend the equivalent of a small country's GDP.
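For anyone curious what that kind of loop looks like, here is a minimal sketch under some loud assumptions: `run_loop`, `Candidate`, and the `rewrite`, `generators`, and `critique` callables are all hypothetical stand-ins, not the actual pipeline or any real model API. You would plug in your own LLM rewriter, image model clients, and VLM scorer.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Candidate:
    prompt: str   # the rewritten prompt that produced the image
    model: str    # name of the generator that produced it
    image: str    # placeholder for image data or a file path
    score: float  # critic score; higher is assumed to be better


def run_loop(
    prompt: str,
    rewrite: Callable[[str, int], List[str]],        # hypothetical LLM rewriter
    generators: Dict[str, Callable[[str], str]],     # hypothetical image models
    critique: Callable[[str, str], float],           # hypothetical VLM scorer
    threshold: float = 0.9,
    max_loops: int = 3,                              # the "max loop limiter"
    n_variations: int = 2,
) -> Optional[Candidate]:
    """Rewrite, generate, and score until good enough or the loop cap is hit."""
    best: Optional[Candidate] = None
    current = prompt
    for _ in range(max_loops):
        for variant in rewrite(current, n_variations):
            for name, generate in generators.items():
                image = generate(variant)
                # Always score against the ORIGINAL prompt, not the rewrite.
                score = critique(prompt, image)
                if best is None or score > best.score:
                    best = Candidate(variant, name, image, score)
        if best is not None and best.score >= threshold:
            break  # good enough, stop spending money
        if best is not None:
            current = best.prompt  # refine from the best variant so far
    return best
```

The cap bounds the spend at `max_loops * n_variations * len(generators)` generation calls, which is the whole point of the limiter.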
Automated generator-critique loops for evaluation may be really useful for creating your own style libraries, because it's easy for an LLM agent to evaluate how close an image is to a reference style or scene. So you end up with a series of base prompts, and can then replicate that style across a whole franchise of stories. Most people still do it with reference images, and that doesn't really produce very stable results. If you do need some help with bounding boxes for nano-banana, feel free to send me a message!