Comment by vunderba

9 hours ago

That's because, for the most part, I'm not:

"A comparison of various SOTA generative image models on specific prompts and challenges with a strong emphasis placed on adherence."

Adherence is the more interesting problem, in my opinion, because quality issues can be ameliorated through the use of upscalers, refiner models, LoRAs, and similar tools. Furthermore, there are already a thousand existing benchmarks obsessed with visual fidelity.

I mean there’s a huge difference between a model that throws a black spot on someone’s head and another one that fills it with hair indistinguishable from the real thing. Which is why I’m saying this methodology is only marginally useful.