Comment by vunderba

4 months ago

I haven’t gotten around to adding Klein to my GenAI Showdown site yet, but if it’s anything like Z-Image Turbo, it should perform extremely well.

For reference, Z-Image Turbo scored 4 out of 15 points on GenAI Showdown. I’m aware that doesn’t sound like much, but given that one of the largest models, Flux.2 (32b), only managed to outscore ZiT (a 6b model) by a single point and is significantly heavier-weight, that’s still damn impressive.

Local model comparisons only:

https://genai-showdown.specr.net/?models=fd,hd,kd,qi,f2d,zt

6 comments


BoredPositron 4 months ago

I think it shows problems with your tests tbh. The bigger models are way more capable than you make them out to be. They are also better at being trained on and understanding CGI render outputs used as references, like normal maps or ID masks. Your testing suite is the perfect example of how structured data implies false confidence. Pure t2i is not a good benchmark anymore.

vunderba 4 months ago

Thanks for the feedback.
> The bigger models are way more capable than you make them out to be.
No test suite is ever going to be perfect. GenAI Showdown was started with the goal of focusing on a very narrow spectrum of testing (prompt adherence) because, as a creator, that's the aspect of most interest to me.
> Pure t2i is not a good benchmark anymore
Just FYI Image Editing is already a separate benchmark (see the navbar at the top).
> Your testing suite is the perfect example that structured data implies false confidence
Again - the headline is "Specific prompts and challenges with a strong emphasis placed on adherence". If I tried to capture every possible aspect of GenAI models (multimodal, texture maps, periodic motion, tiling, etc) - I'd be at it until the heat death of the universe.
Incidentally - which model (specifically) do you think is ranked unfairly? While Flux.2 [dev] did only score a single point above ZiT, its weighted score is much higher (1442 points vs 911 points).

Bombthecat 4 months ago

Can you fix the information bubble on mobile please? When pressing one, it vanishes instantly...

vunderba 4 months ago
Hey Bombthecat, sorry about that! I can't repro this issue on any of the devices I have (Android Pixel 7, an iPad, etc).
If you get a chance, could you list your mobile device specs? That way I can at least try it on Browserstack and see if I can figure out a fix.
- Bombthecat 4 months ago
  
  Samsung, brave browser
  Update: Huh, now it's working
- kennyadam 4 months ago
  
  Yeah works fine for me on a Pixel 9.