Comment by m3kw9

10 hours ago

Prob a very unscientific way to test an image model. This would likely be because they have the reasoning turned down and let its instant output take over.

There's no good scientific way to test a closed-source model with both nondeterministic and subjective output.

This example image was generated using the API on high, not the low reasoning version. (it is slow and takes 2 minutes lol)

If the results are quantifiable/objective and repeatable, it's scientific. How is it not scientific?

The reasoning amount is part of the evaluation isn't it?