Comment by minimaxir

6 days ago

Every modern image-generation model can generate a pelican on a bicycle trivially. The point of the test is to generate SVG text that represents an image, which is more complicated.

Yes, there are ways to convert raster images to SVG for use in training data, but it's not a good use of anyone's time.

I don't understand this response. Human artists can and do make SVGs.

  • They typically use a visual editor like Inkscape with visual feedback. Nobody is hand-coding a complex SVG.

    • The end result is the same: an SVG file. It definitely doesn't matter to an LLM what produced it.

> Every modern image-generation model can generate a pelican on a bicycle trivially.

Mistral seems to be the exception. Their new model from a few weeks ago is worse than self-hosted Gemma.