Comment by simonw

6 months ago

The way I run the pelican on a bicycle benchmark is to use this exact prompt:

  Generate an SVG of a pelican riding a bicycle

And execute it via the model's API with all default settings, not via their user-facing interface.

Currently none of the model APIs enable tools unless you ask them to, so this method excludes the use of additional tools.