Comment by simonw
17 days ago
I upgraded my llm-gemini plugin to handle this, and shared the results of my "Generate an SVG of a pelican riding a bicycle" benchmark here: https://simonwillison.net/2025/Feb/5/gemini-2/
The pricing is interesting: Gemini 2.0 Flash-Lite is 7.5c/million input tokens and 30c/million output tokens - half the price of OpenAI's GPT-4o mini (15c/60c).
Gemini 2.0 Flash isn't much more: 10c/million for text/image input, 70c/million for audio input, 40c/million for output. Again, cheaper than GPT-4o mini.
The only benchmark worth paying attention to.
Is there a way to see/compare the shared results for all of the LLMs you've tested this prompt on in one place? The 2.0 pro result seems decent but I don't have a baseline if that's because it is or if the other 2 are just "extremely bad" or something.
Search by tag: https://simonwillison.net/tags/pelican-riding-a-bicycle/
Not a bad pelican from 2.0 Pro! The singularity is almost upon us :)
The SVGs are starting to look actually recognisable! You'll need a new benchmark soon :)