Comment by consumer451
1 day ago
"Generate an SVG of a pelican riding a bicycle" is pretty impressive though.
https://old.reddit.com/r/OpenAI/comments/1mettre/gpt5_is_alr...
It's a strange looking pelican just overlaid onto a mechanically illiterate version of a bike and the comments are like "the world isn't ready for this".
The comment is
It's pretty clearly making fun of people hyping up new LLM releases.
aka Sam "What have we done?!" Altman.
It's related to the history of Simon Willison[0] having used this as a benchmark on many models.[1]
I believe this model's output is noticeably superior... but yeah, people do tend to get hyperbolic when new stuff happens in their domain of interest.
[0] https://news.ycombinator.com/user?id=simonw
[1] https://www.google.com/search?q=simon+willison+pelican+ridin...
And nowadays a better known benchmark, so data scientists can overfit their models to it even more, even when LLMs are famous for overfitting. So, I wouldn’t trust any results regarding this specific test nowadays.
> I believe this model's output is noticeably superior
Sure, but at the same time Qwen3-30B-A3-2507 is also doing much better than most older models, even the bigger and more capable ones, so I don't know how much is due to actual progress and how much is a new version of benchmaxxing.
This comment points out the same things as you. It's sarcasm (not so obvious at first, but pretty clear in hindsight).
I thought it was sarcasm, then got confused because people seemed to take it seriously. So I decided to try the prompt on Gemini 2.5: Pro just says it can't generate an SVG, Flash generates a pretty great one. Whatever Copilot is using is also good. So I just assume even the image generated is a joke? People are starting to make me doubt my ability to identify sarcasm.
What is this supposed to be a test of? Actual image models are unbelievably cracked at correct physics...
Can you do it better?
You know how in the old days, people used to think that the T.Rex prowled the earth in a very upright fashion, with her tail on the ground and head in the air? And in modern times, we believe that this was all wrong. The T.Rex walked with the tail off the ground, essentially level with the head. Right? People point and laugh if you make a drawing of a T.Rex with the tail on the ground and the head in the air.
Well, anyone who has ever been to the ocean and seen a pelican in real life knows that its orientation on the bicycle is completely wrong. In flight, when its weight is supported by its wings, yes, that is probably how it would look. When on the ground, with its weight supported by its feet. NO.
And if you've seen a pelican on the boardwalk interacting with humans or human-made things, you'd believe that a pelican on a bike would have its neck extended vertically, with its head held high. The wings would be on the handlebars.
Speaking of handlebars, both a pelican and a bike are 3-dimensional objects. Pelican beaks are narrow. Much narrower than handlebars. Even hipster fixie handlebars are at least 5x wider than a pelican beak. In a drawing of a pelican riding a bike, the pelican overlays the bike in some spots and the bike overlays the pelican in others.
Anyway, simonw's "pelican on a bike" series is a vector showing progress, but that vector isn't pointing in the right direction.
Yeah
The dumbest among us tend to be the most in awe of mundane technology.
I’d say it’s the opposite. The dumbest don’t have the faculties to appreciate technology. It’s treated as inevitable and immediately becomes another modern fixture we take for granted in our lives, like a baby using an iPad.
https://chatgpt.com/share/688cd9bd-2dc0-8000-936a-0bbf7ba442...
Compare to what 4o does.
Pelican riding a bicycle is only official when it comes from Simon Willison https://simonwillison.net/2025/Jun/6/six-months-in-llms/
I agree. When he logs in to chatbot interfaces, the random seeds become blessed with authenticity and thus only those outputs are valid.
Yes, I tested the wrong version on accident :(
Heh, I was wondering. Haven't had a moment to set it up in my LibreChat yet. But, I thought I saw reasoning in some of the reddit comments.
The pelican doesn’t look like a pelican and it looked like two images stacked on top of each other.
If GPT 4 couldn’t do that, then GPT 5 isn’t impressive but GPT 4 is underwhelming.
What about images, not SVGs, of clocks that show times different than 10 past 10?
It would be interesting if it could use a diffusion model to generate the bitmap, then a different model to convert that bitmap to vector format. This could be an interesting way to reason about animations.
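For what it's worth, the "bitmap to vector" half of that idea can be sketched without any model at all. The snippet below is only a rough illustration, not how any real pipeline does it: the hard-coded `bitmap` stands in for whatever a diffusion model would produce, and `bitmap_to_svg` is a made-up helper that naively turns each horizontal run of filled pixels into one SVG `<rect>`. A real vectorizer would trace curves instead of emitting rectangles.

```python
def bitmap_to_svg(bitmap, fill="black"):
    """Convert a 2D list of 0/1 pixels into an SVG string.

    Naive approach (an assumption for illustration): every horizontal
    run of filled pixels becomes a single 1-pixel-high <rect>.
    """
    height = len(bitmap)
    width = len(bitmap[0]) if height else 0
    rects = []
    for y, row in enumerate(bitmap):
        x = 0
        while x < width:
            if row[x]:
                start = x
                # Extend the run while pixels stay filled.
                while x < width and row[x]:
                    x += 1
                rects.append(
                    f'<rect x="{start}" y="{y}" width="{x - start}" '
                    f'height="1" fill="{fill}"/>'
                )
            else:
                x += 1
    body = "".join(rects)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'viewBox="0 0 {width} {height}">{body}</svg>'
    )


# Hypothetical stand-in for a model-generated bitmap (1 = filled pixel).
bitmap = [
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 1],
]
svg = bitmap_to_svg(bitmap)
```

The resulting string can be written to a `.svg` file and opened in a browser. Swapping the rectangle step for a proper tracing model is where the interesting part would be.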