Comment by megaloblasto
2 days ago
It's a strange looking pelican just overlaid onto a mechanically illiterate version of a bike and the comments are like "the world isn't ready for this".
The comment is
It's pretty clearly making fun of people hyping up new LLM releases.
aka Sam "What have we done?!" Altman.
It's morbidly impressive how much Sam Altman is a sociopath's sociopath, knowing the right things to say to ensnare his fellow sociopaths into his trap.
It's related to the history of Simon Willison[0] having used this as a benchmark on many models.[1]
I believe this model's output is noticeably superior... but yeah, people do tend to get hyperbolic when new stuff happens in their domain of interest.
[0] https://news.ycombinator.com/user?id=simonw
[1] https://www.google.com/search?q=simon+willison+pelican+ridin...
And nowadays a better known benchmark, so data scientists can overfit their models to it even more, even when LLMs are famous for overfitting. So, I wouldn’t trust any results regarding this specific test nowadays.
> I believe this model's output is noticeably superior
Sure, but at the same time Qwen3-30B-A3-2507 is also doing much better than most older models, even bigger and more capable ones, so I don't know how much is due to actual progress and how much is a new version of benchmaxxing.
This comment points out the same things you do. It's sarcasm (not so obvious at first, but pretty clear in hindsight).
I thought it was sarcasm, then got confused because people seemed to take it seriously. So I decided to try the prompt on Gemini 2.5: Pro just says it can't generate an SVG, Flash generates a pretty great one. Whatever Copilot is using is also good. So I just assume even the image generated is a joke? People are starting to make me doubt my ability to identify sarcasm.
I believe the user who posted the image also included the api call snippet in another comment, so I took it as genuine. However, I feel your pain.
What is this supposed to be a test of? Actual image models are unbelievably cracked at correct physics...
Can you do it better?
You know how in the old days, people used to think that the T.Rex prowled the earth in a very upright fashion, with her tail on the ground and head in the air? And in modern times, we believe that this was all wrong. The T.Rex walked with the tail off the ground, essentially level with the head. Right? People point and laugh if you make a drawing of a T.Rex with the tail on the ground and the head in the air.
Well, anyone who has ever been to the ocean and seen a pelican in real life knows that its orientation on the bicycle is completely wrong. In flight, when its weight is supported by its wings, yes, that is probably how it would look. When on the ground, with its weight supported by its feet. NO.
And if you've seen a pelican on the boardwalk interacting with humans or human-made things, you'd believe that a pelican on a bike would have its neck extended vertically, with its head held high. The wings would be on the handlebars.
Speaking of handlebars, both a pelican and a bike are 3-dimensional objects. Pelican beaks are narrow. Much narrower than handlebars. Even hipster fixie handlebars are at least 5x wider than a pelican beak. In a drawing of a pelican riding a bike, the pelican overlays the bike in some spots and the bike overlays the pelican in others.
Anyway, simonw's "pelican on a bike" series is a vector showing progress, but that vector isn't pointing in the right direction.
This comment made me crave a human "Pelican on a bike" competition.
Yeah
The dumbest among us tend to be the most in awe of mundane technology.
I’d say it’s the opposite. The dumbest don’t have the faculties to appreciate technology. It’s treated as inevitable and immediately becomes another modern fixture we take for granted in our lives, like a baby using an iPad.
https://chatgpt.com/share/688cd9bd-2dc0-8000-936a-0bbf7ba442...
Compare to what 4o does.