Comment by vikramkr
6 hours ago
The difference is that the worker you hire would be a human being, not a large matrix multiplication whose parameters were optimized by gradient descent and which embeds concepts in a high-dimensional vector space, which results in all sorts of weird things like subliminal learning (https://alignment.anthropic.com/2025/subliminal-learning/).
It's not a human intelligence - it's a totally different thing, so why would the same test that you use to evaluate human abilities apply here?
Also, more directly: the "all sorts of other things" we want LLMs to be good at often involve writing code, spatial reasoning, and world understanding, all of which creating an SVG of a pelican riding a bicycle evaluates very directly. So it's not even that surprising.