Comment by andy12_
3 hours ago
I disagree. Even frontier models still score far worse than the human baseline on VendingBench. As long as models can't optimally manage something as simple as a vending machine, they have no hope of managing a McDonald's.