Comment by afro88
5 hours ago
I have a real-world problem I gave o1 when it came out, and it got it quite wrong. It's a scheduling problem with 4 different constraints that vary each day, plus success criteria that need to be satisfied over the whole week.
GPT-5 Thinking (Think Longer) and Opus 4.1 Extended Thinking both get it right.
Maybe this unique problem somehow ended up in synthetic training data? Or maybe it didn't, and the paper is wrong? Either way, we have models today that are much more capable at solving unique problems.
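For what it's worth, a problem of this shape can be posed as a small constraint-satisfaction search. The sketch below is purely illustrative and not the commenter's actual problem: the workers, the per-day availability table, and the weekly rule are all hypothetical stand-ins, and the brute-force search is only meant to show the structure (per-day constraints plus a week-level success criterion).

```python
# Hypothetical sketch of a week-long scheduling problem: one worker per day,
# a constraint that varies by day (availability), and a success criterion
# checked over the whole week. All names and rules here are made up.
from itertools import product

WORKERS = ["alice", "bob", "carol"]
DAYS = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]

# Per-day constraints that differ each day (hypothetical availability).
AVAILABLE = {
    "mon": {"alice", "bob"},
    "tue": {"bob", "carol"},
    "wed": {"alice", "carol"},
    "thu": {"alice", "bob", "carol"},
    "fri": {"bob"},
    "sat": {"alice", "carol"},
    "sun": {"alice", "bob", "carol"},
}

def daily_ok(day, worker):
    """Constraint that must hold for a single day's assignment."""
    return worker in AVAILABLE[day]

def weekly_ok(schedule):
    """Success criterion evaluated over the whole week, not per day:
    everyone works at least once and nobody works more than 3 days."""
    counts = {w: list(schedule.values()).count(w) for w in WORKERS}
    return all(1 <= counts[w] <= 3 for w in WORKERS)

def solve():
    # Brute force over all assignments; fine for a toy week-sized search.
    for combo in product(WORKERS, repeat=len(DAYS)):
        schedule = dict(zip(DAYS, combo))
        if all(daily_ok(d, w) for d, w in schedule.items()) and weekly_ok(schedule):
            return schedule
    return None

if __name__ == "__main__":
    print(solve())
```

The point of the sketch is only that the daily constraints and the week-level criterion interact, which is exactly the kind of global bookkeeping that earlier reasoning models tended to get wrong.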