Comment by yodon
7 days ago
The real question is "could she have written that problem statement a year ago, or is a year's learning by the team baked into that prompt?"
There are lots of distributed agent orchestrators out there. Rarely will any of them do exactly what you want. The hard part is usually defining the problem to be solved.
There’s definitely a strong Clyde-the-counting-horse effect: language models perform vastly better on questions to which the question writer already knew the answer.
Kluger/Clever Hans? [1]
The funny thing is: the ability people thought of as clever at the time (doing maths) can be done with a $1 component these days. The thing Clever Hans actually did (reading the audience) is something we spend billions on, and in some circles it's still up for debate whether anything can do it.
see also: Moravec's Paradox [2]
[1] https://en.wikipedia.org/wiki/Clever_Hans
[2] https://en.wikipedia.org/wiki/Moravec%27s_paradox
Even the Google engineer points out that this isn't an engineering problem but an alignment problem among the Google devs themselves.