Comment by jillesvangurp

1 year ago

We have to be a bit more honest about the things we can actually do ourselves. Most people I know would flunk most of the benchmarks we use to evaluate LLMs. Not just a little bit, but completely, utterly, and embarrassingly so. It's not even close, or fair. People are surprisingly alright at a narrow set of problems, particularly when it doesn't involve knowledge. Most people also suck at reasoning (unless they've had years of training), they suck at factual knowledge, they aren't half bad at visual and spatial reasoning, and they're fairly gullible otherwise.

Anyway, this list looks more like a "hold my beer" moment for AI researchers than any fundamental objection to AIs evolving any further. Sure, there are weaknesses, and there are paths to address them. Anyone claiming that this is the end of the road in terms of progress is in for a disappointing reality check, probably a lot sooner than is comfortable.

And of course, by narrowing it to just LLMs, the authors have a bit of an escape hatch: they conveniently exclude any further architectures, alternate strategies, or improvements that might otherwise overcome the identified weaknesses. But that's an artificial constraint with no real-world value, because of course AI researchers are already looking beyond the current state of the art. Why wouldn't they?

It's clear that what's missing is flexibility and agency. For anything that can be put into text or a short conversation, if I had to choose between access to ChatGPT or a random human, I know which I'd choose.

  • Agency is one of those things we probably want to think about quite a bit. Especially given people's willingness to hook it up to things that interact with the real world.

Not sure what you got out of the paper, but for me it mostly spurred ideas about how to fix these weaknesses in future architectures.

Don't think anyone worth their salt would look at this and think: oh well, that's that then.