Comment by astrange
2 months ago
No, Anthropic and OpenAI definitely actually believe what they're saying. If you believe companies only care about their shareholders, then you shouldn't believe this about them, because they don't even have that corporate structure - they're PBCs (public benefit corporations).
There doesn't seem to be a reason to believe the rest of this critique either; sure those are potential problems, but what do any of them have to do with whether a system has a transformer model in it? A recording of a human mind would have the same issues.
> It has no way to evaluate if a particular sequence of tokens is likely to be accurate, because it only selects them based on the probability of appearing in a similar sequence, based on the training data.
This in particular is obviously incorrect if you think about it: the critique is so strong that if it were true, the system wouldn't be able to produce coherent sentences, because that's actually the same problem as producing true sentences.
(It's also not true because the models are grounded via web search/coding tools.)
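The mechanism the quote describes can be sketched in a few lines. This is a toy illustration only: the vocabulary, logits, and function names are made up for this sketch, and a real model scores tens of thousands of tokens with a transformer rather than a hard-coded list.

```python
import math
import random

def softmax(logits):
    # Convert raw scores into probabilities that sum to 1.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(vocab, logits, temperature=1.0, rng=random):
    # Scale scores by temperature, then draw one token at random
    # according to its probability - the "select by probability"
    # step the quoted critique refers to.
    probs = softmax([l / temperature for l in logits])
    r = rng.random()
    acc = 0.0
    for tok, p in zip(vocab, probs):
        acc += p
        if r <= acc:
            return tok
    return vocab[-1]

# Hypothetical scores for the next word after "The capital of France is":
vocab = ["Paris", "London", "banana"]
logits = [5.0, 2.0, -3.0]
print(sample_next_token(vocab, logits, temperature=0.5, rng=random.Random(0)))
```

Nothing in this loop checks whether the chosen token makes the sentence true; truth (or falsity) can only come from the scores themselves, which is exactly what the disagreement above is about.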
> if it was true, the system wouldn't be able to produce coherent sentences. Because that's actually the same problem as producing true sentences
It is... not at all the same? Like they said, you can create perfectly coherent statements that are just wrong. Just look at Elon's ridiculously hamfisted attempts at editing Grok's system prompts.
Also, a lot of information on the web is just wrong or out of date, and coding tools only get you so far.
I should've said they're equally hard problems and they're equally emergent.
Why are you just taking it for granted it can write coherent text, which is a miracle, and not believing any other miracles?
"Paris is the capital of France" is a coherent sentence, just like "Paris dates back to Gaelic settlements in 1200 BC", or "France had a population of about 97,24 million in 2024". The coherence of sentences generated by LLMs is "emergent" from the unbelievable amount of data and training, just like the correct factoids ("Paris is the capital of France"). It shows that Artificial Neural Networks using this architecture and training process can learn to fluently use language, which was the goal? Because language is tied to the real world, being able to make true statements about the world is to some degree part of being fluent in a language, which is never just syntax, also semantics.
I get what you mean by "miracle", but your argument revolving around this doesn't seem logical to me, apart from the question: what is the "other miracle" supposed to be?
Zooming out, this seems to be part of the issue: semantics (concepts and words) neatly map the world, and have emergent properties that help to not just describe, but also sometimes predict or understand the world.
But logic seems to exist outside of language to a degree, being described by it. Just like the physical world.
Humans are able to reason logically, not always correctly, but language allows for peer review and refinement. Humans can observe the physical world. And then put all of this together using language.
But applying logic or being able to observe the physical world doesn't emerge from language. Language seems like an artifact of doing these things and a tool to do them in collaboration, but it only carries logic and knowledge because humans left these traces in "correct language".
Because it's not a miracle? I'm not being difficult here, it's just true. It's neat and fun to play with, and I use it, but in order to use anything well, you have to look critically at the results and not get blinded by the glitter.
Saying "Why can't you be amazed that a horse can do math?" [0] means you'll miss a lot of interesting phenomena.
[0] https://en.wikipedia.org/wiki/Clever_Hans
I can type a query into Google and out pops text. Miracle?