← Back to context

Comment by actsasbuffoon

4 days ago

This misses the point. LLMs will do things like move a knight by a single square as if it were a pawn. Chess is an extremely well understood game, and the rules about how things move is almost certainly well-represented in the training data.

These models cannot even make legal chess moves. That’s incredibly basic logic, and it shows how LLMs are still completely incapable of reasoning or understanding. Many kinds of task are never going to be possible for LLMs unless that changes. Programming is one of those tasks.

>These models cannot even make legal chess moves. That’s incredibly basic logic, and it shows how LLMs are still completely incapable of reasoning or understanding.

Yeah they can. There's a link I shared to prove it which you've conveniently ignored.

LLMs learn by predicting, failing and getting a little better, rinse and repeat. Pre-training is not like reading a book. LLMs trained on chess games play chess just fine. They don't make the silly mistakes you're talking about and they very rarely make illegal moves.

There's gpt-3.5-turbo-instruct which i already shared and plays at around 1800 ELO. Then there's this grandmaster level chess transformer - https://arxiv.org/abs/2402.04494. They're also a couple of models that were trained in the Eleuther AI discord that reached about 1100-1300 Elo.

I don't know what the peak of LLM Chess playing looks like but this is clearly less of a 'LLMs can't do this' problem and more 'Open AI/Anthropic/Google etc don't care if their models can play Chess or not' problem.

So are they capable of reasoning now or would you like to shift the posts ?

  • I think the point here is that if you have to pretrain it for every specific task, it's not artificial general intelligence, by definition.

    • There isn't any general intelligence that isn't receiving pre-traning. People spend 14 to 18+ years in school to have any sort of career.

      You don't have to pretrain it for every little thing but it should come as no surprise that a complex non-trivial game would require it.

      Even if you explained all the rules of chess clearly to someone brand new to it, it will be a while and lots of practice before they internalize it.

      And like I said, LLM pre-training is less like a machine reading text and more like Evolution. If you gave a corpus of chess rules, you're only training a model that knows how to converse about chess rules.

      Do humans require less 'pre-training' ? Sure, but then again, that's on the back of millions of years of evolution. Modern NNs initialize random weights and have relatively very little inductive bias.

      2 replies →

Saying programming is a task that is "never going to be possible" for an LLM is a big claim, given how many people have derived huge value from having LLMs write code for them over the past two years.

(Unless you're arguing against the idea that LLMs are making programmers obsolete, in which case I fully agree with you.)

  • I think "useful as an assistant for coding" and "being able to program" are two different things.

    When I was trying to understand what is happening with hallucination GPT gave me this: > It's called hallucinating when LLMs get things wrong because the model generates content that sounds plausible but is factually incorrect or made-up—similar to how a person might "see" or "experience" things that aren't real during a hallucination.

    From that we can see that they fundamentally don't know what is correct. While they can get better at predicting correct answers, no-one has explained how they are expected to cross the boundary from "sounding plausible" to "knowing they are factually correct". All the attempts so far seem to be about reducing the likelihood of hallucination, not fixing the problem that they fundamentally don't understand what they are saying.

    Until/unless they are able to understand the output enough to verify the truth then there's a knowledge gap that seems dangerous given how much code we are allowing "AI" to write.

    • Code is one of the few applications of LLMs where they DO have a mechanism for verifying if what they produced is correct: they can write code, run that code, look at the output and iterate in a loop until it does what it's supposed to do.