Comment by CamperBob2

10 months ago

> it cannot ever get to an AGI level that you'd assume to be competitive to a human, even most animals.

Suppose you turn out to be wrong. What would convince you?


ChuckMcM 10 months ago

It could diagram a sentence it had never seen.

karolist 10 months ago

It does that all the time even now. Play with temp, top_k, and min_p, and nothing prevents you from getting combinations and orderings of tokens that have never been encountered before.
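For anyone who hasn't touched those knobs, here is a rough sketch of what they do to the next-token distribution. It is a toy, not any particular engine's implementation: the function name `sample_token`, the dict-of-logits input, and the order in which the filters are applied are all assumptions made for illustration, and real inference engines differ on the ordering.

```python
import math
import random

def sample_token(logits, temperature=1.0, top_k=None, min_p=None, rng=random):
    """Sample one token from a {token: logit} dict (toy sketch, temperature > 0)."""
    # Temperature: divide logits before the softmax. Values above 1 flatten
    # the distribution, values below 1 sharpen it.
    scaled = {tok: logit / temperature for tok, logit in logits.items()}

    # Numerically stable softmax.
    biggest = max(scaled.values())
    exps = {tok: math.exp(val - biggest) for tok, val in scaled.items()}
    total = sum(exps.values())
    probs = {tok: e / total for tok, e in exps.items()}

    # top_k: keep only the k most probable tokens.
    if top_k is not None:
        kept = sorted(probs, key=probs.get, reverse=True)[:top_k]
        probs = {tok: probs[tok] for tok in kept}

    # min_p: drop tokens whose probability is below min_p times the
    # probability of the most likely remaining token.
    if min_p is not None:
        threshold = min_p * max(probs.values())
        probs = {tok: p for tok, p in probs.items() if p >= threshold}

    # random.choices accepts unnormalized weights.
    tokens = list(probs)
    weights = [probs[tok] for tok in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]
```

With `top_k=1` this always returns the single most likely token; raising `temperature` or loosening `min_p` lets less likely tokens through, which is where the never-before-seen sequences come from.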
CamperBob2 10 months ago

A language model can generate valid sentences that have never existed before -- not just syntactically or statistically valid sentences in a Markovian sense, but semantically valid ones. How could it possibly do that without an implicit awareness of all aspects of language, including grammar and parts of speech?
That implicit awareness is indistinguishable from knowledge, at least to the extent it yields correct results.
And the fact that it was inferred from examples rather than taught explicitly is indistinguishable from intelligence.

kragen 10 months ago

I did eventually get an LLM to produce what seems to be a correct diagram of a sentence it had never seen, but it took about ten tries. Grammatical analysis seems to have happened correctly every time, but getting to a usable diagram was difficult.

(I know that it's generally rude to include LLM output in HN comments, but in this case I think it's essential supporting material to elevate the discussion of LLM capabilities above "yes it is", "no it isn't".)

First, I just asked ChatGPT (GPT-4o mini, what they offer anonymous free users):

> Please diagram the sentence, "If my mother's road had been cultural, my uncle's sloth squad residue would have compiled a periodic meadow of night."

I just invented that sentence, picking several words with a random number generator, but of course I can't prove to you that I didn't copy it from some book you haven't seen. (For that, you'd have to invent a sentence yourself, or at least provide me with some words to include in it.) I hope you can at least agree that it's a highly unlikely sentence for anyone ever to have uttered before!

After talking to itself for a few paragraphs, it arrived at the following grammatical analysis:

  Step 4: Diagramming the sentence

    Main Clause:
        Subject: "my uncle's sloth squad residue"
        Verb: "would have compiled"
        Direct Object: "a periodic meadow of night"
            "a periodic meadow" (noun phrase)
            "of night" (prepositional phrase modifying "meadow")

    Conditional Clause:
        Subject: "my mother's road"
        Verb: "had been" (linking verb)
        Predicate Adjective: "cultural"

This is exactly correct. From that it produced the following ASCII art "diagram", which looks like nonsense to me, but maybe it's in some kind of sentence-diagramming formalism I haven't seen:

                If
        ______/______
       |              |
    my mother's    had been
      road             |
                        cultural
       
   my uncle's sloth squad residue
                 |
    would have compiled
                 |
  a periodic meadow of night
              |
            of
            night

I told it the diagram looked like nonsense, and it produced this one, which looks worse:

                 If
           _______/______
          |               |
  my mother's road    had been
           |             |
       cultural      (Main Clause)
                       |
       my uncle's sloth squad residue
                           |
                 would have compiled
                           |
               a periodic meadow
                         |
                       of night

I asked GPT-4 (paid) the same question, and it gave me another exactly correct grammatical analysis in words:

  1. **Identify the main clauses**: 
     - "If my mother's road had been cultural" is a dependent clause (specifically, a conditional clause).
     - "my uncle's sloth squad residue would have compiled a periodic meadow of night" is the main clause.

  2. **Start with the main clause**:
     - Subject: "my uncle's sloth squad residue"
       - "my uncle's" modifies "sloth squad," which in turn modifies "residue."
     - Predicate: "would have compiled"
       - This includes the modal auxiliary verb ("would"), perfect auxiliary ("have"), and main verb ("compiled").
     - Direct object: "a periodic meadow of night"
       - “a” is an article modifying “meadow,”
       - “periodic” modifies “meadow,”
       - “of night” is a prepositional phrase modifying “meadow.”

  3. **Diagram the dependent clause**:
     - Subject: "road"
       - Modified by possessive noun phrase “my mother’s”
     - Predicate: "had been"
     - Complement/Adjective: "cultural"

(I feel like this is definitely "a level that you'd assume to be competitive to a human", in mschuster91's phrase; most native English speakers can't do that.)

But then it came up with the following incorrect diagram:

                            +---------------------------------------------------+
                            |                      Main Clause                   |
                            +---------------------------------------------------+
                                    /                |               \
                                   /                 |                \
                                  /                  |                 \
                         [My uncle's]         [sloth squad]          [residue]
                             \                    /                     /
                              \                  /                     /
                               \                /                     /
                                would have compiled -------------------
                                                  \
                                                   \
                                                    \
                                                     [a periodic meadow of night]
                                                      /           |            \
                                                     /            |             \
                                                    a        periodic       [of night]

  +---------------------+
  |  Dependent Clause   |
  +---------------------+
        If
         |
        road
      ___|_____
    |          |
   my mother's had been cultural

Telling it the diagram was nonsense didn't help.

But my conclusion from this is not that GPT-4 can't analyze the grammatical structure of sentences it has never seen. It's that GPT-4 gets "confused" when it tries to draw ASCII art, like a person trying to tell you how to draw it over the telephone, but not quite as bad. I've seen similarly abominable results when it tried to draw circuit schematics and guitar tablatures as ASCII art. I tried asking it for a graphviz diagram, and it gave me syntactically correct graphviz code that didn't express the sentence structure properly.

I tried simplifying the problem with the prompt, 'Please provide a syntactic tree showing the syntactic derivation of the sentence, "My uncle's sloth squad residue would have compiled a periodic meadow of night," using graphviz syntax and with individual words at the leaf nodes of the tree.' which yielded a slightly incorrect diagram. (I'm still sure nobody has ever uttered this sentence before.) On being told, "This diagram is not quite right.", GPT-4 produced the following diagram, which I believe is correct except for an extra "s" on "uncle" (you can see this in your browser at https://is.gd/Axufjx):

  digraph G {
      node [shape=plaintext];

      S [label="S"];
      NP [label="NP"];
      VP [label="VP"];

      S -> NP;
      S -> VP;

      // Noun Phrase breakdown
      DP_my [label="DP"];
          DP_my -> My;

      UnclePossessive [label="NP"];
          UnclePossessive -> Uncles;
          UnclePossessive -> ApostropheS;

          NP -> DP_my;
          NP -> UncleCompound;

      UncleCompound [label="NP"];
          SlothSquadResidue[label = "N'"];
              SlothSquadResidue-> Sloth;
              SlothSquadResidue-> SquadResidue;

              SquadResidue[label = "N'"]
                  SquadResidue-> Squad;
                  SquadResidue-> Residue;

              UncleCompound-> UnclePossessive;
              UncleCompound-> SlothSquadResidue;


      // Verb Phrase breakdown
       ModalWould[label = "Modal"];
       ModalWould-> Would;

       AuxHave[label = "Aux"];
       AuxHave-> Have;

       VCompiled[label = "V"];
       VCompiled-> Compiled;

       ObjectNP[label = "NP"];  // Direct object noun phrase
           ArticleA[label = "D"];
           ArticleA-> A;

           MeadowPhrase[label = "N'"];  // 'Meadow' modified by 'periodic'
               AdjPeriodic[label = "Adj"];
               AdjPeriodic-> Periodic; 

               MeadowWord[label = N];
               MeadowWord -> Meadow; 

               MeadowPhrase -> AdjPeriodic; 
               MeadowPhrase -> MeadowWord; 

           PPofNight[label = PP];  // Prepositional phrase 'of night'
               PrepOf[label=P]
                   PrepOf-> Of; 

                   NightNounPhrase[label=N]
                       NightNounPhrase-> Night; 

                   PPofNight -> PrepOf; 
                   PPofNight -> NightNounPhrase; 

           ObjectNP -> ArticleA; 
           ObjectNP -> MeadowPhrase; 
           ObjectNP -> PPofNight;


       VP -> ModalWould;
       VP -> AuxHave;
       VP -> VCompiled
       VP -> ObjectNP

  }
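For comparison, graphviz text of this shape doesn't have to be written out by hand. Below is a minimal sketch that emits it from a tree stored as nested tuples. The tuple representation, the function name `to_dot`, and the generated node names (`n0`, `n1`, ...) are my own assumptions for illustration, not anything the models produced.

```python
from itertools import count

def to_dot(tree):
    """Render a nested-tuple syntax tree as graphviz DOT text.

    A node is (label, child, child, ...); a bare string is a word at a leaf.
    """
    ids = count()
    lines = ["digraph G {", "    node [shape=plaintext];"]

    def visit(node):
        # Give every node a unique name so repeated labels (NP, N, ...)
        # do not collapse into a single graph node.
        name = f"n{next(ids)}"
        label = node if isinstance(node, str) else node[0]
        escaped = label.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'    {name} [label="{escaped}"];')
        if not isinstance(node, str):
            for child in node[1:]:
                child_name = visit(child)
                lines.append(f"    {name} -> {child_name};")
        return name

    visit(tree)
    lines.append("}")
    return "\n".join(lines)

# A small fragment of the sentence, kept flat for brevity.
example = ("NP", ("Det", "a"), ("Adj", "periodic"), ("N", "meadow"))
print(to_dot(example))
```

Unique node names are the main design point: the labels repeat, so using them as node identifiers would merge unrelated parts of the tree.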

Claude produced a reasonably serviceable diagram on the first try with that prompt:

  digraph SyntacticTree {
    node [shape=plaintext];

    S -> NP1;
    S -> VP;

    NP1 -> DET1 [label="Det"];
    NP1 -> NP2 [label="NP"];

    NP2 -> NP3 [label="NP"];
    NP2 -> N1 [label="N"];

    NP3 -> N2 [label="N"];
    NP3 -> POS [label="Pos"];

    VP -> AUX1 [label="Aux"];
    VP -> AUX2 [label="Aux"];
    VP -> V [label="V"];
    VP -> NP4 [label="NP"];

    NP4 -> DET2 [label="Det"];
    NP4 -> ADJ [label="Adj"];
    NP4 -> N3 [label="N"];
    NP4 -> PP [label="PP"];

    PP -> P [label="P"];
    PP -> NP5 [label="NP"];

    NP5 -> N4 [label="N"];

    DET1 -> "My";
    N2 -> "uncle";
    POS -> "'s";
    N1 -> "sloth";
    N1 -> "squad";
    N1 -> "residue";
    AUX1 -> "would";
    AUX2 -> "have";
    V -> "compiled";
    DET2 -> "a";
    ADJ -> "periodic";
    N3 -> "meadow";
    P -> "of";
    N4 -> "night";
  }

On being told, I think incorrectly, "This diagram is not quite right.", it produced a worse diagram.
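One cheap mechanical check on a tree like this is to read its leaves off left to right and compare them with the sentence. Here is a minimal sketch, with the tree transcribed by hand from the Claude output above into nested tuples; the representation and the names `leaves` and `tree_yield` are assumptions for illustration. It only confirms that the words and their order are right, not that the constituents are grouped correctly.

```python
# Hand transcription of the Claude tree: (label, child, child, ...),
# with bare strings as the words at the leaves.
TREE = (
    "S",
    ("NP",
        ("Det", "My"),
        ("NP",
            ("NP", ("N", "uncle"), ("Pos", "'s")),
            ("N", "sloth", "squad", "residue"))),
    ("VP",
        ("Aux", "would"),
        ("Aux", "have"),
        ("V", "compiled"),
        ("NP",
            ("Det", "a"),
            ("Adj", "periodic"),
            ("N", "meadow"),
            ("PP", ("P", "of"), ("NP", ("N", "night"))))),
)

SENTENCE = "My uncle's sloth squad residue would have compiled a periodic meadow of night"

def leaves(node):
    """Return the words at the leaves of the tree, left to right."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node[1:]:  # node[0] is the category label
        words.extend(leaves(child))
    return words

def tree_yield(tree):
    """Join the leaves into a sentence, reattaching the possessive clitic."""
    return " ".join(leaves(tree)).replace(" 's", "'s")

print(tree_yield(TREE) == SENTENCE)
```

The same check applied to a transcription of the GPT-4 tree would flag the extra "s" on "uncle".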

So LLMs didn't perform nearly as well on this task as I thought they would, but they also performed much better than you thought they would.

emkee 10 months ago

Having only taken one syntax class for fun in college, I find this pretty impressive. Generating syntax trees was never a trivial task for me (but I was just a CS major who needed a credit). Slightly related, but I have also never had ChatGPT successfully generate ASCII art, even with extensive conversation.

CamperBob2 10 months ago

> (I know that it's generally rude to include LLM output in HN comments, but in this case I think it's essential supporting material to elevate the discussion of LLM capabilities above "yes it is", "no it isn't".)
You just have to be prepared to take a karma hit for it. The audience here does not consist largely of 'hackers', but seems to skew toward the sort of fearful, resentful reactionaries that hacker culture traditionally opposes.
I will say I wouldn't peg ChuckMcM as being one of the reactionaries, though. That would be an unpleasant surprise.
As far as the diagram goes, my guess is that sentence diagrams were underrepresented in the training corpus. Diagramming sentences was already out of fashion when I was in school in the 1980s -- in fact, I don't recall ever having done it. The model is struggling much the same way you'd expect a grade-school student (or me, I guess) to struggle upon being asked to perform the task for the first time.
Knowing when to say "I don't know how to do that" is still a foundational weakness of LLMs, but I don't expect it to remain unaddressed for long. We will see improvement in that area, sooner or later. The anklebiters will respond by moving their goalposts and hitting the downvote button as usual.
