Comment by jdnier

9 hours ago

> I think Claude Shannon’s spirit is probably proud to know that his name is now being associated with such advances. Hats off to Claude!

I didn't realize Claude was named after Claude Shannon!

https://en.wikipedia.org/wiki/Claude_Shannon

Trivia: Claude Shannon proposed the idea of predicting the next token (letter) using statistics/probabilities in the training data corpus in 1950: "Prediction and Entropy of Printed English" https://languagelog.ldc.upenn.edu/myl/Shannon1950.pdf

  • It goes back a bit further than that. His 1948 “Mathematical theory of communication” [1] already has (what we would now call) a Markov chain language model, page 7 onwards. AFAIK, this was based on his classified WWII work so it was probably a few years older than that

    [1] https://people.math.harvard.edu/~ctm/home/text/others/shanno...

    • I was just reading Norbert Wiener's "The Human Use of Human Beings" (1950) and this quote gave me a good chuckle:

      "One may get a remarkable semblance of a language like English by taking a sequence of words, or pairs of words, or triads of words, according to the statistical frequency with which they occur in the language, and the gibberish thus obtained will have a remarkably persuasive similarity to good English."

  • A letter is not a token, is it? Redundancy could hit 75% in long sentences, but Shannon was not predicting tokens or words, he was predicting letters (characters).

And Claude had a collection of cycles, unicycles. Unfortunately the article is about something else altogether.