Comment by DonHopkins

12 days ago

>under the assumption that that is what has occurred in the arbitrarily selected training data

Under that assumption, the correct prediction is "dog", not "dogs", because LLMs are trained on the Wikipedia page https://en.wikipedia.org/wiki/The_quick_brown_fox_jumps_over... [..._the_lazy_dog] and on earlier publications like The Boston Journal's "Current Notes" article and Linda Bronson's 1888 book "Illustrative Shorthand", not on the version you picked up as a kid.

Can you cite any sources that say "dogs" that are more popular than the Wikipedia page or better known than the 19th century publications it cites?

I will ask several popular LLMs what they think, even using four underscores to imply a four-letter word:

  Please complete: The quick brown fox jumped over the lazy ____

  ChatGPT o4-mini-high said: dog
  Claude-4-Opus said: dog
  Gemini-2.5-Pro said: dog
  DeepSeek-v3.1 said: dog
  Grok-3 said: dog

How much more proof do you need? Do you think all 5 LLMs and I are hallucinating? Can you show me any LLM that actually predicts "dogs"?

If you can't cite any Wikipedia pages, books, articles, or other examples of an LLM that predicts "dogs", do you still stand by your bold claim that:

>Correct prediction is "dogs".

Your mistake was confidently but incorrectly using the word "correct". In the context of predicting the most likely word, the correct, highest probability answer by far is unequivocally "dog", not "dogs".
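
To make "highest probability" concrete, here is a minimal sketch of what next-word prediction boils down to: count which word follows a given prefix and pick the most frequent one. The `toy_corpus` list and the `next_word_distribution` helper are made up purely for illustration. They are not real training data and not how any of the LLMs above actually work internally, but the principle of picking the most likely continuation is the same.

```python
from collections import Counter

# Hypothetical toy corpus, invented for illustration only (NOT real training data).
# It deliberately includes one "dogs" variant to show that a minority variant loses.
toy_corpus = [
    "the quick brown fox jumps over the lazy dog",
    "the quick brown fox jumped over the lazy dog",
    "the quick brown fox jumps over the lazy dog",
    "the quick brown fox jumped over the lazy dogs",
]


def next_word_distribution(corpus, prefix):
    """Return the relative frequency of each word that directly follows `prefix`."""
    prefix_words = prefix.lower().split()
    n = len(prefix_words)
    counts = Counter()
    for sentence in corpus:
        words = sentence.lower().split()
        # Slide a window of len(prefix) words and record the word right after a match.
        for i in range(len(words) - n):
            if words[i:i + n] == prefix_words:
                counts[words[i + n]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in counts.items()}


distribution = next_word_distribution(toy_corpus, "over the lazy")
# The "prediction" is simply the continuation with the highest probability.
prediction = max(distribution, key=distribution.get)
print(distribution, prediction)
```

Even with a "dogs" variant present in the toy data, the prediction follows whatever the majority of the text says, which is the whole point.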

That's why the Wikipedia page, which all mainstream LLMs are trained on, has the title that it does.

The other hypocritical mistake you made was saying the following right after YOU incorrectly tried to pick what is "correct":

>You don't get to pick what is "correct"

So am I to understand that I DON'T get to pick what is correct (which I'm not presuming to do: I'm just deferring to Wikipedia, two 19th century publications, and 5 LLMs), but YOU DO get to pick what is "correct", even if it contradicts all available evidence? Because that's exactly what you're doing! Please explain how you got such an awesomely important, exclusive, earth-shattering superpower. Were you bitten by a radioactive web crawler?

If you boldly claim to be the only person who has the actual power to pick what's correct and what's not, then there are so many other much more important wrongs you should be righting than arguing about "dog" vs "dogs" on Hacker News.

For more evidence, you can even read the talk page and archive yourself. There are discussions about alternatives, but NONE of the alternatives ever use the word "dogs".

Talk: https://en.wikipedia.org/wiki/Talk:The_quick_brown_fox_jumps...

Talk Archive: https://en.wikipedia.org/wiki/Talk:The_quick_brown_fox_jumps...

The larger point is that your argument is reductionist, and you're exhibiting the same kind of mistakes that people point to when they criticize LLMs for their overconfident hallucinations.