Comment by throw310822

12 hours ago

> The training data

If the prompt is unique, it is not in the training data. True for basically every prompt. So how is this probability calculated?

The prompt is unique but the tokens aren't.

Type "owejdpowejdojweodmwepiodnoiwendoinw welidn owindoiwendo nwoeidnweoind oiwnedoin" into ChatGPT and the response is "The text you sent appears to be random or corrupted and doesn’t form a clear question." because the prompt doesn't correlate with the training data.

Just a scaled-up and cleverly tweaked version of linear regression analysis...

  • That is, the probability distribution that the network should learn is defined by which probability distribution the network has learned. Brilliant!

Hamiltonian paths and previous work by Donald Knuth are more than likely in the training data.

  • The specific sequence of tokens that comprises Knuth's problem together with an answer to it is not in the training data. A naive probability distribution based on counting token sequences present in the training data would assign 0 probability to it. The trained network represents an extremely non-naive approach to estimating the ground-truth distribution (the distribution that corresponds to what a human brain might have produced).

    • >the distribution that corresponds to what a human brain might have produced..

      But the human brain (or any other intelligent brain) does not work by generating a probability distribution over the next word. Even beings that do not have language can think and act intelligently.
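
The "naive" counting estimator mentioned in the thread can be sketched in a few lines. This is a toy illustration under assumed conditions (a tiny made-up corpus and a bigram model, nothing resembling what an actual LLM does): any continuation never seen verbatim in training gets probability exactly 0, which is why pure counting cannot handle a novel prompt.

```python
from collections import Counter, defaultdict

# Toy corpus and bigram counts; purely illustrative.
corpus = "the cat sat on the mat the dog sat on the rug".split()

counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def naive_prob(prev: str, nxt: str) -> float:
    """Next-token probability estimated by raw counting alone."""
    total = sum(counts[prev].values())
    return counts[prev][nxt] / total if total else 0.0

print(naive_prob("the", "cat"))   # seen bigram: 0.25
print(naive_prob("the", "moon"))  # unseen bigram: exactly 0.0
```

A trained network differs precisely in that it smooths and generalizes over token patterns instead of memorizing exact sequences, so unseen but plausible continuations receive nonzero probability.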