Comment by samatman
1 year ago
"Does it generalize past the training data" has been a pre-registered goalpost since before the attention transformer architecture came on the scene.
If there is a difference, and LLMs can do one but not the other...
Then what the fuck are they doing?
Learning is thinking, reasoning, what have you.
Move the goalposts, redefine the words; it won't matter.