
Comment by pllbnk

7 days ago

> It was already pretty clear that this was not the case as even GPT3 could do summarization well enough, and there is no probabilistic link between the words of a text and the gist of the content, <...>

I am not an expert by any means, but I know enough about the technicalities of LLMs to disagree with your statement. The models are trained on an ungodly amount of text, so they become very advanced statistical token-prediction machines, with magic randomness sprinkled in to make the outputs more interesting. After that, they are fine-tuned on very believable dialogues, so their weights are skewed such that when subject A (the user) says something, subject B (the LLM-turned-chatbot) has to say something back that statistically makes sense (which it almost always does, since that is what they were trained on in the first place).

Try pasting random text: you will get a random reply. Now paste the same random text and ask the chatbot to summarize it: the space of plausible continuations shrinks, and you get a summary, because the fine-tuning gave the LLM the "knowledge" of what a summary _looks like_ (not what it _means_).
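A rough way to picture the "statistical prediction plus randomness" part is temperature sampling over next-token probabilities. This is just a toy sketch: the vocabulary and logits below are invented for illustration, not taken from any real model.

```python
import numpy as np

# Toy illustration of next-token sampling; the vocabulary and the raw
# scores (logits) are made up, not from an actual LLM.
vocab = ["summary", "random", "the", "cat", "."]
logits = np.array([2.0, 0.5, 1.2, 0.1, 0.8])

def sample_next(logits, temperature=1.0, rng=np.random.default_rng()):
    # Softmax with temperature: lower temperature -> more deterministic,
    # higher temperature -> more of the "magic randomness".
    scaled = logits / temperature
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)

for t in (0.2, 1.0, 2.0):
    picks = [vocab[sample_next(logits, t)] for _ in range(5)]
    print(f"temperature {t}: {picks}")
```

Instruction fine-tuning then shifts which continuations get high probability in the first place: after a prompt like "summarize this", summary-shaped tokens become the likely ones, which is the "reduced randomness space" described above.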

Just to prove you wrong: ask your favorite LLM whether your statement is correct, and it will probably tell you it is not.