Comment by naasking

1 year ago

> They have been trained on more information than a human being can hope to even see in a lifetime. Assuming a human can read 300 words a min and 8 hours of reading time a day, they would read 30,000 to 50,000 books in their lifetime. Most people would manage perhaps a meagre subset of that, at best 1% of it. That’s at best 1 GB of data.

This just isn't true. Human training is multimodal to a degree far beyond that of even the most capable multimodal model, so human babies arguably see more data by a young age than all models collectively have seen.

Not to mention that human babies don't even start as a blank slate the way LLMs do; billions of years of evolution have formed the base model described by our DNA.
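
To make that concrete, here is a rough back-of-envelope sketch of the comparison. Every figure in it is an assumption chosen for illustration (the reading rate from the quoted estimate, a commonly cited rough optic-nerve throughput on the order of 10 Mbit/s per eye, assumed waking hours and ages), not a measurement:

```python
# Back-of-envelope: lifetime reading vs. early-childhood visual input.
# Every constant below is an illustrative assumption, not a measurement.

WORDS_PER_MIN = 300        # reading speed from the quoted estimate
READING_HOURS_PER_DAY = 8  # reading time from the quoted estimate
YEARS_READING = 70         # assumed reading lifetime
BYTES_PER_WORD = 6         # roughly 5 characters plus a space

lifetime_words = WORDS_PER_MIN * 60 * READING_HOURS_PER_DAY * 365 * YEARS_READING
reading_bytes = lifetime_words * BYTES_PER_WORD

# Visual input: assume optic-nerve throughput on the order of 10 Mbit/s per eye
# (a commonly cited rough estimate), two eyes, 12 waking hours a day, to age 4.
EYE_BITS_PER_SEC = 10e6
WAKING_HOURS_PER_DAY = 12
YEARS_OF_CHILDHOOD = 4

visual_bytes = (2 * EYE_BITS_PER_SEC / 8) * 3600 * WAKING_HOURS_PER_DAY * 365 * YEARS_OF_CHILDHOOD

print(f"Lifetime reading: ~{lifetime_words / 1e9:.1f} billion words, ~{reading_bytes / 1e9:.0f} GB")
print(f"Visual input by age {YEARS_OF_CHILDHOOD}: ~{visual_bytes / 1e12:.0f} TB")
```

With these assumed figures, a few years of visual input alone lands in the hundreds of terabytes, several orders of magnitude beyond a lifetime of reading.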

I agree with you, but your comment strikes me as unfair nitpicking, because the OP is referring to information that has been encoded in words.

  • We learn the ideas from each mode of input. Then, one mode can elaborate on data learned from another mode. They build on each other.

    From there, remember that text is usually a reflection of things in the real world. Understanding those things in non-textual ways both gives meaning to the text and deepens our understanding of it. Much of the text itself was even stored in other modes, like markup or PDFs, whose structure tells us things about it.

    That we learn multimodally from birth is therefore an important point to make.

    It might also be a prerequisite for AGI. It could be one of the fundamental laws of information theory or something. Text might not be enough, much as digital devices need analog to interface with the real world.

  • I understand that's the context, but I'm not sure that it's unfair nitpicking. It's common to talk about training data and how poor LLMs are compared to humans, despite LLMs seeing a dataset apparently larger than any human could absorb in a lifetime. The argument is just wrong because it doesn't properly quantify the dataset size, and when you do, you actually conclude the opposite: it's astounding how good LLMs are despite their profound disadvantage.

    • > I understand that's the context, but I'm not sure that it's unfair nitpicking.

      The OP is about much more than that, and taken as a whole, suggests the author is well aware that human beings absorb a lot more data from multiple domains. It struck me as unfair to criticize one sentence out of context while ignoring the rest of the OP.

      > It's common to talk about training data and how poor LLMs are compared to humans despite the apparently larger dataset than any human could absorb in a lifetime.

      Thank you. Like I said, I agree. My sense is the author would agree too.

      It's possible that, to overcome some of the limits we're starting to see, AI models may need to absorb a giant, endless, torrential stream of non-textual, multi-domain data, as people do.

      At the moment, we don't know.

Some people seem to be unaware that reality is analog, possibly fractal.

  • The quantum vibrations I feel against my consciousness cannot be modeled electronically!