Comment by YeGoblynQueenne

2 years ago

>> But it doesn’t mean the principles of machine learning (which are loosely derived from how the brain actually works) apply only to text data or narrow categories of data like you mentioned.

When you say "the principles of machine learning", I'd like to understand what you mean.

If I were talking about "principles" of machine learning, I'd probably mean Leslie Valiant's Probably Approximately Correct Learning (PAC-Learning) setting [1], which is probably the most popular (because the simplest) theoretical framework of machine learning [2].

Now, PAC-Learning theory is probably not what you mean when you say "principles of machine learning", nor is it any of the other theories of machine learning we have, that formalise the learnability of classes of concepts. That's clear because none of those theories are "derived from how the brain actually works", loosely or not.

Mind you, there isn't any "principle" of machine learning that I know of that is really "derived" from how the brain actually works, because we don't know how the brain actually works.

So, based on all this, I believe what you mean by "principles of machine learning" is some intuition you have about how _neural networks_ work. Those were originally defined according to the then-current understanding of how _neurons_ in the brain "work". That was back in 1943, by McCulloch and Pitts [3]; their artificial neuron was the precursor of Rosenblatt's Perceptron. Neither model is used any more and hasn't been for many years.

Still, if you are talking about neural networks, your intuition doesn't sound right to me. With neural nets, as with any other statistical learning approach, when we train on examples x of a class y, we learn the class y. If we want to learn classes y', y'', ... etc., we must train on examples x', x'', ... and so on. You have to train neural nets on examples of what you want them to learn; otherwise they won't learn what you want them to learn.
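To make that concrete, here's a toy sketch (my own example, not anyone's production code): a perceptron-style learner trained on labelled examples of two classes. It can separate the classes it was trained on, but by construction it can never output a label it was not trained on.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two classes y in {0, 1}: points clustered around (0, 0) and (3, 3).
X = np.vstack([rng.normal(0, 0.5, (50, 2)),
               rng.normal(3, 0.5, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

# Perceptron update rule: w <- w + lr * (y - y_hat) * x
w = np.zeros(3)                        # 2 weights + 1 bias term
Xb = np.hstack([X, np.ones((100, 1))]) # append constant 1 for the bias
for _ in range(20):
    for xi, yi in zip(Xb, y):
        y_hat = int(w @ xi > 0)
        w += 0.1 * (yi - y_hat) * xi

def predict(x):
    return int(w @ np.append(x, 1.0) > 0)

# It labels unseen points from the two classes it was trained on...
print(predict([0.1, -0.2]))  # near the class-0 cluster
print(predict([3.2, 2.9]))   # near the class-1 cluster
# ...but its output space is {0, 1}: it cannot emit a class y'' it never saw.
```

The same constraint holds for a deep net with a softmax over k classes: the output layer fixes the label set before training even starts.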

The same goes for all of machine learning, following from PAC-Learning: a learner is given labelled instances of a concept, drawn from a distribution over a class of concepts, as training examples. The learner can be said to learn the class if it can correctly label unseen instances of the class with high probability and bounded error, with respect to the true labelling.
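Stated formally (notation is mine, but it follows Valiant's setting in [1]):

```latex
% A concept class C is PAC-learnable if there is an algorithm A such that,
% for every target concept c \in C, every distribution D over instances,
% and every \epsilon, \delta \in (0, 1), given m = \mathrm{poly}(1/\epsilon, 1/\delta)
% labelled examples drawn i.i.d. from D, A outputs a hypothesis h with
\Pr_{S \sim D^m}\big[\, \mathrm{err}_D(h) \le \epsilon \,\big] \ge 1 - \delta,
\qquad \text{where } \mathrm{err}_D(h) = \Pr_{x \sim D}\big[\, h(x) \ne c(x) \,\big].
```

Note that the guarantee is only over the distribution D the examples were drawn from; it says nothing about instances from some other distribution or some other concept class.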

None of this says that you can train a neural net on images and have it learn to generate text, or vice-versa, train it on text and have it recognise images. That is certainly not the way that any technology we have now works.

Does the human brain work like that? Who knows? Nobody really knows how the brain works, let alone how it learns.

So I don't think you're talking about any technology that we have right now, nor are you accurately extrapolating current technology to the future.

If you are really curious about how all this stuff works, you should start by doing some serious reading: not blog posts and twitter, but scholarly articles. Start from the ones I linked below. They are "ancient wisdom", but even researchers today are lost without them. The fact that most people don't have this knowledge (because, where would they find it?) is probably why there is so much misunderstanding on the internet about what is going on with LLMs and what they can develop into in the long term.

Of course, if you don't really care and you just want to have a bit of fun on the web, well, then, carry on. Everyone's doing that, at the moment.

____________

[1] https://web.mit.edu/6.435/www/Valiant84.pdf

[2] There's also Vladimir Vapnik's statistical learning theory, Rademacher complexity, and older frameworks like Learning in the Limit etc.

[3] https://www.cs.cmu.edu/~./epxing/Class/10715/reading/McCullo...