Comment by trompetenaccoun

2 years ago

I understand why the mainstream thinks that but it's incredibly annoying that even in tech circles there is very little meaningful discussion, it's mainly just people posting amusing screenshots purportedly showing how smart GPT3 is or in other cases how it's politically biased.

Anyone who's played around with it knows that it's fun but it's not a search engine replacement and it doesn't know nor understand things. It regularly gives complete misinformation and when you ask for a source it makes up fake URLs and fake science papers on top of that.

It's nothing like described in the article and I don't understand why people who should know better don't call out the bullshit media reporting more. We've had GPT3 for ages, it's not like most of us only tried it since Chat GPT3 came out, right?

> It's nothing like described in the article and I don't understand why people who should know better don't call out the bullshit media reporting more.

I'm kind of assuming you didn't read the article; but if you did, then I'm assuming you've never done machine learning; and if you have: how did you manage that without ever noticing you were doing approximation?

Objectively, neural networks are approximators. Like, truly objectively, as in: the literal objective function objectively minimizes approximation error. We call them objective functions, and minimizing approximation error is typically the objective of those objective functions. This isn't bullshit. It isn't. If you think it is, you are deeply and profoundly mistaken.

The article advances this view of language models, and it is a reasonable view for the same reason that machine learning papers describe neural networks as universal function approximators.
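
To make the "approximator" point concrete, here's a minimal toy sketch of my own (not anything from the article): a small PyTorch MLP whose training loop does nothing except minimize mean squared approximation error against a target function. The objective function literally is the approximation error.

```python
# Minimal sketch: a neural network trained, quite literally, as an approximator.
# The objective function below is mean squared approximation error, and the
# optimizer's only job is to minimize it. (Illustrative toy, not from the article.)
import torch
import torch.nn as nn

# Target function we want the network to approximate.
def target(x):
    return torch.sin(x)

# A small MLP: the kind of architecture the universal approximation results cover.
model = nn.Sequential(
    nn.Linear(1, 64), nn.Tanh(),
    nn.Linear(64, 64), nn.Tanh(),
    nn.Linear(64, 1),
)

objective = nn.MSELoss()                       # the literal objective function
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(2000):
    x = torch.rand(256, 1) * 6.28              # sample inputs on [0, 2*pi)
    loss = objective(model(x), target(x))      # approximation error
    optimizer.zero_grad()
    loss.backward()                            # gradient of the approximation error
    optimizer.step()                           # minimize it

print(f"final approximation error: {loss.item():.5f}")
```

A language model's objective is next-token prediction rather than MSE, but the structure is the same: an objective function measuring how far the model's outputs are from the training data, pushed down by gradient descent.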

> We've had GPT3 for ages, it's not like most of us only tried it since Chat GPT3 came out, right?

For myself, I tried GPT models and read the Attention Is All You Need paper before ChatGPT. I also read analysis, for example from Gwern, about capability overhangs and the underestimation of present capabilities in these models, and in many cases I found myself agreeing with the logic. I found it was very possible to coax greater capabilities out of the models than many presumed them to have. I still find this to be the case and have demonstrated it recently in present models: for example, I posted a method for solving reasoning puzzles of a 'can contain' nature by prompting the model to write out representations of the intermediate states. Someone had claimed these language models lack that capability, yet they gain it when appropriately prompted, which suggests the capability was always in their weights but simply wasn't being exercised: present but unused, rather than absent, as claimed.
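
To be concrete about the kind of prompting I mean, here's a rough sketch. The puzzle text is made up for illustration and `ask_model` is a hypothetical stand-in for whatever completion API you use; this is the shape of the idea, not the exact prompt I posted.

```python
# Sketch of the prompting idea described above: instead of asking for the answer
# directly, the prompt asks the model to write out the containment state after
# each action. The puzzle and ask_model() are hypothetical placeholders.

PUZZLE = (
    "A box can contain a bag. A bag can contain a coin. "
    "I put the coin in the bag, put the bag in the box, "
    "then take the bag out of the box. Where is the coin?"
)

NAIVE_PROMPT = PUZZLE + "\nAnswer:"

STATE_TRACKING_PROMPT = (
    PUZZLE
    + "\nBefore answering, list the containment state after every action, "
      "one line per action, e.g. 'coin -> bag -> box'. "
      "Then give the final answer on its own line."
)

def ask_model(prompt: str) -> str:
    # Placeholder: swap in a real completion call for your model of choice.
    raise NotImplementedError

# In my experience the second prompt succeeds far more often than the first,
# which is the point: the capability is in the weights, it just has to be
# exercised by making the intermediate states explicit.
```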

That said, I don't think it matters much what most people did or didn't do with this experimentation, and, as you imply, ages really did pass. I would feel trepidation, not hope, about the quality of my ideas compared to the people who came later: historically, the passing of ages tends to improve understanding, not diminish it. If I had been experimenting with, say, flying machines in the 1700s, and then ages passed and someone who had done no such experimenting talked to me about flying machines in the early 2000s, I would expect them to be more informed than I was, not less. As a matter of course, in casual classroom settings, they have probably done better than my best experiments, including the high-effort, costly ones. Their toys fly. A generation ago we would talk about planes, but now we can also talk about their toys. It is that normal to them. They have so much better priors.

> Anyone who's played around with it knows that it's fun but it's not a search engine replacement and it doesn't know nor understand things.

I've noticed that some people talk as if the view is nonsense in the direction you imply, but I think the argument from a lack of sensory ties is much stronger when stated in the other direction.

This fails only when arguing against the straw man position, but of course, the case for drinking water also fails if you address only the weakest possible justification for it: "drinking water is bad, because if you drink enough water you die of osmosis" is, though technically true, not really reflective of the actual predictions made by the people advancing the claim that we ought to drink more water than we do.

So I'll argue both that the belief that language models can replace search engines for some queries is not nonsense, and that the position that no one could arrive at that belief is the nonsensical one.

Let's start with the claim that no one arrives at that belief. I think it is nonsense in two ways. The first way: some people have used these models and then said they would be replacing search engines for some queries, so maintaining the claim requires denying your senses with regard to the existence of those people. That is not sensory-tied, so it is non-sensical. The second way: your belief anticipates experiences, but the experience you are having right now, in which someone disagrees with you, contradicts the experience anticipated by that belief. It fails to predict past experiences and fails to predict current experiences, so it probably fails to predict future experiences: there will probably be someone in the future who, after interacting with a language model, thinks it can serve as a replacement for some search engine queries.

Now, in contrast, the people who thought they could replace some search engine queries with language model queries were not being non-sensical, and I mean that in two respects as well. First, they arrived at the belief after actually replacing queries to search engines with queries to language models; having found value in doing so, they announced a belief congruent with that value, intimately connecting their belief to their senses. Second, having arrived at their belief, the belief paid rent in anticipated experience: it successfully predicted three noteworthy events, namely the internal code-red status at Google, you.com's use of a language model as enrichment for search, and bing.com's use of a language model as enrichment for search. So it successfully predicted past experiences, successfully predicted current experiences, and may or may not predict future ones. Most people who hold this view seem to expect some refinements to the current generation of language models, in particular that they will be further refined at query time with relevant data, like fact databases, to help correct for some of the existing approximation error. This is my belief. I anticipate this happening, and you can judge whether I anticipate successfully by watching reality. I should note that part of the reason I think it will happen is that I've already seen it happen, so I'm not really making a bold prediction here; I suspect you will see it too.
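
For concreteness, here is roughly what I mean by refining at query time with relevant data: retrieve facts relevant to the query from some store, then condition the generation on them. Both `search_fact_db` and `ask_model` below are hypothetical stand-ins, not any particular product's API.

```python
# Sketch of query-time enrichment: ground the language model's answer in
# retrieved facts rather than relying on its weights alone. Both helpers
# are hypothetical stand-ins for a real fact store and a real model API.

from typing import List

def search_fact_db(query: str, k: int = 3) -> List[str]:
    # Placeholder: a real system would do keyword or embedding search
    # over a curated fact database and return the top-k passages.
    raise NotImplementedError

def ask_model(prompt: str) -> str:
    # Placeholder: a real system would call an actual language model here.
    raise NotImplementedError

def answer_with_retrieval(query: str) -> str:
    facts = search_fact_db(query)
    prompt = (
        "Answer the question using only the facts below. "
        "If the facts are insufficient, say so.\n\n"
        "Facts:\n" + "\n".join(f"- {f}" for f in facts)
        + f"\n\nQuestion: {query}\nAnswer:"
    )
    return ask_model(prompt)
```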

Anyway... the belief that language models can replace some search queries is not a non-sensical belief in the way a belief in fairies is. But the belief that no one arrives at such a belief is non-sensical, because people do arrive at it, and so the belief that they don't is fanciful and not reflective of reality.