Comment by bbor

2 years ago

Ugh I’m beginning to think I’m going to spend the next 6-12 months commenting “no, large language models aren’t supposed to somehow know everything in the world. No, that’s not what they’re designed for. Yes, hooking one up to our long-standing record-of-everything-in-the-world (google’s knowledge graph) is going to be powerful.”

It’s getting to the point where I need to consider quitting HN. This is like when my father excitedly told his friends about the coming computer revolution in the 90s and they responded “well it can’t do my dishes or clean the house, they’re just a fad!” Makes me want to screaaaaam

I appreciate where you are coming from and I agree that AI is about to go from relative obscurity where just a few geeks were playing around to insane hype. I feel like I’ve spent the last 7 years wondering why no one in the wider world was as impressed as I was, but starting with Stable Diffusion and now ChatGPT, the hype rocket ship has launched. Search TikTok for ChatGPT for all the evidence of that you could ever need.

That said, I still think we are in for a wild ride, even if we go through a hype bubble and pop first. I really don’t think the current crop of Transformer LLMs are the end of the story. I’m betting that we are headed towards architectures made up of several different kinds of models and AI approaches, just as the brain appears to be a concert of specialized regions. You can see that in the new Bing, where an LLM with a static training set can then do up to 3 web searches to build up additional context of fresh data for the prompt, overcoming one of the key disadvantages of a transformer model. The hidden prompt with plain-English Asimov-style laws is the icing on the cake.

The hype will be insane, but the capabilities are growing quickly and we do not yet seem close to the end of this rich computational ore vein we have hit.

  • I don’t have much of a response but this comment is very well written and exactly what I would say if I could write clearly. It’s an exciting time! Hopefully it turns out more like the internet and less like… idk I’m struggling to think of a bad invention haha. Quadcopters I guess

You don't need to correct every wrong thing you read. In fact you will probably feel much better if you don't ever do it at all, or at least take a break for a while.

  • Very true :). It doesn’t help that this isn’t exactly a little blog post, it’s a popular New Yorker feature…

    • A published "real" news article like this is actually one of the more futile things to try to "correct" IMO. Some guy on a blog might publish a correction or change their view. The New Yorker probably won't (at least not based on an HN comment).

      2 replies →

    • Still, that is what the downvote button is for. Writing a comment extolling why your opinion is more valid than the net points that support it seems like an exercise in ego that benefits neither you nor the community.

  • Very true, but it is hard not to be frustrated reading the constant stream of confidently incorrect information about this topic. I left /r/programming for similar reasons, and it is sad to experience the same on HN.

  • When my pedantic keyboard warrior gears start turning, I think about the same xkcd a sibling commenter posted.

    But I've been struggling with the recent tsunami of openly anti-intellectual, alt-everything pseudoscience in the US. I recently pleaded with my sister to read The Demon-Haunted World by Sagan.

    There's something important in patient advocacy of truth.

I’m just searching the comments for novel use cases where it’s effective. Most articles I’ve read seem like either moral panics or snake oil.

I like how it can generate songs and poems based on a prompt. It’s not particularly useful, but it is entertaining. It really does seem curated at times, leading me to think this will eventually become a fad or be replaced by a more advanced algorithm.

  • The most potent use case of the current implementation of ChatGPT is plagiarism.

    Businesses who need blog posts for content marketing would hire copywriters who would paraphrase my content and publish it as their own. Now they can do it faster and for free. Blogspam got cheaper and more effective. This applies to other kinds of spam too: email spam, forum spam, comment spam... Spam spam spam. I'm afraid that we'll soon drown in even more noise.

    Again, this is just the current implementation, which has not reached its full potency. We're already seeing this combined with voice and video generation. Then it's students doing their homework with it, and even professionals.

    This is a game changer. You can argue that it makes the game better or worse, but you can't deny that it's changing the rules.

    • Or less cynically, it's a really good writing assistant: a self-service tool that understands English on a level that Grammarly could only dream of.

      Not sure how to phrase an email in a professional tone? Tell GPT to write it and edit from there. This is a tool that is going to be a smash hit with non-native speakers, or really anyone who just isn't that good with words. (A minimal API sketch follows the example below.)

      > Write an email to a coworker telling them in the politest and most professional manner that they're a fucking idiot who doesn't know what they're talking about and that I'm the subject matter expert on this topic. Don't swear and have a firm but neutral tone.

      I understand that you have a great interest in the topic we were discussing earlier, and I appreciate your enthusiasm. However, I feel it is important to clarify a few points.

      As the subject matter expert on this topic, I believe it is important to note that there are certain aspects that may not be fully understood without a deeper understanding of the material. With that being said, I would like to offer my assistance in making sure that all aspects of the topic are properly understood.

      If you have any further questions or would like to discuss the topic in more detail, I am more than happy to assist. Please do not hesitate to reach out at any time.
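      And the self-service part is simple. As a minimal sketch, assuming the pre-1.0 openai Python package's ChatCompletion endpoint (the model name and instructions here are just examples, not a recommendation):

        # Sketch: drafting a professional email with the openai package's
        # chat completion endpoint (pre-1.0 API). Model name and wording
        # are illustrative examples.
        import openai

        openai.api_key = "sk-..."  # your API key

        def draft_email(instructions: str) -> str:
            """Ask the model for a polished draft, then edit by hand."""
            resp = openai.ChatCompletion.create(
                model="gpt-3.5-turbo",
                messages=[
                    {"role": "system",
                     "content": "You write firm but polite professional emails."},
                    {"role": "user", "content": instructions},
                ],
            )
            return resp["choices"][0]["message"]["content"]

        print(draft_email("Decline the Friday meeting; propose Monday instead."))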

      2 replies →

  • It's no fad. I used to spend an hour going over an email to management: make it less technical, make it sound nicer and more polite, etc. Now I can take a sentence and say 'make this more succinct; OK, now take that and make it more polite. Great, thanks.'

    I even used it for project proposals: 'write me a 1-page document for this idea'. Then I just use the scaffolding from there. It's a huge time saver.

    Also, you are not seeing the bigger picture. "I like how it can generate songs and poems based on a prompt. It's not particularly useful, but it is entertaining." Do you realize how this is going to make writing (books, songs, TV shows, movies) so much easier?

    I had a dream; all I did was give ChatGPT the basics of the dream and say 'make this into a short story'. The results weren't bad; it's definitely something one can work with. I think content creation on platforms is about to explode.

  • Want a niche use case? I fed ChatGPT the whole WarStuff one-page-rules miniature combat system, and I can ask it to generate thematic units.

    I generated a wide variety of content from LOTR to ships.

    It even created new traits to complement the existing ones when special units needed them, along with an explanation of each trait's mechanics.

    It doesn't quite understand positioning, but it will simulate a round of combat between units when asked.

    • You’re by far the most interesting person in this thread - I’ve been coding up a Warhammer 40k (symbolic) AI for a while, but I bet chatgpt could blow it out of the water…

      1 reply →

  • I have openly used LLMs to build a custom prompt engine that lets me make systematic refactors across an entire open-source codebase of mine (250 files) with a single bash command, in parallel. A rough sketch of the pattern is below.

    There are big changes on the horizon.
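    Not my actual tool, but a minimal sketch of the pattern: fan one refactoring prompt out over every file in parallel and write the model's rewrite back. Here rewrite_with_llm() is a hypothetical stand-in for whatever completion API you use.

      # Hypothetical sketch of a parallel, prompt-driven refactor.
      # rewrite_with_llm() is a stand-in, not a real API.
      from concurrent.futures import ThreadPoolExecutor
      from pathlib import Path

      PROMPT = "Rename the legacy Db class to Database and update all call sites."

      def rewrite_with_llm(prompt: str, source: str) -> str:
          raise NotImplementedError("call your LLM of choice here")

      def refactor_file(path: Path) -> None:
          original = path.read_text()
          path.write_text(rewrite_with_llm(PROMPT, original))
          print(f"refactored {path}")

      if __name__ == "__main__":
          files = list(Path("src").rglob("*.py"))  # the codebase to rewrite
          with ThreadPoolExecutor(max_workers=8) as pool:
              list(pool.map(refactor_file, files))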

Powerful for what? To use Chiang's analogy, do you think that an LLM trained on Web content will actually derive the rules of arithmetic, physics, etc.? I think it is more likely that in a decade or more a majority of Internet content will be machine-generated, and search engines will do a great job of indexing increasingly meaningless information.

  • You’re missing an important point - it’s not /trained/ on live internet content, it /reads/ that content at runtime. I mean it is trained on the internet but please try to separate the concerns. Remember that the goal of this model is language, not learning facts about the world - they could’ve trained it completely on fictional novels if there was a big enough corpus.

    The only way that LLM-enhanced search returns misinformation is if the internet is full of misinformation. So yeah we’re still in trouble, but the inclusion of the LLM isn’t going to affect that factor either way IMO

    EDIT: this is completely separate from using LLMs to, say, write political statements for Facebook bots and drown out all human conversations. That’s obviously terrifying, but not related to their use in search engines IMO.

> This is like when my father excitedly told his friends about the coming computer revolution in the 90s and they responded “well it can’t do my dishes or clean the house, they’re just a fad!” Makes me want screaaaaam

Did the computer revolution make your father’s life better?

Serious question.

  • Hmm. I’d say definitely yes. I mean we are on the internet right now, presumably across many km of distance. Do you disagree?

    • I have mixed feelings about it.

      I do believe with certainty that there are many people, millions, whose life has been made substantially worse by the invention of ubiquitous computing devices.

      Probably the minority. But I’d say his questions (ie “What’s in it for me?”) are excellent ones to pose in the face of new technology.

      2 replies →

    • That you're talking over a great distance isn't necessarily an improvement. Didn't the internet influence how much people meet in person?

> Yes, hooking one up to our long-standing record-of-everything-in-the-world (google’s knowledge graph) is going to be powerful.

This hasn't happened yet and, while I may just lack imagination, despite having a fairly solid understanding of how the latest round of AI works I can't see how it can be done successfully. Until it is in fact done and my lack of imagination is demonstrated, your "going to be powerful" is a speculation about the future, not an observation about the present, and deserves the level of respect usually accorded to such speculations.

  • In my view it’s very simple, which is what makes it so exciting. Here’s a summary of the design doc that I imagine Microsoft and Google are each spending millions of dollars’ worth of man-hours a day building their own versions of (a rough sketch in code follows the list):

    1. User enters query.

    2. LLM augments query if necessary, adding extra terms or clauses.

    3. Normal search pipeline returns ranked links, just like it does now.

    4. LLM reads the content of the first 100 links, decides which are the best based on your stated preferences and past behavior, and uses that to adjust the ranking a bit.

    5. LLM generates various summaries depending on the type of query, such as laying out a few common answers to a controversial political question or giving a summary of a technical Wikipedia article tailored to your expertise level in that field.

    6. Finally, for a tiny subset of queries, maybe the user wants to converse with the AI in a chat-like format, where it cites all of its claims with direct links.

    It’s gonna be awesome :)
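    As a rough sketch of those six steps (every helper here is a hypothetical stand-in, not any vendor’s real API):

      # Hypothetical sketch of the pipeline above; llm() and search()
      # are stand-ins for a language model and a conventional search backend.
      from dataclasses import dataclass

      @dataclass
      class Result:
          url: str
          text: str
          score: float

      def llm(prompt: str) -> str:
          raise NotImplementedError("call a language model here")

      def search(query: str) -> list[Result]:
          raise NotImplementedError("call the normal ranking pipeline here")

      def answer(query: str, profile: dict) -> str:
          expanded = llm(f"Add useful terms to this query: {query}")       # step 2
          results = search(expanded)[:100]                                 # step 3
          for r in results:                                                # step 4
              rating = llm(f"Given preferences {profile}, rate 0-1 how well "
                           f"this page answers '{query}':\n{r.text[:2000]}")
              r.score += float(rating)
          top = sorted(results, key=lambda r: r.score, reverse=True)[:5]
          sources = "\n\n".join(f"[{r.url}]\n{r.text[:2000]}" for r in top)
          return llm(f"Summarize an answer to '{query}' from these sources, "  # steps 5-6
                     f"citing each claim with its URL:\n{sources}")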

Same. I try to give myself three responses to any particular fallacy, mostly to work through my thinking. Once I have a pretty solid “you are assuming a technical limitation in a first-gen product will persist in all future evolutions, when the second gen has already addressed it” response, I try to just smile at the 4th through Nth people who discover the exact same topic.

>long-standing record-of-everything-in-the-world (google’s knowledge graph)

The internet is not a record-of-everything-in-the-world, not by a long stretch.

It's understandable how frustrating it can be to encounter skepticism and misunderstanding about the capabilities of large language models. However, it's important to remember that these models are still relatively new and not everyone is familiar with their potential uses and limitations.

It's also worth noting that these models are not designed to replace human intelligence, but rather to augment it and provide valuable insights and assistance in various tasks. And while connecting them to large knowledge graphs like Google's can be powerful, it's still only one piece of the puzzle.

It can be discouraging to face resistance, but it's important to keep in mind that advancements in technology often encounter initial skepticism before they become widely adopted. Just like the computer revolution in the 90s, it will take time for people to fully understand and appreciate the benefits of large language models.

I understand why the mainstream thinks that, but it's incredibly annoying that even in tech circles there is very little meaningful discussion; it's mainly just people posting amusing screenshots purportedly showing how smart GPT-3 is, or in other cases how politically biased it is.

Anyone who's played around with it knows that it's fun, but it's not a search engine replacement, and it doesn't know or understand things. It regularly gives complete misinformation, and when you ask for a source it makes up fake URLs and fake science papers on top of that.

It's nothing like described in the article and I don't understand why people who should know better don't call out the bullshit media reporting more. We've had GPT-3 for ages; it's not like most of us first tried it when ChatGPT came out, right?

  • > It's nothing like described in the article and I don't understand why people who should know better don't call out the bullshit media reporting more.

    I'm kind of assuming you didn't read the article, but if you did then I'm kind of assuming that you've never done machine learning, but if you have: how did you manage to do that without ever noticing that you were doing approximation?

    Objectively, neural networks are approximators. Like, truly objectively, as in: the literal objective function objectively minimizes approximation error. We call them objective functions, and minimizing the approximation error is typically the objective of these objective functions. This isn't bullshit. It isn't. If you think it is, you are deeply and profoundly mistaken.

    The article advances this view of language models. This is a reasonable view of language models for the same reason that machine learning papers exploring neural networks describe them as universal function approximators.
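    To make that concrete, here's a toy illustration (my sketch, not the article's): a one-hidden-layer network fit by gradient descent on a mean-squared-error objective, i.e. literally minimizing approximation error against a target function.

      # Toy illustration: a neural network as a function approximator.
      # The objective function is mean squared error, i.e. the approximation
      # error being minimized. (Illustrative sketch, not from the article.)
      import numpy as np

      rng = np.random.default_rng(0)
      x = rng.uniform(-3, 3, size=(256, 1))
      y = np.sin(x)                        # target function to approximate

      # One hidden layer with tanh activation.
      W1, b1 = rng.normal(size=(1, 32)), np.zeros(32)
      W2, b2 = rng.normal(size=(32, 1)), np.zeros(1)

      lr = 0.05
      for step in range(5000):
          h = np.tanh(x @ W1 + b1)         # hidden activations
          pred = h @ W2 + b2
          err = pred - y
          loss = np.mean(err ** 2)         # the objective: approximation error

          # Backpropagate the objective's gradient by hand.
          g_pred = 2 * err / len(x)
          g_W2, g_b2 = h.T @ g_pred, g_pred.sum(axis=0)
          g_h = (g_pred @ W2.T) * (1 - h ** 2)
          g_W1, g_b1 = x.T @ g_h, g_h.sum(axis=0)

          W2 -= lr * g_W2; b2 -= lr * g_b2
          W1 -= lr * g_W1; b1 -= lr * g_b1

      print(f"final approximation error (MSE): {loss:.4f}")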

  • > We've had GPT-3 for ages; it's not like most of us first tried it when ChatGPT came out, right?

    For myself, I tried GPT models and read the Attention Is All You Need paper before ChatGPT. I also read analysis, for example from Gwern, about capability overhangs and the underestimation of present capabilities in these models, and in many cases I found myself agreeing with the logic. I found it was very possible to coax greater capabilities out of the model than many presumed it to have, and I still find this to be the case in present models. For example, I recently posted a method for solving reasoning puzzles of a 'can contain' nature by prompting the model to include representations of the intermediate states. Someone had claimed these language models lack that capability, yet they gain it when appropriately prompted, which suggests they always had it in their weights but weren't exercising it: the capability was present but unused, rather than absent as claimed.
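    As a hypothetical illustration of that prompting style (the puzzle and wording here are invented for this comment, not the ones I posted):

      # Invented example of the "represent intermediate states" prompt style.
      prompt = """A box contains a bag, and the bag contains a coin.
      I move the bag into a drawer. Where is the coin?

      Before answering, write out the containment state after every step:
      State 0: box > bag > coin
      State 1: ...

      Then give the final answer."""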

    That said, I don't think it matters much what most people did or didn't do with regard to this experimentation, and, as you imply, ages really did pass. I would feel trepidation, not hope, about the quality of my ideas compared to the people who came later; historically, the passing of ages tends to improve understanding, not diminish it. If I had been experimenting with, say, flying machines in the 1700s, and then ages passed and someone who had not done that experimenting talked to me about flying machines in the early 2000s, I would expect them to be more informed, not less informed, than I was. As a matter of course in casual classroom settings, they have probably done better than my best high-effort, costly experiments. Their toys fly. A generation ago we would talk about planes, but now we can also talk about their toys. It is that normal to them. They have much better priors.

  • > Anyone who's played around with it knows that it's fun but it's not a search engine replacement and it doesn't know nor understand things.

    I've noticed that some people have been talking as if the view is nonsense in the direction you imply it is nonsense, but I think the argument for a lack of sensory ties is much stronger when stated in the other direction.

    This fails when arguing against the straw-man position, but then, drinking water also fails when you address only the weakest possible justification for it: 'drinking water is bad because if you drink enough water you die of osmosis', though technically true, isn't reflective of the actual predictions made by the people advancing the claim that we ought to drink more water than we do.

    So I'll give the argument that language models replacing search engines is not nonsense, and that the real nonsense is the position that no one could arrive at that belief.

    Let's start with the claim that no one arrives at that belief. I think it is nonsense in two ways. One: some people have used these models and then claimed they would replace search engines for some queries, so the claim requires denying your senses with regard to the existence of those people. It is not sensory-tied, so it is non-sensical. Two: your belief anticipates experiences, but the experience you are having right now, in which someone disagrees with you, contradicts the experience the belief anticipates. It fails to predict past experiences and fails to predict current experiences, so it probably fails to predict future experiences. There will probably be someone in the future who, after interacting with a language model, thinks it can serve as a replacement for some search engine queries.

    Now in contrast, their claims were not non-sensical when they concluded they could replace some search engine queries with language model queries, and in the same two respects. First, they arrived at the belief after actually replacing search engine queries with language model queries; having found value in doing so, they announced a belief congruent with that value, intimately connecting the belief to their senses. Second, the belief paid rent in anticipated experience: it successfully predicted three noteworthy events: the internal code-red status at Google, you.com's use of a language model to enrich search, and bing.com's use of a language model to enrich search. So it successfully predicted past and current experiences, and may or may not predict future ones. Most people who hold this view expect refinements of the current generation of language models, in particular that they will be further refined at query time with relevant data, such as fact databases, to correct for some of the existing approximation error. This is my belief; I anticipate this happening, and you can judge whether I anticipate successfully by watching reality. Part of the reason I think it will happen is that I have already seen it happen, so I am not really making a bold prediction; I suspect you will see it happen too.

    Anyway... the belief that language models can replace some search queries is not a non-sensical belief, the way belief in fairies is. The non-sensical belief is that no one arrives at the belief that they could, because people do arrive at such beliefs, and therefore the belief that they don't is fanciful and not reflective of reality.