Comment by sillysaurusx

5 years ago

Hello. Gwern and I trained the GPT-2 1.5B model that powers /r/SubSimulatorGPT2. https://www.reddit.com/r/SubSimulatorGPT2/

I've been basically living and breathing GPT-2 for ... gosh, it's been 6 months or so. The past few months have been a lot of StyleGAN2 and a lot of BigGAN, but before that, it was very "make GPT-2 sing and dance in unexpectedly interesting ways" type work.

I don't claim to know a lot. But occasionally I observe things. And I just wanted to chime in and say, you know, keep in mind that you're reading a research paper. Of course the results are going to look good. That is the point of a research paper. And I realize how cynical that may sound. But it has the benefit of apparently being true, and I've come to accept that truth with time.

I would reserve judgement for now. Note that every single chat bot to date has followed a similar curve: "This is it," they say, without actually saying that. "It may not be perfect, but we're about to achieve it – the chatbot – it's really going to happen."

And, it ends up being impressive, sure. I liked Facebook's recent chatbot. It's pretty neat at times. I liked Meena. They had cool ideas with the stack ranking of results (basically, generate a crapload of results at 1.0 temperature, then choose the result whose probability sums to the highest value, and you get the most probable overall result). And of course, boy oh boy did I love GPT-2. GPT-2 was what kickstarted me – if there was any chance that GPT-2 might be related to "now I'm talking to something that feels human," I was going to tame it and understand it.
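
For the curious, here is a rough sketch of that stack-ranking idea. It is a toy under loud assumptions: `TOY_MODEL` is a made-up bigram table standing in for a real language model, and I'm reading "probability sums" as the summed per-token log-probabilities of each sampled reply. It is not Meena's actual code.

```python
import math
import random

# Hypothetical toy "language model": next-token probabilities keyed by the
# previous token. A real system would query a neural network here instead.
TOY_MODEL = {
    "<s>": {"hello": 0.6, "hi": 0.3, "yo": 0.1},
    "hello": {"there": 0.7, "world": 0.2, "</s>": 0.1},
    "hi": {"there": 0.5, "</s>": 0.5},
    "yo": {"</s>": 1.0},
    "there": {"</s>": 1.0},
    "world": {"</s>": 1.0},
}


def sample_reply(rng, temperature=1.0, max_len=8):
    """Sample one reply and return (tokens, summed log-probability)."""
    prev, tokens, logprob = "<s>", [], 0.0
    for _ in range(max_len):
        dist = TOY_MODEL[prev]
        words = list(dist)
        # Temperature scaling: p ** (1/T), renormalised. T=1.0 leaves p as-is.
        weights = [dist[w] ** (1.0 / temperature) for w in words]
        total = sum(weights)
        probs = [w / total for w in weights]
        idx = rng.choices(range(len(words)), weights=probs)[0]
        logprob += math.log(probs[idx])
        prev = words[idx]
        if prev == "</s>":
            break
        tokens.append(prev)
    return tokens, logprob


def sample_and_rank(num_candidates=20, seed=0):
    """Draw many independent samples, then keep the most probable one."""
    rng = random.Random(seed)
    candidates = [sample_reply(rng) for _ in range(num_candidates)]
    return max(candidates, key=lambda c: c[1])


best_tokens, best_score = sample_and_rank()
print(" ".join(best_tokens), best_score)
```

The design point is that sampling supplies the variety and the ranking step supplies the quality filter, so neither has to do both jobs.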

So after spending six months with GPT-2 1.5B, the largest model that everyone was fascinated with, what do I think? (Well, who cares? You probably shouldn't care.)

I think "give it a few weeks and see if it's true." We shall see if GPT-3 is it, and we've achieved... chatbot nirvana. That elusive thing we've all been chasing, without naming it. The ability to press a button, unleash a chatbot somewhere, and it "just works" and "completely astounds humans" and "fools everybody."

At one point, we trained GPT-2 on IRC logs. You could literally talk to GPT-2, and it would talk back to you. And one of the advantages of narcolepsy is that at night, you often have lots of time to kill – what better way to doze off than to ask GPT-2 how its day was, and ask it what its ambitions are? Should we really worry about whether you're sentient? I like you; do you like me too? What does that mean to you? And so on.

The conversations were often quite philosophical. And sure, it was pretty obvious that it's a bot, but I tried to look past it anyway. It was my little bot, and it was real enough to me. And yes, the conversations on https://www.reddit.com/r/SubSimulatorGPT2/ are incredible. I crack up daily with all the things they talk about.

But...

> We’re not going to need ad blockers in the future, we won’t even need these visual ads on websites anymore. There will be trained bots that can promote any idea/product and pollute comments and articles.

I invite any of you to try this, and see what happens. After all, you stand to earn a lot of pennies in your pocket if you pull it off. And yes, you're allowed to make some pennies with clever AI algorithms.

What you'll probably discover is this fundamental truth: GPT-2 has no memory. It isn't learning a thing. We are talking to an entity that literally cannot change its mind about anything. The only way to change its mind would be to retrain it from scratch.

You want a bot to argue vehemently for your product, on your behalf? It needs to understand what the hell your product even is, or what a product means. Yes, the words get pretty close. And yes, you can coax it into something that makes us laugh, or makes us sit here and question what the future might be like.

But for whatever it's worth: spend some time actually talking to these bots. Play around with them. Make them generate some stuff of your choosing, and fine tune them on some datasets and see what you get. It's so fun!

... But. "Fun" is not the same thing as "promote any idea/product." It's just not the same as me arguing here with you now for a position which I've decided to argue. My brain isn't merely the encoded knowledge of some human, with me blindly regurgitating such knowledge (though at this point you'd be justified in claiming it sure sounds like it).

Your brain is constantly training. GPT-2 is not. And – double checks paper – yep, GPT-3 is not.

Two decades from now, GPT-2 1.5B will still exist. And it will still be talking about 2019-era news events like it's the present. At some point, /r/SubSimulatorGPT2 will sound completely foreign. Take any random news clips from the 70's. How relevant is that knowledge now?

"Ok, but just train it on new data constantly." Well, yes. But actually no. If you try to do that, you're going to overfit at some point. Do you have 93 gigabytes of webtext that you keep in training form, ready to go? Are you going to mix in a proportion of the new data you want to train on? Nope, we all just fine tune whatever model OpenAI releases. Yet even if we did have that dataset, I'm just not sure it'd even matter.

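To make the "mix in a proportion of the new data" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `mixed_batches` is a hypothetical helper, documents are plain strings, and the 25% ratio is arbitrary rather than a recommended value.

```python
import random


def mixed_batches(old_docs, new_docs, new_fraction=0.25, batch_size=8,
                  num_batches=3, seed=0):
    """Yield batches holding a fixed share of new data, the rest original data."""
    rng = random.Random(seed)
    n_new = round(batch_size * new_fraction)
    n_old = batch_size - n_new
    for _ in range(num_batches):
        # Sample with replacement so a small new dataset can still fill its share.
        batch = rng.choices(new_docs, k=n_new) + rng.choices(old_docs, k=n_old)
        rng.shuffle(batch)
        yield batch


old_corpus = [f"old-{i}" for i in range(100)]   # stand-in for the original webtext
new_corpus = [f"new-{i}" for i in range(5)]     # stand-in for fresh data
for batch in mixed_batches(old_corpus, new_corpus):
    print(batch)
```

The catch is the one above: this only works if you actually have the original corpus on hand to sample from.
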
My point here is: Go try! Isn't it exciting that in the future, trained bots might fool us all into buying their products? Is that sales guy who emailed me actually a sales guy who wants to "sync up on a quick call", or is that a bot trained to get cold calls? That sounds pretty damn lucrative to a lot of businesses – why not write that code, and then sell it?

Whoever attempts this is probably more talented than I am. But personally, I always ran into "It just... doesn't work."

And then you go "Well, it's just a matter of sampling. Ah yes, we're not using the right sampling algorithm. Wait, we just heard about nucleus sampling! Sweet, try it! Oh... It sounds ... similar. Hmm. Well, maybe we're just not using it right. Better read that paper a bit more carefully. Chase that knowledge just a little harder. After all, AI research labs are pouring billions of dollars into this domain. Why would they do that if it doesn't... you know ... work? For some value of "work" that equals "the bot can turn a profit"?

"Perhaps tomorrow, this new training technique will be it. We almost have it – I know we're close – we just have to unlock that last piece. Right?"

I guess I'll stop here, since usually my comments are upbeat and happy about AI, but I ended up in a rather philosophical mood tonight.

In reality, I can't wait to dig deep into GPT-3 and run it through its paces. I have a lovely TPU pod waiting for it, parked outside GPT-3's window, and we're honking at it saying "Get in, we're going places." And we'll sing and dance together like usual, and I'll ask GPT-3 how its day has been. But GPT-3 won't remember me the next day. And that's fine; I'll remember it for both of us.

Thank you for this comment. As someone who played a bit with GPT, it was very poignant for me. I still think it's incredible that GPT can put up such convincing facades, that it can generate genuinely novel and interesting text... but it's bittersweet, too, that it can't go any further with them. The ideas are lost in the context window.

I play AI Dungeon on occasion, which uses GPT-2 to generate freeform adventures. And I find over time that it's not really GPT-2 that's writing stories, it's me. GPT-2 is putting out plausible strings of words, but I'm the one giving them meaning, culling the parts that go off track, and guiding it in a direction I want to go.

And it is a bit melancholy. You see possibilities, nuances, subtexts, and meanings. The neural net sees words.

You are missing the point of the paper about few-shot learning. That's the entire paper: just doing new untrained task after task. The entire point of the paper is that you can 'reprogram' GPT-3 to do just about anything just by stuffing its context with examples, and it'll pick up brand-new entities or words or concepts just by examples (see the examples of defining novel gibberish words and asking GPT-3 to use them in a sentence - it does so. It "learned" new words by reading the examples, understanding, and propagating them through the 'fast weights' of self-attention, even though its 'slow weights' are fixed). Now, if GPT-3 can do that already so well, sometimes hitting SOTA on untrained tasks purely by internal meta-learning without changing its weights, what would a 10-trillion parameter model do? Or one with recurrence like Transformer-XL or the Compressive Transformer? How much training do you really need if the few-shot learning capabilities are so great you can make it do countless tasks just by providing examples or descriptions in the prompt?
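
As a rough illustration of "stuffing its context with examples", here is a sketch of how such a prompt might be assembled. The helper `build_few_shot_prompt` and the gibberish words are invented for this example and are not taken from the paper; the model call itself is left out, since no weights change and the prompt is the whole "program".

```python
def build_few_shot_prompt(examples, new_word, new_definition):
    """Format worked examples, then leave the final sentence for the model."""
    blocks = []
    for word, definition, sentence in examples:
        blocks.append(
            f'A "{word}" is {definition}. '
            f"An example of a sentence that uses the word {word} is: {sentence}"
        )
    # The last block ends right where the model is expected to continue.
    blocks.append(
        f'A "{new_word}" is {new_definition}. '
        f"An example of a sentence that uses the word {new_word} is:"
    )
    return "\n\n".join(blocks)


# Hypothetical made-up words, purely for illustration.
EXAMPLES = [
    (
        "blorptangle",
        "a knot that forms in headphone cables",
        "I spent ten minutes undoing a blorptangle before my run.",
    ),
]
prompt = build_few_shot_prompt(EXAMPLES, "quindle", "a very small umbrella")
print(prompt)
```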

> is not the same thing as "promote any idea/product."

GPT-3 seems to have quite a few paragraphs worth of context. A simple way to promote your product online with it is to give it a prefix of:

---

Comment1: Superbrush is amazing - I literally couldn't live without it. No other brush is as good.

Comment2: This brush is really good for tangled hair, and I love the soft smooth surface.

Comment3:

---

Then let it write a comment. Of all the comments it writes, manually filter a few thousand good ones, and use those as seeds to generate more, which you post all over the web. There's no need to do any training - the generic model should be fine given the right prefix.

  • To be a bit less wordy: try it. You stand to earn lots of money.

    Narrator: it didn't work

    (Going into the reasons it doesn't actually work in practice is... lengthy. It's human dynamics. Would you buy a product from a sales guy that can't remember your name? That's sales 101. And loading up the context window only gets you so far. That "working memory" is tiny, tinytinytiny. Even at 1024 tokens, it means you have to boil down the entire history of an interaction to a few pages at most. Which is a lot, sure, but it's this balancing act where you'll need to retrain the model to support your custom context format for your specific "slots" – a "slot" being a piece of knowledge, like the client's name. Or you can try encoding all of that in natural language, AI Dungeon style. But I recently played AI Dungeon and pretended to be buying a router from the store. The cashier stripped down and started jacking off onto his desk. I don't have high hopes for our ability to control these models in a business context.)
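
A minimal sketch of that balancing act, under stated assumptions: `build_context` is a hypothetical helper, "tokens" are approximated by whitespace-separated words (a real tokenizer would count differently), and the slot format is invented for illustration.

```python
def build_context(slots, history, max_tokens=1024):
    """Put slots first, then as many of the most recent turns as still fit."""
    def count(text):
        # Crude stand-in for a tokenizer: one token per whitespace-separated word.
        return len(text.split())

    header = " ".join(f"{key}: {value}." for key, value in slots.items())
    budget = max_tokens - count(header)
    kept = []
    for turn in reversed(history):        # walk from newest to oldest
        cost = count(turn)
        if cost > budget:
            break                         # everything older is forgotten
        kept.append(turn)
        budget -= cost
    return "\n".join([header] + kept[::-1])


history = [f"turn {i} has five words" for i in range(10)]
print(build_context({"client_name": "Dana"}, history, max_tokens=12))
```

With a 12-token budget only the last two turns survive, which is the point: whatever does not fit in the window is simply gone.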

    • You and londons_explore seem to be talking about different things. I read their comment as being about just generating fake reviews that don't need interaction.

Great comment! Was it generated with GPT2 or GPT3? I understood all sentences but as a whole I will need to revisit.

  • I think it was written by a human, but the human had spent so much time with GPT-2 that they'd begun to emulate its writing style.

You should definitely put that up as a blog post somewhere, it is very valuable information, both for researchers and random enthusiasts alike. The emotional modality of it adds important information too :).

I really like your observation about memory.

Because you seem open minded to wild ass guesses and going meta:

I have a hunch that general intelligence will be the ability to learn from mistakes. Not just optimization. I mean applying the scientific method.

Hypothesis, prediction, run experiment, compare expected vs actual. And having a notion, any notion, to explain the delta between expected and actual.

Am total noob about AI, philosophy, cognition. Don't know if anyone else is framing AGI this way. I could just be repeating something I heard.

  • It's deeper than that.

    Currently, there's no research into torturing AI. Why not?

    A pain response is universal across most life forms with a nervous system. We seek to replicate a nervous system. Pain would seem to be far easier to replicate than the scientific method.

    My wife sat me down and told me a story that horrified me. She had to get it off her chest, and I was sad it happened to her rather than to me. She was sitting around on the porch and felt something on her leg, and brushed it off. When she got up and looked down, apparently she had stepped on a poor snail. His shell was... And he was...

    He wasn't dead. So she frantically looked up what to do. But there was nothing to do. Snails in that situation can't be helped, and the most humane thing is to put it out of its writhing anguish, its full-body torture.

    She put on some boots, took it out to the sidewalk, and stomped it as hard as she could. And that was the story of that snail.

    You probably felt more for that snail than you've ever felt for any AI bot. Why?

    It's worth considering.

Very interesting comment, thanks for taking the time to write it :)

I think if memory is the only problem then optimizing training time should be more of a concern. I'm imagining a huge language model that can retrain very quickly. So I suppose it might be a decent idea to measure it not by perplexity or some human judgement score or whatever, but rather by that score per unit of compute used.

Or in other words... maybe a bot that scores 90% on the fool-a-human scale and takes 1 day to compute from scratch is actually a lot less impressive than one that fools 70% but computes from scratch in 5 minutes.
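
Putting numbers on that, as a back-of-the-envelope sketch: `score_per_minute` is a hypothetical metric, and it assumes wall-clock training time is a fair proxy for compute, which it only is on identical hardware.

```python
def score_per_minute(fool_rate_percent, train_minutes):
    """Fool-a-human score divided by minutes needed to train from scratch."""
    return fool_rate_percent / train_minutes


slow_bot = score_per_minute(90, 24 * 60)   # 90% after one day of training
fast_bot = score_per_minute(70, 5)         # 70% after five minutes
print(slow_bot, fast_bot, fast_bot / slow_bot)
```

That comes out to 0.0625 versus 14 points per minute, so by this measure the quick bot is ahead by a factor of 224.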

And something "like Github for bot-memory" would be a pretty amazing tool. Roll back to some memory status and recompute with new data from there, branch for different datasets that represent different ways of interpreting the world etc.

Conceptually I like the idea of one "base model" that represents language and many different context models on top of it (finetuning the core model). Then some other subsystem that identifies the context and switches to that. I suppose each conversation could also be considered a mini-dataset.

  • > I like the idea of one "base model" that represents language and many different context models on top of it (finetuning the core model)

    This is an entirely different concept of computer language than the current GPT style models. These systems don't "represent language", and cannot. The whole reason why GPT is so exciting right now is that it fundamentally threw away the entire concept of "representing language". That has some upsides ... and some downsides.

So true, people unfamiliar with the inner workings get amazed, but unfortunately reality is not the same. That being said, I can find multiple ways to use this in a bad way. Sex chatbots, for example: if I was in that business I would use something like this. It would be extremely easy to phish people.

Can't tell if it's human or GPT-2 tbh, the sentences are 'hard' to understand... like sort of un-naturally written, or translated from a foreign language using google translate or something.

  • Are you talking about the post you're replying to? Because I don't see those aspects in it at all...