Comment by comex

7 years ago

It doesn't make much sense. But compare it to the past state of the art – starting with Markov chains, proceeding to character-level RNNs like [1]. They were only able to maintain coherence for a handful of words. Full sentences were almost always nonsensical; you might be able to make sense out of them, but only with heavy application of "the human mind's natural inclination to make associations". Now compare that to GPT-2, which generates not only sentences but entire paragraphs that are most often fully self-coherent; most of the examples in this thread only start to break down across multiple paragraphs.

That's still not enough, and GPT-2 in particular has been heavily overhyped, with fanciful claims that people might use it to generate fake news. (What would the point of that be? You don't need N fake news articles to reach N people, only one, which can be written by a human.)

But it's progress. GPT-2 still feels like science fiction to me. What if a future text generator, maybe even one or two decades down the line, surpasses GPT-2 to the same extent that GPT-2 surpasses the earlier attempts I mentioned? What if that system extends the length of coherence to reliably cover entire articles and essays? What if it gets better at synthesizing the information about the world represented by its training data, so that its output on nonfiction prompts is factually true, rather than mere plausible-sounding nonsense? Is it possible? Perhaps not. But it seems a lot more likely to me now than it did before GPT-2 was created.

[1] http://karpathy.github.io/2015/05/21/rnn-effectiveness/