
Comment by nneonneo

2 years ago

This is very well written, and probably one of my favorite takes on the whole ChatGPT thing. This sentence in particular:

> Indeed, a useful criterion for gauging a large-language model’s quality might be the willingness of a company to use the text that it generates as training material for a new model.

It seems obvious that future GPTs should not be trained on the current GPT's output, just as future DALL-Es should not be trained on current DALL-E outputs, because the recursive feedback loop would just yield nonsense. But a recursive feedback loop is exactly what superhuman models like AlphaZero use. Further, AlphaZero is trained on its own output even during the phase where it performs worse than humans.

There are, obviously, a whole bunch of reasons for this. The "rules" for whether text is "right" or not are way fuzzier than the "rules" for whether a move in Go is right or not. But, it's not implausible that some future model will simply have a superhuman learning rate and a superhuman ability to distinguish "right" from "wrong" - this paragraph will look downright prophetic then.

I think what makes AlphaZero's recursion work is the objective evaluation provided by the game rules. Language models have no access to any such thing. I wouldn't even count user-based metrics of "was this result satisfactory": that still doesn't measure truth.

I generally respect the heck out of Chiang but I think it's silly to expect anyone to be happy feeding a language model's output back into it, unless that output has somehow been modified by the real world.

  • I don't expect it'll work for everything: as you say, for many topics truth must be measured against the real world.

    But, for a subset of topics, say, math and logic, a minimal set of core principles (axioms) is theoretically sufficient to derive the rest. For such topics, it might actually make sense to feed the output of a (very, very advanced) LLM back into itself. No reference to the real world is needed - only the axioms, and what the model knows (and can prove?) about the mathematical world as derived from those axioms.

    Next, who's to say that a model can't "build theory", as hypothesized in this article (via the example of arithmetic)? If the model is fed a large amount of (noisy) experimental data, can it satisfactorily derive a theory that explains all of it, thereby compressing the data down to the theoretical predictions plus residual noise? Could a hypothetical super-model be capable of iteratively deriving more and more accurate models of the world via recursive training, assuming it is given access to the raw experimental data? (A toy curve-fitting sketch of this "theory as compression" idea appears after this thread.)

    • > Next, who's to say that a model can't "build theory", as hypothesized in this article

      Well for one thing it would stop being a language model; I used that term very deliberately. It would be a different kind of model, not one that (AFAIK) we know how to build yet.

  • > Language models have no access to any such thing.

    And this is exactly why MS is in such a hurry to integrate it into Bing. The feedback loop can be closed by analyzing user interaction. See Nadella’s recent interview about this.

  • Or if it were accompanied by human-written annotations about its quality, which could be used to improve its weights. Of course, it might even be that the only available text describing some novel phenomenon was itself an LLM paraphrase (i.e. the prompt contained the novel information but has since been lost).

  • There’s a version of this where the output is mediated by humans. Currently ChatGPT has a thumbs up/down UI next to each response. This feedback could serve as a signal for which generated output may be useful for future ingestion. Perhaps OpenAI is already doing this with our thumb signals. (A minimal sketch of such a filter follows below.)
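A minimal sketch of the thumbs-up filtering idea above, assuming a hypothetical log format; this is not OpenAI's actual pipeline, just one way such a signal could gate what gets fed back for training:

```python
# Hypothetical sketch: keep only generations a human approved, and treat them
# as candidate fine-tuning examples. The record format, field names, and the
# idea that a thumbs-up is a sufficient filter are all assumptions.

def build_finetune_set(logged_interactions):
    """Filter logged (prompt, response, feedback) records by human approval."""
    approved = []
    for record in logged_interactions:
        if record.get("feedback") == "thumbs_up":
            approved.append({"prompt": record["prompt"],
                             "completion": record["response"]})
    return approved


if __name__ == "__main__":
    logs = [
        {"prompt": "Summarize this game for me", "response": "...", "feedback": "thumbs_up"},
        {"prompt": "Quote the broadcasters", "response": "...", "feedback": "thumbs_down"},
    ]
    print(build_finetune_set(logs))  # only the approved example survives
```

Even then, as noted above, approval measures user satisfaction, not truth.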
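And a toy illustration of the "theory as compression" question raised earlier in this thread: a fitted law replaces many noisy measurements with a few parameters plus small residuals. The quadratic "law" and the noise level are invented purely for the example.

```python
# Toy "theory building as compression": 200 noisy observations are replaced
# by a 3-parameter model plus residuals. The underlying law and the noise
# level are made up for illustration only.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 200)
y = 3.0 * x**2 - 2.0 * x + 1.0 + rng.normal(scale=5.0, size=x.size)  # noisy data

coeffs = np.polyfit(x, y, deg=2)        # the "theory": just three numbers
residuals = y - np.polyval(coeffs, x)   # what the theory fails to explain

print("spread of raw data:  ", round(float(y.std()), 1))
print("spread of residuals: ", round(float(residuals.std()), 1))
print("recovered law:       ", np.round(coeffs, 2))  # close to [3, -2, 1]
```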

> Indeed, a useful criterion for gauging a large-language model’s quality might be the willingness of a company to use the text that it generates as training material for a new model.

I don't find this a useful criterion. It is certainly something to worry about in the future as the snake begins to eat its own tail, but before we reach that point, we can certainly come up with actually useful criteria. First, what counts as a useful criterion? Certainly it can't be "the willingness of a company to use the text that it generates as training material for a new model", because that is a hypothetical situation contingent on the future. So we should probably start with something like: is ChatGPT useful for anything in the present? And it turns out it is!

It's both a useful translator and a useful synthesizer.

When given an analytic prompt like "turn this provided box score into an entertaining outline", it can reliably act as a translator, because the facts about the game were in the prompt.

And when given a synthetic prompt like "give me some quotes from the broadcasters", it can reliably act as a synthesizer, because the broadcasters' transcript was in fact not in the prompt. (A minimal sketch of the distinction follows the link below.)

https://williamcotton.com/articles/chatgpt-and-the-analytic-...
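As a rough illustration of that analytic/synthetic distinction (the box score and the `ask` helper are invented placeholders, not the linked article's code):

```python
# Illustrative only: the box score is invented, and `ask` stands in for any
# chat-completion API call.

BOX_SCORE = "Home 112, Away 105; Player A 38 pts, Player B 29 pts"

# Analytic prompt: every fact the model needs is already inside the prompt,
# so the task is essentially translation/reformatting of given information.
analytic_prompt = f"Turn this provided box score into an entertaining outline:\n{BOX_SCORE}"

# Synthetic prompt: the broadcast transcript is NOT in the prompt, so any
# "quotes" the model returns must be synthesized rather than retrieved.
synthetic_prompt = "Give me some quotes from the broadcasters of that game."

def ask(prompt: str) -> str:
    """Placeholder for a real chat-completion call."""
    raise NotImplementedError

# ask(analytic_prompt)  -> output grounded in the facts supplied above
# ask(synthetic_prompt) -> plausible-sounding but invented material
```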

> This is very well written, and probably one of my favorite takes on the whole ChatGPT thing.

This is not a surprise, as the author is Ted Chiang, the award-winning science fiction writer and author of "The Lifecycle of Software Objects", "Tower of Babylon", and other works. I had the pleasure of once having coffee with him while talking about his thoughts on some of the topics in "The Lifecycle of Software Objects", which is a very enjoyable book that may be of interest to some HN readers.

  • Chiang's short stories are beautiful; he reminds me of Stanislaw Lem, brilliant, creative, and ahead of his time. I was surprised they made Arrival into a movie (and that it was as good as it was).

> But, it's not implausible that some future model will simply have a superhuman learning rate and a superhuman ability to distinguish "right" from "wrong" - this paragraph will look downright prophetic then.

There is already a paper for that: https://arxiv.org/abs/2210.11610

Large Language Models Can Self-Improve

> Large Language Models (LLMs) have achieved excellent performances in various tasks. However, fine-tuning an LLM requires extensive supervision. Human, on the other hand, may improve their reasoning abilities by self-thinking without external inputs. In this work, we demonstrate that an LLM is also capable of self-improving with only unlabeled datasets. We use a pre-trained LLM to generate "high-confidence" rationale-augmented answers for unlabeled questions using Chain-of-Thought prompting and self-consistency, and fine-tune the LLM using those self-generated solutions as target outputs. We show that our approach improves the general reasoning ability of a 540B-parameter LLM (74.4%->82.1% on GSM8K, 78.2%->83.0% on DROP, 90.0%->94.4% on OpenBookQA, and 63.4%->67.9% on ANLI-A3) and achieves state-of-the-art-level performance, without any ground truth label. We conduct ablation studies and show that fine-tuning on reasoning is critical for self-improvement.
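A rough sketch of the recipe the abstract describes: sample several chain-of-thought answers per unlabeled question, keep the rationales that agree with the majority answer ("self-consistency"), and fine-tune on them. The helper functions, the sample shape, and the agreement threshold below are placeholders, not the paper's code.

```python
# Sketch of the self-improvement loop described in the abstract. sample_cot,
# finetune, the CoTSample shape, and the 0.7 agreement threshold are all
# placeholders/assumptions, not taken from the paper's implementation.
from collections import Counter
from dataclasses import dataclass

@dataclass
class CoTSample:
    rationale: str      # the chain-of-thought text
    final_answer: str   # the answer extracted from that rationale

def sample_cot(model, question: str) -> CoTSample:
    """Placeholder: draw one chain-of-thought completion from the model."""
    raise NotImplementedError

def finetune(model, examples):
    """Placeholder: fine-tune the model on (question, rationale) pairs."""
    raise NotImplementedError

def self_improve(model, unlabeled_questions, n_samples=32, min_agreement=0.7):
    training_set = []
    for question in unlabeled_questions:
        samples = [sample_cot(model, question) for _ in range(n_samples)]
        # Self-consistency: the majority answer across samples serves as the
        # pseudo-label; questions with weak agreement are dropped.
        majority, count = Counter(s.final_answer for s in samples).most_common(1)[0]
        if count / n_samples < min_agreement:
            continue
        # Keep only the rationales whose final answer matches the majority.
        training_set += [(question, s.rationale)
                         for s in samples if s.final_answer == majority]
    return finetune(model, training_set)
```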

That part made the least sense to me. Since a more advanced version of an LLM would be better at extracting the truth of things from the given data, what could it possibly gain from ingesting the output of a less precise version of itself? It couldn't ever add anything useful, almost by definition.

  • What if the new version could learn by verifying various outputs of the old version for internal consistency (or lack thereof)?

This is old. This is the reason why Google Translate sucks: it can't tell the difference between its own translations and text translated by a competent person.

GPTZero will generate theorem proofs with logical language and use the final contradiction or proof to update its weights. The logical language will be a clever subset of normal language to limit GPT's hallucinations.

You can use the generated text for further training if you have a human curator who determines its quality. I've been training my model that helps generate melodies, using some of the melodies I have created with it.

> the recursive feedback loop would just yield nonsense

An assumption disguised as fact. We simply do not know yet.

  • It's pretty evident. Its training would no longer be anchored to reality, and given its output is non-deterministic, the process would result in random drift. This can be concluded without having to test it.

    Now, if training were modified to have some other goal like consistency or something, with a requirement to continue to perform well against a fixed corpus of non-AI-generated text, you could imagine models bootstrapping themselves up to perform better at that metric, AlphaGo-style.

    But merely training on current output, and repeating that process, given how the models work today, would most certainly result in random drift and an eventual descent into nonsense. (A toy simulation of this drift appears after this thread.)

  • It might have different effects over time. E.g., in the intermediate term it emphasizes certain topics/regions, which leads to embodied mastery, but over the long term it ossifies into stubbornness and broken-record repetition. Similar to how human minds work.
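A toy simulation of the drift argument above, using a one-dimensional Gaussian as a deliberately simplified stand-in for a model retrained on its own output: each generation is fit only to samples drawn from the previous generation's fit, so nothing re-anchors it to the original data and estimation error accumulates.

```python
# Toy "training on your own output" loop: fit a Gaussian to samples drawn
# from the previous generation's fit. Nothing re-anchors the chain to real
# data, so the estimates wander away from the original distribution.
import numpy as np

rng = np.random.default_rng(42)
n = 20                        # small "dataset" per generation exaggerates the effect
mu, sigma = 0.0, 1.0          # generation 0: parameters fit to real data

for generation in range(1, 101):
    samples = rng.normal(mu, sigma, n)   # train only on the previous model's output
    mu, sigma = samples.mean(), samples.std()
    if generation % 20 == 0:
        print(f"gen {generation:3d}: mean={mu:+.3f}  std={sigma:.3f}")
```

Mixing a fixed corpus of real data back into every generation, as the earlier reply suggests, is one way to provide the anchoring this toy loop lacks.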