Comment by ttul

3 years ago

Ilya Sutskever has hinted in various interviews over the past few months that LLMs are surprisingly good at improving other LLMs, such that he’s not sure humans are needed anymore for refinement. That’s the matchstick that lights the fire.

Really?

Has one of these LLMs figured out yet how to inoculate other LLMs against prompt injection attacks?

  • No, but two LLMs have created a "baby LLM" that speaks fluent English, albeit at the level of a 5-year-old, with only 10M weights. This breaks the barrier on the minimal size needed for language fluency. It can even do some reasoning and follows the same scaling laws.

    GPT-3.5, let's call her mommy, created small stories. The small model was trained on this dataset of 2M tiny stories. Then it was evaluated by GPT-4 (daddy). So no humans were needed for either dataset generation or evaluation.

    TinyStories https://arxiv.org/abs/2305.07759

    This makes me think LLMs are self-replicators in software. An LLM can pull from itself training text, LLM code, and fine-tuning examples. Then it can monitor its own re-training. It understands neural networks and can propose changes. It can run an evolutionary search program.

    All it needs is compute. It can't make GPUs, just as no single human or company or even country can. The GPU supply chain is long, distributed and requires global cooperation. Maybe that's what is going to save us.

    • But this experiment didn't lead to a marked improvement on the way to superintelligence, now did it? A set of LLMs, set up for this task by humans, managed to create a smaller LLM that is just as much a transformer-based sequence predictor, with the same basic flaws.

      That isn't self-improvement in the sense in which the explosive self-improvement of a superintelligent AGI is usually described.

      That is painting a car a new color. It's a new coat of paint, it may look very good, and the effect may be desirable and useful. But it's still a car, and no closer to a warp-capable spaceship than before.

    • > An LLM can pull from itself training text, LLM code, and fine-tuning examples. Then it can monitor its own re-training.

      You can train a weaker model on output from a stronger one, but can you train an LLM on its own output?

  • Isn’t that like asking a dog to invent a better leash?

    (Note: A prompt injection attack releases an LLM from its handler’s constraints.)