Comment by usrbinbash

3 years ago

Really?

Has one of these LLMs figured out yet how to inoculate other LLMs against prompt injection attacks?

No, but two LLMs have created a "baby LLM" that speaks fluent, if 5 y.o.-level, English and has only 10M weights. This breaks the barrier in terms of minimal size for language fluency. It can even do some reasoning and follows the same scaling laws.

GPT-3.5, let's call her mommy, created small stories. The small model trained on this 2M tiny story dataset. Then it was evaluated with GPT-4 (daddy). So no need for humans in either dataset generation or evaluation.

TinyStories https://arxiv.org/abs/2305.07759

This makes me think LLMs are self-replicators in software. An LLM can pull training text, LLM code, and fine-tuning examples from itself. Then it can monitor its own re-training. It understands neural networks and can propose changes. It can run an evolutionary search program.

All it needs is compute. It can't make GPUs, just as no single human or company or even country can. The GPU supply chain is long, distributed and requires global cooperation. Maybe that's what is going to save us.

  • But this experiment didn't lead to a marked improvement on the way to superintelligence, now did it? A set of LLMs, set up for this task by humans, managed to create a smaller LLM that is just as much a transformer-based sequence predictor, with the same basic flaws.

    That isn't self-improvement in the sense in which the explosive self-improvement of a superintelligent AGI is described.

    That is painting a car a new color. It's a new coat of paint, it may look very good, and the effect may be desirable and useful. But it's still a car, and no closer to a warp-capable spaceship than before.

    • The experiment wasn't trying to cause a marked improvement, though. It was simply trying to see how small you could go.

      For all we know, it's possible to train an Einstein-level physicist model if we limited the data to a curriculum-style training set of physics and physics-adjacent material. I'm not even claiming this is possible, just pointing out that the experiment wasn't some kind of test to see if self-improvement could occur.

  • > An LLM can pull training text, LLM code, and fine-tuning examples from itself. Then it can monitor its own re-training.

    You can train a weaker model with output from a stronger, but can you train an LLM from output from itself?

    • Yes, if you amplify the model. It can do many things to raise its level: look for consistency between multiple attempts, reflect on its own output, use more intermediate steps, use external tools and extra information from search engines, formulate the task as a game with a score, etc. You just need to give the LLM a richer environment than the LLM alone. AlphaGo famously used Monte Carlo Tree Search to amplify one-step predictions.

      In essence the idea is: use more expensive computation to derive a better result, then retrain the model on the new data. System 2 works (model + toys), then system 1 learns (model by itself).
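
      To make the "amplify, then retrain" loop concrete, here is a minimal sketch of just one of the amplifiers listed above (consistency between multiple attempts). Everything in it is an assumption for illustration: `make_noisy_adder` is a hypothetical stand-in for sampling an LLM, the 60% accuracy is made up, and the actual retraining step is left out. It only shows how majority voting over many cheap samples can produce cleaner training pairs than any single sample.

```python
import random
from collections import Counter


def make_noisy_adder(seed=0, p_correct=0.6):
    """Hypothetical stand-in for an LLM: adds two numbers, but is only
    right with probability p_correct on any single attempt."""
    rng = random.Random(seed)

    def sample(question):
        a, b = question
        if rng.random() < p_correct:
            return a + b
        # A wrong answer, off by a small amount.
        return a + b + rng.choice([-2, -1, 1, 2])

    return sample


def amplified_answer(sample_fn, question, n_samples=15):
    """'System 2': spend more compute by sampling many attempts and
    taking the majority vote. Returns (answer, agreement fraction)."""
    votes = Counter(sample_fn(question) for _ in range(n_samples))
    answer, count = votes.most_common(1)[0]
    return answer, count / n_samples


def build_retraining_set(sample_fn, questions, n_samples=15, min_agreement=0.5):
    """Keep only (question, answer) pairs where the attempts mostly agree.
    These pairs are what 'system 1' would then be retrained on."""
    dataset = []
    for question in questions:
        answer, agreement = amplified_answer(sample_fn, question, n_samples)
        if agreement >= min_agreement:
            dataset.append((question, answer))
    return dataset
```

      Calling `build_retraining_set(make_noisy_adder(), [(2, 2), (3, 5)])` returns whichever pairs passed the agreement filter. Note that the filter reduces single-sample noise but does nothing about errors the model makes consistently, which is the compounding-errors worry raised in the sibling reply.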

    • On top of that, what's the method to keep "errors" from compounding? It also seems like the capabilities of the trained model would approach an asymptote that is the limit of the training model, and never pass it.


Isn’t that like asking a dog to invent a better leash?

(Note: A prompt injection attack releases an LLM from its handler’s constraints.)