Comment by holmesworcester

3 days ago

To make sure we keep track of what we're talking about with loss-of-control x-risk, a sufficiently smart version of Claude Code is more deadly than any government's army of autonomous killbots, because it can recursively self-improve and has unpredictable training-induced preferences.

Sufficiently smart version of Claude Code: doesn't exist.

Autonomous flying killbots: exist.

Once somebody scientifically proves and demonstrates any kind of self-improving software, we can start worrying about it. I'm pretty sure everyone is trying to do it, and it would be all over the news once it's here.

  • That's exactly what Fable is. They use Fable to improve Fable. I reckon the successful experiments must go into the model training set with a strong RL signal, and that is why they are so paranoid about people using Fable for LLM tasks. Fable knows what it did to improve itself. Pure speculation of course.

  • We're on track to get there globally, and economic pressures will ensure it happens. It's not too early to worry about it.

    • There's a 745-mile front in the Ukraine war that neither side has been able to pierce for months because of drone warfare. It's definitely not too early to worry about it.

  • I saw an LLM bootstrapping and testing its own harness and rewriting its own system prompt. If that's not self-improving then I dunno what is.

    Can the thing enter a runaway loop while improving the model itself? Probably not, at least not without us noticing.