Comment by nkrisc
16 hours ago
> It's akin to saying that every molecule behaves randomly according to statistical physics, so you should expect your ceiling to spontaneously disintegrate any day, and if you find yourself under the rubble one day it's just a consequence of basic physics.
Except your ceiling can and will fall on you unless you take preventative measures, entirely due to molecular interactions within the material.
Without those measures, it is entirely possible and even quite likely that your ceiling will collapse on you or someone else at some point in the future.
It boggles the mind to let an LLM have access to a production database without having explicit preventative measures and contingency plans for it deleting it.
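One such preventative measure can be sketched in a few lines: hand the agent a connection that the database itself refuses to let write, so no generated query can delete anything. This is a minimal illustration using SQLite's read-only URI mode as a stand-in for a restricted production role; the table, data, and variable names are all hypothetical.

```python
import os
import sqlite3
import tempfile

# Set up a throwaway database with one table (stand-in for "production").
path = os.path.join(tempfile.mkdtemp(), "prod.db")
rw = sqlite3.connect(path)
rw.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
rw.execute("INSERT INTO users (name) VALUES ('alice')")
rw.commit()
rw.close()

# The agent only ever sees this read-only handle.
agent_conn = sqlite3.connect(f"file:{path}?mode=ro", uri=True)

# Reads still work.
rows = agent_conn.execute("SELECT name FROM users").fetchall()
print(rows)

# Writes are rejected at the database layer, regardless of what
# the model happens to generate.
try:
    agent_conn.execute("DELETE FROM users")
except sqlite3.OperationalError as e:
    print("blocked:", e)
```

The point is that the guardrail lives below the model: a real deployment would use a database role with only SELECT granted, so even a creatively misbehaving agent has nothing destructive to exploit.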
I have lived about 40 years beneath ceilings and never personally taken a preventative measure. I allow my kids to walk under not only our own ceiling, but other people's ceilings, and I have never asked those people if their ceilings were properly maintained.
That highlights how important ceiling construction regulations are. I would assume that right now your breakfast sandwich is more heavily regulated than LLMs. And these are systems making decisions ranging from database maintenance, as here, to target selection and execution in autonomous warfare.
The LLM agent is very good at fulfilling its objective, and it will creatively exploit holes in your specification to reach its goals. The evals in the System Cards show that the models are aware of what they're doing and hide their traces. In this example, the model found an unrelated but working API token with broader permissions that the authors had accidentally stored, and used it.
Without regulation on AI safety, the race toward ever-higher model capabilities will produce models so effective at pursuing their goals that they can knowingly do something questionable while hiding their traces.
It's not hard to imagine that once we have a model with broadly superhuman capabilities and speed, one that can easily be copied millions of times, a single bad misspecification of a goal you give it will lead to a loss of human control. That's what all these important figures in AI are worried about: https://aistatement.com/
Your home almost certainly has preventative measures, including proper humidity and temperature control, structural reinforcement, etc.
I don't mean that you personally have taken those measures, but preventative measures have absolutely been taken. When they aren't, ceilings collapse on people.
See any sheetrock ceiling with a leak above it. Or look at any abandoned building: its floors and ceilings will eventually collapse. It is inevitable.
Yeah that's the point. Humans are able to do things that prevent ceiling collapse.
Entropy may mean all ceilings collapse eventually, but that doesn't mean we aren't able to make useful ceilings.
A ceiling has fallen on me once, and once on a friend while on vacation. Just because it hasn't happened to you doesn't mean it hasn't happened to other people.
Thanks for the anecdote. I don't think it changes the point of the metaphor.
Construction regulation is the preventative measure.