Comment by _pdp_
1 day ago
If it is as dangerous as they make it appear, 24h does not seem like sufficient time. I cannot accept this as a serious attempt.
Agreed. I've been running autonomous LLM agents on daily schedules for weeks. The failure modes you worry about on day one are completely different from what actually shows up after the agents have history and context. 24 hours captures the obvious stuff.
24h before general internal access seems fine. They don't have general external access.
Time doesn't mean much; what matters is what they did in those 24 hours. If all they did was talk about it, it could be 1000 years and it wouldn't matter. What safety checks are in place?
Do they have honeypot infrastructure to launch the model into first, then watch to see whether it destroys it? What they did in those 24 hours is what matters.
Well, just prompt it to fix the issue!
/s