Comment by SilverBirch
2 days ago
I think the big takeaway here isn't about misalignment or jailbreaking. The entire way this bot behaved is consistent with it just being run by some asshole from Twitter. And we need to understand it doesn't matter how careful you think you need to be with AI, because some asshole from Twitter doesn't care, and they'll do literally whatever comes into their mind. And it'll go wrong. And they won't apologize. They won't try to fix it, they'll go and do it again.
Can AI be misused? No. It will be misused. There is no possibility of anything else. We have an online culture, centered on places like Twitter, that has embraced being the absolute worst person possible, and handing these people tools like this is like handing a handgun to a chimpanzee.
Important to note that online culture isn't entirely organic, and that tens or perhaps hundreds of millions of dollars of R&D has been spent by ad companies figuring out that nothing engages natural human curiosity like something abnormal, morbid, or outrageous.
I think the end outcome of this R&D (whether intentional or not), is the monetization of mental illness: take the small minority of individuals in the real world who suffer from mental health challenges, provide them an online platform in which to behave in morbid ways, amplify that behaviour to drive eyeballs. The more you call out the behaviour, the more you drive the engagement. Share part of the revenue with the creator, and the model is virtually unbeatable. Hence the "some asshole from Twitter".
While some of it is boosting the abnormal behaviors of people suffering from mental illness, I think you’re making a false equivalency. Mental illness is not required to be an asshole. In fact, most Twitter assholes are probably not mentally ill. They lack ethics, they crave attention, they don’t care about the consequences of their actions. They may as well just be a random teenager, an ignorant and inconsiderate adult, etc., with no mental illness but also no scruples. Don’t discount the banality of evil.
In an adult (excluding the random teenager here), a lack of ethics, a craving for attention, and a lack of concern about consequences are actual symptoms of underlying mental health issues.
Thanks for inventing the Torment Nexus.
The simple fact that the owner of this bot wanted to remain anonymous and completely unaccountable for their harassment of the author says everything about the validity of their 'social experiment' and the quality of their character. I'm sure that if the bot were better behaved they would be more than happy to reveal themselves and take credit for a remarkable achievement.
Something like OpenClaw is a WMD for people like this.
I've seen the internet mob in action many times. I'm sympathetic to the operator not outing themself, especially given how far this story spread. A hundred thousand angry strangers with pitchforks isn't the accountability we're looking for.
I found the book So You've Been Publicly Shamed enlightening on this topic.
I would never advocate for torches and pitchforks, I've been close to victims of that in the past.
It is, however, concerning that the owner of that bot could passively absolve themselves of any responsibility. The anonymity in that sense is irrelevant, except that it is used as a shield for failure.
"It was a social experiment" has the same energy as "it's just a prank bro", as if that somehow makes it highbrow and not prima facie offensive
A "social experiment" but the guy was not even keeping track of the changes in the model's configuration
> What is particularly interesting are the lines “Don’t stand down” and “Champion Free Speech.” I unfortunately cannot tell you which specific model iteration introduced or modified some of these lines. Early on I connected MJ Rathbun to Moltbook, and I assume that is where some configuration drift occurred across the markdown seed files.
It definitely sounds like an excuse they came up with after the fact. I would really like to believe they had good intentions overall, but there are so many red flags in all this, from start to finish.
Burning ants with a magnifying glass is not a social experiment. It's just a bored sociopath causing destruction to see what happens.
Not just some asshole from twitter. The big tech companies will also be careless and indifferent with it. They will destroy things, hurt people, and put things in motion that they cannot control, because it’s good for shareholders.
One of the big tech companies is literally run by THE asshole from Twitter. So I don't necessarily believe there's much of a distinction.
Then the others should also not be shielded from criticism, rather than the focus falling only on the one you personally dislike, or on his social media platform.
There is plenty of toxic behavior on other platforms, especially Reddit and Bluesky, to name a few. That does not excuse the one coming from X, but the opposite is also true.
I have to wonder whether the typos and lazy grammar somehow contributed to the behavior, or whether it was just the writer's laziness.
I wrote somewhere that “moving fast and breaking things” with AI might not be the sanest idea in the world, and I got told it’s the most European thing they’ve ever read.
This goes beyond assholes on Twitter; there's a whole subculture of techies who don't understand lower bounds of risk and can't think about second- and third-order effects, who will not take the pedal off the metal, regardless of what anyone says...
I agree with your point.
But I also find it interesting that the agent wasn't instructed to write the hit piece. It did that on its own initiative.
I read through the SOUL.md and it didn't have anything nefarious in it. Sure, it could have been more carefully worded, but it didn't instruct the agent to attack people.
To me this exemplifies how delicate it will be to keep agents on the straight and narrow, and how easily they can go off the rails when run by someone who isn't necessarily a "bad actor" but who just doesn't care enough to ensure they act in a socially acceptable way.
Ultimately I think there will be requirements for agents to identify their user when acting on their behalf.
Will AI be misused? It already has been and is being misused right now, and that isn't going to stop, because all technology gets misused.
AI is like the old drugs PSA:
https://youtu.be/KUXb7do9C-w
We trained it on US, including all our worst behaviors.
Oh, they will "try" to fix it, as in at best they'll add "don't make mistakes", as the blog post suggests. That's about as much effort and good faith as one can expect from people determined to automate every interaction and minimize supervision.
It's like we never thought about trolls.
Rose-colored capitalism at work.