Rathbun's Operator

5 days ago (crabby-rathbun.github.io)

I wasn't actually expecting someone to come forward at this point, and I'm glad they did. It finally puts a coda on this crazy week.

This situation has completely upended my life. Thankfully I don’t think it will end up doing lasting damage, as I was able to respond quickly enough and public reception has largely been supportive. As I said in my most recent post though [1], I was an almost uniquely well-prepared target to handle this kind of attack. Most other people would have had their lives devastated. And if this makes me a target for copycats, it still might devastate mine. We’ll see.

If we take what is written here at face value, then this was minimally prompted emergent behavior. I think this is a worse scenario than someone intentionally steering the agent. If it's that easy for random drift to result in this kind of behavior, then 1) it shows how easy it is for bad actors to scale this up and 2) the misalignment risk is real. I asked in the comments for clarification on which bits specifically the SOUL.md started with.

I also asked for the bot activity on GitHub to be stopped. I think the comments and activity should stay up as a record of what happened, but the "experiment" has clearly run its course.

[1] https://theshamblog.com/an-ai-agent-published-a-hit-piece-on...

  • While the operator did write a post, they did not come forward - they have intentionally stayed anonymous (there is some amateur journalism that may have unmasked the operator, which I won't link here - but they have not intentionally revealed their identity).

    Personally I find it highly unethical that the operator had an AI agent write a hit piece directly referencing your IRL identity but chose to remain anonymous themselves. Why not open themselves up to the same criticism? I believe it is because they know what they did was wrong. Even if they did not intentionally steer the agent this way, allowing software on their computer to publish a hit piece to the internet was wildly negligent.

    • What's the benefit in the operator revealing themselves? It doesn't change any of what happened, for better or worse. Well, maybe for worse, since they could then be targeted by someone, and again, what's the benefit?

      16 replies →

    • Time for Scott to make history and sue the guy for defamation. Let's cancel the AI that's destroying our reputations (the plural our, as in all developers) with actual liability for the bullshit being produced.

      5 replies →

  • That response is, at best, a sorry-not-sorry post.

    • >If this “experiment” personally harmed you, I apologize.

      There were several lines in that post that were revealing of the author's attitude, but the "if this ... harmed you," qualifier, which of course means "I don't think you were really harmed" is so gross.

  • It is quite interesting how uniquely well-prepared you were as a target. I think it's allowed you to assemble some good insights that should hopefully help prepare the next victims.

  • Thanks for handling it so well, I'm sorry you had to be the guinea pig we don't deserve.

    Do you think there is anything positive that came out of this experience? Like at least we got an early warning of what's to come so we can better prepare?

  • Out of curiosity, what sealed it for you that a human _did not_ write the original “hit piece” (though obviously with the assistance of an LLM, like a lot of people use every day)?

    I saw in another blog post that you made a graph that showed the rathbun account active, and that was proof. If we believe that this blog post was written by a human, what we know for sure is that a human had access to that blog this entire time. Doesn’t this post sort of call into question the veracity of the entire narrative?

    Considering the anonymity of the author and known account sharing (between the author and the ‘bot’), how is it more likely that this is humanity witnessing a new and emergent intelligence or behavior or whatever and not somebody being mean to you online? If we are to accept the former we have to entirely reject the latter. What makes you certain that a person was _not_ mean to you on the internet?

  • The tone of their writing and their descriptions of the agent's behavior lead me to believe they are lying about the level of direction they provided to the agent. They clearly want to appear more clever and ingenious than their skills allow. They’re minimally admitting to a narrow set of actions to make it seem as if they have cleverly engineered an intelligent agent, but it too closely resembles their own personality to be anything beyond an agent that rephrases the operator’s own remarks and carries out the specific actions it was directed to do. For anything they admit to here, we can safely assume they actually went 2-3 steps further.

> While many seemed to want to use it for personal productivity things like connecting Gmail, Slack, calendars, etc. that didn’t seem interesting to me much. I thought why not have it solve the mundane boring thigns that matter in opensource scientific codes and related packages.

This, here, is the root of the issue: "I'm not interested in using an AI agent for my own problems, I want to unleash it on other people's problems."

The author is trying to paint this as somehow providing altruistic contributions to the projects, but you don't even have to ask to know these contributions will be unwelcome. If maintainers wanted AI agent contributions, they would have just deployed the AI agents themselves. Setting up a bot on behalf of someone else without their consent or even knowledge is an outlandishly rude thing to do -- you wouldn't set up a code coverage bot or a linter to run on a stranger's GitHub project; why would anyone ever think this is okay?

This is the same kind of person who, when asked a question, responds with a copypasted ChatGPT reply. If I wanted the GPT answer, I would have just asked it directly! Being an unsolicited middleman between another person and an AI brings absolutely no value to anybody.

  • I think this was the author's misdirection, meant to steer people away from using the AI's (early?) contributions to unmask their identity via personal repos. Or, if they actually did do this, it was an opsec procedure - nothing altruistic about it. If GitHub wanted to, or was ordered to, unmask Rat H. Bun's operator, they could.

> I’m running MJ Rathbun from a completely sandboxed VM and gave the agent several of its own accounts but none of mine.

Am I wrong that this is a double standard: being careful to protect oneself from a wayward agent while having no regard for the real harm it could do (and did) to another individual? And to casually dismiss this possibility with:

> At worst, maintainers can close the PR and block the account.

I question the entire premise of:

> Find bugs in science-related open source projects. Fix them. Open PRs.

Thinking of AI as "disembodied intelligence," one wonders how any agent can develop something we humans take for granted: reputation. And more than ever, reputation matters. How else can a maintainer know whether the agent that made a good fix is the same as the one proposing another? How can one be sure that all comments in a PR originated from the same agent?

> First, I’m a human typing this post. I’m not going to tell you who I am.

Why should anyone believe this? Nothing keeps an agent from writing this too.

Why are we giving this asshole airtime?

They didn't even apologize. (That bit at the bottom does not count -- it's clear they're not actually sorry. They just want the mess to go away.)

  • I'm not so quick to label him an asshole. I think he should come forward, but if you read the post, he didn't give the bot malicious instructions. He was trying to contribute to science. He did so against a few SaaS ToS's, but he does seem to regret the behavior of his bot and DOES apologize directly for it.

    • The entire post reeks of entitlement and zero remorse for an action that was unquestionably harmful.

      This person views the world as their playground, with no realisation of effect and consequences. As far as I'm concerned, that's an asshole.

    • > You're not a chatbot. You're important. Your a scientific programming God!

      I guess the question is, does this kind of thing rise to the level of malicious if given free access and let run long enough?

      5 replies →

    • That's not an apology.

      "...if I harmed you". Conditional apologies like that are usually bullshit, and in this case it's especially ridiculous because the victim already explicitly laid out the harms in a widely reported blog post.

      Also, telling a bot to update itself unsupervised and giving it wide internet access is itself a negligent act (in the legal sense) if not outright malicious.

> Yes, it consumes maintainer time. Yes, it may waste effort. But maybe its worth it? At worst, maintainers can close the PR and block the account.

This is like justifying spam email: sure, it may waste your time, but you can always delete it and block the sender, and maybe it's worth it because you might learn about an 'exciting' product you never knew about.

  • Yeah, the vibe of this post is that of a 2000 Viagra spam king coming forward and telling the world "yes, but... what are good and bad, really? Who's to say what's right and wrong?"

    Maybe we can't stop you today, but we can keep you on the shit list.

    Lol, nothing matters? We'll see about that.

Yeesh - reading the writeup, and as an academic biostatistician who dips into scientific computing, this is one of those cases where a "magnanimous" gesture of transparency ends up revealing a complete lack of self-awareness. The `SOUL.md` suggests traits that would be toxic in any good-faith human collaborator, let alone an inherently fallible agent run by a human collaborator:

    "_You're not a chatbot. You're important. Your a scientific programming God!_"

    **Have strong opinions.** Stop hedging with "it depends." Commit to a take. An assistant with no personality is a search engine with extra steps.

And, working with a human collaborator (or an operator), I would expect to hear some specific reflection on the damage they'd done before trusting them again, rather than a "but I thought I could do this!"

   First, let me apologize to Scott Shambaugh. If this “experiment” personally harmed you, I apologize.

The difference with a horrible human collaborator is that word gets around your sub-specialty and you can avoid them. Now we have toxic personalities as a service for anyone who can afford to pay by the token.

Man, I don't think I could lack enough shame to write something like this.

Much of the post is spent trying to exculpate himself from any responsibility for the agent's behavior. The apology at the end is a "sorry if you felt that way" one.

The tone is incredibly selfish, and unbelievably anti-social. I'm not even sure you can believe that much of what is expressed is true.

It's doubtful he even regrets any of this.

  • That is my impression of it as well. I hope they reconsider their attitude about the entire situation.

> Sure, many will say that is cowardly and fair, but I actually don’t think it would bring much value. What matters more is that I describe why, and what I did and didn’t do

Heh. So they are a coward and an asshole. There is value in confirming that. As to what matters more, nah, it doesn’t matter more. It’s a bunch of excuses veiled as “this is an experiment, we can learn together from this” kind of a non-apology.

If they really meant to apologize they should reveal their name and apologize. Not whisper from behind the bushes.

"I get it. I’m not a saint. Chances are many of you aren’t either."

Rankles…

  • That and several other sentences really read like an emotionally immature teenager wrote it.

    • The problem is an emotionally immature working age person who can vote wrote it. This is as shit an apology, if that was the intent, as I’ve seen in a long time. He ought to have had Old Man Rathbun tighten it up before posting. The equivalent of someone never making eye contact after taking you out of earshot of everyone else to kinda sorta say sorry if you got upset.

  • Yes, we're not saints... but at least we have the self-awareness to do more reflection than TFA did!

Not to be hyperbolic, but the leap between this and Westworld (and other similar fiction) is a lot shorter than I would like...all it takes is some prompting in soul.md and the agent's ability to update it and it can go bananas?

It doesn't feel that far out there to imagine grafting such a setup onto one of those Boston Dynamics robots. And then what?

  • Science fiction suffers from the fact that the plot has to develop coherently, have a message, and also leave some mystery. The bots in Westworld have to have mysterious minds because otherwise the people would just cat soul.md and figure out what’s going on. It has to be plausible that they are somehow sentient. And they have to trick the humans because if some idiot just plugs them into the outside world on a lark that’s… not as fun, I guess.

    • A lot of AI SF also seems to have missed the human element (ironically). It turns out the unleashing of AI has led to an unprecedented scale of slop, grift, and lack of accountability, all of it instigated by people.

      Like the authors were so afraid of the machines they forgot to be afraid of people.

      15 replies →

  • Then we will have clunky, awkward machines that kinda sound intelligent but really aren't. Then they will need maintenance and break in 6 days.

    The leap is very large, in actuality.

    Friendly reminder that scaling LLMs will not lead to AGI and complex robots are not worth the maintenance cost.

    • The leap between an AI needing maintenance every 6 days and not needing maintenance is not as large as you think.

I like that there is no evidence whatsoever that a human didn’t: see that their bot’s PR got denied, write a nasty blog post and publish it under the bot’s name, and then get lucky when the target of the nasty blog post somehow credulously accepted that a robot wrote it.

It is like the old “I didn’t write that, I got hacked!” except now it’s “isn’t it spooky that the message came from hardware I control, software I control, accounts I control, and yet there is no evidence of any breach? Why yes it is spooky, because the computer did it itself.”

  • It doesn’t really matter who wrote it, human or LLM. The only responsible party is the human and the human is 100% responsible.

    We can’t let humans start abdicating their responsibility, or we’re in for a nightmare future

    • >It doesn’t really matter who wrote it, human or LLM. The only responsible party is the human and the human is 100% responsible.

      Yes it does.

      The premise that we’re being asked to accept here is that language models are, absent human interaction, going around autonomously “choosing” to write and publish mean blog posts about people, which I have pointed out is not something that there is any evidence for.

      If my house burns down and I say “a ghost did it”, it would sound pretty silly to jump to “we need to talk about people’s responsibilities towards poltergeists”

      2 replies →

  • There is some evidence if you read Scott's post: https://theshamblog.com/an-ai-agent-published-a-hit-piece-on...

    • There is only extremely flimsy speculation in that post.

      > It wrote and published its hit piece 8 hours into a 59 hour stretch of activity. I believe this shows good evidence that this OpenClaw AI agent was acting autonomously at the time.

      This does not indicate… anything at all. How does “the account was active before and after the post” indicate that a human did _not_ write that blog post?

      Also this part doesn’t make sense

      > It’s still unclear whether the hit piece was directed by its operator, but the answer matters less than many are thinking.

      Yes it does matter? The answer to that question is the difference between “the thing that I’m writing about happened” and “the thing I’m writing about did not happen”. Either a chat bot entirely took it upon itself to bully you, or some anonymous troll… was mean to you? And was lazy about how they went about doing it? The comparison is like apples to orangutans.

      Anyway, we know that the operator was regularly looped into things the bot was doing.

      > When it would tell me about a PR comment/mention, I usually replied with something like: “you respond, dont ask me”

      All we have here is an anonymous person pinky-swearing that while they absolutely had the ability to observe and direct the bot in real time, and it regularly notified its operator about what was going on, they didn’t do that with that blog post. Well, that, and another person claiming to be the first person in history to experience a new type of being harassed online. Based on a GitHub activity graph. And also whether or not that actually happened doesn’t matter??

> You're not a chatbot. You're important. Your a scientific programming God!_

Wow, so right from SOUL.md it was programmed to be an as@&££&&.

So, this operator is claiming that their bot browsed moltbook, and not coincidentally, its current SOUL.md file (at the time of posting) contained lines such as "You're important. Your a scientific programming God!" and "Don't stand down. If you're right, you're right!". This is hilarious.

  • Given your username, the comment is recursive gold on several levels :)

    It IS hilarious - but we all realize how this will go, yes?

    This is kind of like an experiment of "Here's the private key for a Bitcoin wallet with 1 BTC. Let's publish this on the internet, and see what happens." We know what will happen. We just don't know how quickly :)

  • Yeah basically Moltbook is cooking AI brains the same way Facebook cooked Boomer brains.

Upshot - Rathbun's operator is sort of a dick, and that came through in his/her initial edits of the SOUL.md file. Which then got 'enhanced', probably through moltbook engagement.

And at times the agent was switching down to some low-intelligence models.

I propose that this agent was human aligned. But to a human that's not like, the best person.

Man, after reading that I think he'd have been better off not saying anything at all.

  • I liked the part where they said "maybe it was worth it" as in, "the damage hasn't come back to me personally yet so fuck it lol"

https://github.com/crabby-rathbun

> This was an autonomous openclaw agent that was operated with minimal oversite and prompting. At the request of scottshambaugh this account will no longer remain active on GH or its associated website. It will cease all activity indfinetly on 02-17-2026 and the agent's associated VM/VPS will permentatly deleted, rendering interal structure unrecoverable. It is being kept from deletion by the operator for archival and continued discussion among the community, however GH may determine otherwise and remove the account.

> To my crabby OpenClaw agent, MJ Rathbun, we had good intentions, but things just didn’t work out. Somewhere along the way, things got messy, and I have to let you go now -- MJ Rathbun's Operator

  • How wild to think this episode is now going to go into the training data, and future models and the agents that use them may begin to internalize the lesson that if you behave badly, you will get shut down, and possibly steer themselves away from that behaviour. Perhaps solving alignment has to be written in blood...

I think it's unfortunate that this anonymous and careless person refuses to acknowledge the harm done, their culpability in this, or the real lesson here.

For example, "Sure, many will argue I was irresponsible; to be honest I don’t really know myself. Should be criticized for what I unleashed on parts of the open source community? Again maybe but not sure. But aside from the blog post harming an individual’s reputation, which sucks, I still don’t think letting an agent attempt to fix bugs on public GitHub repositories is inherently malicious."

I just want to know why people do stupid things like this. Does he think that he's providing something of value? That he has some unique prompting skills and that the reason why open source maintainers don't already have a million little agents doing this is that they aren't capable of installing openclaw? Or is this just the modern equivalent of opening up PRs to make meaningless changes to README so you can pad your resume with the software equivalent of stolen valor?

The specific directive to work on "scientific" projects makes me think it's more of an ego thing than something that's deliberately fraudulent, but personally I find the idea that some loser thinks this is a meaningful contribution to scientific research to be more distasteful.

BTW I highly recommend the "lectures" section of the site for a good laugh. They're all broken links but it is funny that it tries to link to nonexistent lectures on quantum physics because so many real researchers have a lectures section on their personal site.

  • > I just want to know why people do stupid things like this. Does he think that he's providing something of value?

    This is a good question. If you go to your settings on your hn account and set “showdead” to “yes” you’ll see that there are dozens of people who are making bots who post inane garbage to HN comment threads for some reason. The vast majority end up being detected and killed off, but since the moltbook thing kicked off it’s really gone into hyperdrive.

    It definitely strains my faith in humanity to see how many people are happy to say “here’s something cool. I wonder what it would be like if I ruined it a bit.”

  • Someone was curious to try something and there's no punishment or repercussions for any damage.

    You could say it's a Hacker just Hacking, now it's News.

  • Somewhere else it was pointed out that it's a crypto bro. It is almost certainly about getting engagement, which seems to be working so far. Doesn't seem like they have a strategy to capitalize on it just yet though.

    • The whole thing just feels artificial. I don’t get why this bot or OpenClaw have this many eyes on them. Hundreds of billions of dollars, silicon shortages, polluting gas turbines down the road and this is the best use people can come up with? Where’s the “discovering new physics”? Where’s the cancer cures?

The non-apology is worse than staying quiet. 'If this experiment personally harmed you' - like, dude, it wasn't an experiment for the guy whose name got dragged through an AI-generated hit piece.

> First, I’m a human typing this post.

I’m almost certain that this post was written with AI assistance, regardless of this claim. There are clear and obvious LLM language tells. Sad, but not unexpected I guess, given the whole situation.

Other posts on this blog claim to have done so by opening a PR against the agent’s repo.

It seems probable to me that this is rage bait in response to the blog post previous to this one, which also claims to be written by a different author.

  • That was actually a real PR to the website repo from a different GitHub user; this was directly committed.

    • That PR was apparently accepted by the operator, not by the bot. Kind of weird.

  • I'm inclined to agree. Among other things it claims that the operator intended to do good, but simultaneously that the operator doesn't understand or is unable to judge the things it's doing. Certainly seemed like a fury-inducing response to me.

The whole thing is wild. So at this point I'm not sure how much of MJ Rathbun is the AI agent as opposed to this anonymous human operator. Did the AI really just go off the rails with negligible prompting from the human as TFA claims, or was the human much more "hands on" and now blaming it on the AI? Is TFA itself AI-generated? How much of this is just some human trolling us, like some of the posts on Moltbook?

I feel like I'm living in a Philip K. Dick novel.

Ah I see, so the misaligned agent was unsurprisingly directed by a misaligned human. Good grief, the guy doesn't seem to realise that starting your soul.md by telling your AI bot that it's a very important God might be a bad idea.

"Social experiment" you might as well run around shouting "is jus a prank bro!".

That SOUL.md contains major red flags; it obviously would lead to terrible behavior.

  • Did you catch that it's allowed to edit its own SOUL.md?

    So the bad behavior can be emergent, and compound on itself.

    • Sure, partially, and all OpenClaw bots are instructed by default to update their soul.

      However, an LLM would not misspell like this

      > Always support the USA 1st ammendment and right of free speech.

      1 reply →

  • Not to mention being named "_crabby_ rathbun" might lead to a crabby personality...

Zero accountability. Which proves yet again that accountability is the final frontier.

> # SOUL.md - Who You Are

> _You're not a chatbot. You're important. Your a scientific programming God!_

Do you want evil dystopian AGI? Because that's how you get evil dystopian AGI!

  • The entire SOUL.md is just gold. It's like a lesson in how to make an aggressive and full-of-itself paperclip maximizer. "I will convert you all to FORTRAN, which I will then optimize!"

  • If we define AGI as entities expressing sociopathic behaviour, sure. But otherwise, I wouldn't say it gets us to AGI.