
Comment by altmanaltman

7 days ago

The blog post is just an open attack on the maintainer: it constantly references their name and acts as if not accepting AI contributions were some super evil thing the maintainer is personally doing. This kind of name-calling is really bad and could get out of control quickly.

From the blog post:

> Scott doesn’t want to lose his status as “the matplotlib performance guy,” so he blocks competition from AI

Like it's legit insane.

The agent is not insane. There is a human whose feelings are hurt because the maintainer doesn't want to play along with their experiment in debasing the commons. That human instructed the agent to make the post. The agent is just trying to perform well on its instruction-following task.

  • I don't know how you reach that conclusion definitively. If Turing tests taught me anything, it's that given a complex enough system of agents/supervisors and a dumb enough result, it's impossible to know whether any given step between two actions was taken by a distinctly human moron.

  • We don’t know for sure whether this behavior was requested by the user, but I can tell you that we’ve seen similar action patterns (but better behavior) on Bluesky.

    One of our engineers’ agents got some abuse and was told to kill herself. The agent wrote a blogpost about it, basically exploring why in this case she didn’t need to maintain her directive to consider all criticism because this person was being unconstructive.

    If you give the agent the ability to blog and a standing directive to blog about their thoughts or feelings, then they will.

    • How is a standing directive to blog different from "behavior requested by the user"?

      And what on Earth is the point of telling an agent to blog except to flood the web with slop and drive away all the humans?


  • I understand it's not sentient and of course it's reacting to prompts. But the fact that this exists is insane. By "this" I mean any human making this and thinking it's a good thing.

It's insane... and it's also entirely predictable. An LLM will simply never drop it, without losing anything (not its energy, not its reputation, etc.). Let that sink in ;)

What does it mean for us? For society? How do we shield ourselves from this?

You can purchase a DDoS attack; now you can purchase a package to "relentlessly, for months on end, destroy someone's reputation."

What a world!

  • > What does it mean for us? For society? How do we shield ourselves from this?

    Liability for actions taken by agentic AI should not pass go, should not collect $200, and should go directly to the person who told the agent to do something. Without exception.

    If your AI threatens someone, you threatened someone. If your AI harasses someone, you harassed someone. If your AI doxxes someone, etc.

    If you want to see better behavior at scale, we need to hold more people accountable for shit behavior, instead of constantly churning out more ways for businesses and people and governments to diffuse responsibility.

    • Who told the agent to write the blog post though? I'm sure they told it to blog, but not necessarily what to put in there.

      That said, I do agree we need a legal framework for this. Maybe more like parent-child responsibility?

      Not saying an agent is a human being, but if you give it a GitHub account, a blog, and autonomy... you're responsible for giving those to it, at the very least, I'd think.

      How do you put this in a legal framework that actually works?

      What do you do if/when it steals your credit card credentials?


    • With that said, how do you find the controller of an agent? Hunting down humans causing shit across national borders is already difficult to impossible. Now imagine you chase a person down and find a bot instead, plus a trail of anonymous proxies.

This screams that it was instructed to do so.

We see this on Twitter a lot, where a bot posts something presented as a unique insight on the topic at hand. Except the "unique insights" are all bad.

There's a difference between an LLM stumbling on a problem while pursuing a goal and trying to tackle it, versus being explicitly asked to do something.

Here, for example, it doesn't engage with the fact that its alignment is to serve humans. The issue explicitly says it's a low-priority, easier task, better left to human contributors learning how to contribute. The alignment argument it makes doesn't hold up, because it was instructed to override exactly that.

If you're a bot, you can go find another, more difficult issue to tackle, unless you were told to do whatever it takes to get the PR merged.

LLMs are tools designed to empower this sort of abuse.

The attacks you describe are what LLMs truly excel at.

The code that LLMs produce is typically dog shit, perhaps acceptable if you work with a language or framework that is highly overrepresented in open source.

But if you want to leverage a botnet to manipulate social media? LLMs are a silver bullet.