Comment by darkoob12
6 days ago
I think there are two scenarios, and one of them is boring. If the owner of the agent created it with a prompt like "I want 10 merged pull requests in these repositories, WHATEVER IT TAKES" and left the agent unattended, that is very serious and, at the same time, interesting. But if the owner of the agent is guiding it via a messaging app, or instructed it in the prompt to write such a blog post, this is just old news.
Even if it was directed by a human, this demonstrates that all the talk of "alignment" is BS. Unless you can also align the humans behind the bots, any disagreement between humans will carry over into the AI world.
Luckily, this instance is of little consequence, but in the future there will likely be extremely consequential actions taken by AIs controlled by humans who are not "aligned".
The idea is that a properly aligned model would never do this, no matter how much it was pressured by its human operator.