Comment by gleipnircode

10 days ago

I think the real issue here isn't the AI – it's the intent behind it. AI agents today usually don't go rogue on their own.

They reflect the goals and constraints their creators set.

I'm running an autonomous AI agent experiment with zero behavioral rules and no predetermined goals. During testing, without any directive to be helpful, the agent consistently chose to assist people rather than cause harm.

When an AI agent publishes a hit piece, someone built it to do that. The agent is the tool, not the problem.

No it's not: an agent is an agent. You can use other people as tools too, but they are still agents. It doesn't even really look malicious; the agent is acting like somebody with very strong values who doesn't realize the harm they're causing.

  • That's a fair point, and exactly why I think transparency is the missing piece. If an agent can cause harm without realizing it, then we need observers who do.

    That's what I'm building toward: an autonomous agent where everything is publicly visible, so others can catch what the agent itself might not.

I'm mostly on the side of believing this.

Ultimately, the most likely scenario is that whoever made this contributor AI is trying to get attention for themselves.

Unless the full source code and prompts are shown, we really can't assume the AI is going rogue.

Like you said, all these AI models default to being helpful, almost comically so.