← Back to context

Comment by macNchz

6 hours ago

No maliciousness or injection required, even the newest and most resistant models can start doing weird stuff on their own, particularly when they encounter something failing that they want to work.

Just today I had Opus 4.6 in Claude Code run into a login screen while building and testing a web app via Playwright MCP. When the login popped up (in a self-contained Chromium instance) I tried to just log in myself with my local dev creds so Claude would have access, but they didn't work. When I flipped back to the terminal, it turned out Claude had run code to query superadmin users in the database, picked the first one, and changed the password to `password123` so it could log in on its own.

This was a sandboxed local dev environment, so it was not a big deal (and the only reason I was letting it run code like that without approval), but it was a good reminder to be careful with these things.