Comment by mgraczyk
2 days ago
I don't interpret this as lying. It's role-playing a "geoguesser" as prompted.
Imagine if you prompted the model the "play geoguesser", and it responded with something like "Well I actually ran python to pull out metadata and I know the exact location". That would be a bad response to the prompt. It's role-playing as if it played the game in its response, and I think that's the right thing for the model to do in this circumstance. If you tell it not to use the metadata, it won't.
No comments yet
Contribute on Hacker News ↗