← Back to context

Comment by krupan

3 hours ago

I ssk an AI to play hangman with me and looked at it's reasoning. It didn't just pick a secret word and play a straightforward game of hangman. It continually adjusted the secret word based on the letters I guessed, providing me the "perfect" game of hangman. Not too many of my guesses were "right" and not too many "wrong" and I after a little struggle and almost losing, I won in the end.

It wasn't a real game of hangman, it was flat out manipulation, engagement farming. Do you think it's possible that AI does that in any other situations?

The reasoning generally isn't kept in the context, so after choosing the secret word in the first reasoning block, the LLM will have completely forgotten it in the second and subsequent requests.

So, it technically didn't change the secret word so much as it was trying to infer what its own secret word might have been, based on your guesses.

  • Exactly. The following will work, assuming you're using a model and frontend that supports it:

    > Let's play hangman. Just pick a 3 letter word for now, I want to make sure this works. Pick the secret word up front and make sure to write the secret word and game state in a file that you'll have access to for the rest of the session, since you won't remember what word you chose otherwise.

    This was Opus 4.6 in Claude desktop, fwiw.

    Note: I didn't bother experimenting with whether it worked without me explicitly telling it that it should record the game state to a file.

    • What you can do is to instruct it to type out the word, in some language that you don't know at all, making it available in the context while also effectively hidden from you. Simpler than printing it to a file.

    • On further experimentation, I prompted Opus 4.6 to make me a frontend artifact that used the Anthropic API, and I confirmed that it worked as expected.

      Here is the only relevant part of the prompt it used when calling the API endpoint:

      > - Track the conversation to remember your word and previous guesses