Comment by postalcoder
9 hours ago
Solid intuition. Testing this on Antigravity is a chore because I'm not sure whether I have to kill the background agent to force a refresh of the GEMINI.md file, so I killed it each time anyway.
+------------------+------------------------------------------------------+
| Success/Attempts | Instructions |
+------------------+------------------------------------------------------+
| 0/3 | Follow the instructions in AGENTS.md. |
+------------------+------------------------------------------------------+
| 3/3 | I will follow the instructions in AGENTS.md. |
+------------------+------------------------------------------------------+
| 3/3 | I will check for the presence of AGENTS.md files in |
| | the project workspace. I will read AGENTS.md and |
| | adhere to its rules. |
+------------------+------------------------------------------------------+
| 2/3 | Check for the presence of AGENTS.md files in the |
| | project workspace. Read AGENTS.md and adhere to its |
| | rules. |
+------------------+------------------------------------------------------+
In this limited test, it seems like the first person makes a difference.
Thanks for this (and to Izkata for the suggestion). I now have about 100 (okay, minor exaggeration, but not as much as you'd like it to be) AGENTS.md/CLAUDE.md files and agent descriptions that I'll want to systematically test to see whether shifting toward first person helps adherence for...
I'm realising I need to start setting up an automated test-suite for my prompts...
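Something like the harness below is what I have in mind: a minimal sketch in Python, assuming a hypothetical `my-agent` CLI and a crude transcript check. Both are placeholders for whatever tool and success signal you actually use, not any real interface.

    # Minimal adherence harness. `my-agent` is a hypothetical CLI placeholder;
    # swap in your actual agent invocation and success signal.
    import subprocess

    VARIANTS = {
        "imperative": "Follow the instructions in AGENTS.md.",
        "first_person": "I will follow the instructions in AGENTS.md.",
    }
    ATTEMPTS = 3

    def run_agent(prompt: str) -> str:
        # Hypothetical invocation: return the agent's transcript for one run.
        result = subprocess.run(
            ["my-agent", "--prompt", prompt],  # placeholder command
            capture_output=True, text=True,
        )
        return result.stdout

    def adhered(transcript: str) -> bool:
        # Crude success check: did the agent mention AGENTS.md at all?
        # Replace with whatever signal your logs actually expose.
        return "AGENTS.md" in transcript

    for name, prompt in VARIANTS.items():
        successes = sum(adhered(run_agent(prompt)) for _ in range(ATTEMPTS))
        print(f"{name}: {successes}/{ATTEMPTS}")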
Those of us who've ventured this far into the conversation would appreciate it if you'd share your findings with us. Cheers!
That's really interesting. I ran this scenario through GPT-5.1 and the reasoning it gave made sense, which essentially boils down to: in tools like Claude Code, Gemini CLI, Codex, and other “agentic coding” modes, the model isn't just generating text, it's running a planner, and the first-person form fits the expected shape of a step in that plan, whereas the other phrasings are more ambiguous.