Comment by eterm

13 hours ago

Reading more I'm a little disappointed that the write-up has seemingly leant so heavily on LLMs too, because it detracts credibility from the study itself.

Fair point. The core simulation and data collection was done programmatically - 162 games, raw logs, win rates. The analysis of gaslighting phrases and patterns was human-reviewed. I used LLMs to help with the landing page copy, which I should probably disclose more clearly. The underlying data and methodology is solid, you can check it here: https://github.com/lout33/so-long-sucker