
Comment by kelseyfrog

4 days ago

Last year, during a period of recovery, I hosted a ChatGPT game with my nieces and nephew. I played the role of a bridge between them and the LLM: I read the text aloud and translated their actions into queries.

With slight nudges, GPT performed well. It excelled at NPC dialog, location description, and, to some extent, improvisation. However, the campaign eventually fell apart in session four or five. A disagreement arose between the players over exactly how many maps we had acquired. As you know, answering that question is a needle-in-a-haystack problem, and asking the LLM elicited conflicting responses. Ultimately I concluded that pausing the campaign was the right choice, and accepted that we had all still had an amazing time up to that point.

Which leads me back to the questions in the paper:

> We believe that with the above evaluation methods we can try to answer the following questions:
> - How consistent is a LLM Agent in its generative dialogue?
> - How well can a LLM Agent keep a user engaged in a narrative plot?
> - How creative is a LLM Agent when generating complex story driven narratives?

None of these are exactly relevant to my personal experience running a TTRPG game using ChatGPT, or to what makes it succeed or fail. What would make such a system an improvement is the ability to store and retrieve facts about players (HP, stats, inventory), NPCs (names, location, history), and places (items, descriptions, inhabitants) in order to solve the needle-in-a-haystack problem. Creating a world ontology and being able to CRUD facts against that ontology would be a game changer (literally).
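A minimal sketch of what such a fact store might look like, assuming a simple in-memory dict; the entity kinds and field names here are illustrative, not from any real system:

```python
# Illustrative campaign fact store with CRUD operations.
# Entity kinds ("player", "npc", "place") and attributes are hypothetical.

class FactStore:
    def __init__(self):
        # facts[kind][name] -> dict of attributes
        self.facts = {"player": {}, "npc": {}, "place": {}}

    def create(self, kind, name, **attrs):
        self.facts[kind][name] = dict(attrs)

    def read(self, kind, name):
        return self.facts[kind].get(name)

    def update(self, kind, name, **attrs):
        self.facts[kind][name].update(attrs)

    def delete(self, kind, name):
        self.facts[kind].pop(name, None)

# "How many maps do we have?" becomes a lookup, not a memory test:
store = FactStore()
store.create("player", "Ari", hp=12, inventory=["torch"])
store.update("player", "Ari", inventory=["torch", "map", "map"])
maps = store.read("player", "Ari")["inventory"].count("map")  # 2
```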

Despite the ultimate demise of the game, the experiment was a success. I'd highly encourage anyone with a laptop, a handful of dice, and a group of interested friends to give it a shot. It's fun, memorable, and a unique way to pass the time and bond.

> What would make such a system an improvement is the ability to store and retrieve facts about players (HP, stats, inventory), NPCs (names, location, history), and places (items, descriptions, inhabitants) in order to solve the needle-in-a-haystack problem. Creating a world ontology and being able to CRUD facts against that ontology would be a game changer (literally).

Curiously, those are exactly the things LLMs can't do well, and aren't really supposed to. This is all state that you (or an external system) should keep track of and supply to the LLM in context, "just in time". That is to say, LLMs are an important component, but alone they won't work. They need their tools :).

In short, LLMs are pure "system 1"; the complete solution still needs a "system 2", which fortunately is more amenable to classical computing approaches.
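A sketch of that "just in time" pattern: classical code owns the authoritative state and injects only the relevant facts into each prompt, so the model never has to remember anything. The state keys and prompt wording below are made up for illustration, and the actual LLM call is left out:

```python
# Sketch: the external "system 2" owns the state; the LLM only ever sees a
# fresh, authoritative snapshot injected into its context each turn.
# The state keys and prompt format are illustrative assumptions.

world_state = {
    "party_maps": 3,
    "location": "the sunken library",
    "npcs_present": ["Archivist Venn"],
}

def build_prompt(player_action: str) -> str:
    # Serialize the current facts so the model treats them as ground truth.
    facts = "\n".join(f"- {k}: {v}" for k, v in world_state.items())
    return (
        "You are the game master. Treat these facts as ground truth:\n"
        f"{facts}\n\n"
        f"Player action: {player_action}\n"
        "Narrate the outcome; do not contradict the facts above."
    )

prompt = build_prompt("I check how many maps we're carrying.")
```

Game mechanics (inventory updates, dice results) would mutate `world_state` in ordinary code before the next turn's prompt is built, which is what keeps the "how many maps?" question consistent across sessions.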