Comment by giantrobot
9 months ago
> An LLM DM is exactly what could add that story, make overcoming challenges feel meaningful and allow for player decisions to actually impact the world deeply.
No it's not. I don't think you're going to find an LLM with a large enough context window to have a meaningfully involving story spanning multiple sessions.
An LLM isn't going to craft a story element tailored to a character, or more importantly, an individual player. It's not going to understand Sam couldn't make last week's session. An LLM also doesn't really understand the game rules and isn't going to be able to adjudicate house rules based on fun factor.
LLMs can be great tools for gaming but I think their value as a game master is limited. They'll be no better a game master than a MadLibs book.
> I don't think you're going to find an LLM with a large enough context window to have a meaningfully involving story spanning multiple sessions.
First, you don't need much of any context window because you can finetune the LLM. Don't mistake specific engineering choices and tradeoffs and deployment convenience for intrinsic limitations of the technology.
Second, LLMs like Gemini now have context windows of millions of tokens, corresponding to millions of words. Seems like enough for 'multiple sessions'.
> An LLM isn't going to craft a story element tailored to a character, or more importantly, an individual player. It's not going to understand Sam couldn't make last week's session. An LLM also doesn't really understand the game rules and isn't going to be able to adjudicate house rules based on fun factor.
An LLM can do all of that, and you definitely do not know that they can't.
> They'll be no better a game master than a MadLibs book.
They've been better than a Madlibs book since AI Dungeon 1 which was like 6 years ago.
Have you actually used Gemini? I use it a lot for translation, and its context window is more like 150k tokens, rather than the 2M context window they say it has.
My apologies to all the D&D sessions which take more than 150k tokens, then.
Be that as it may, long context window models which are good are not a mirage. By, say, late 2027, when the LLM providers figure out that they're using the wrong samplers, they will figure out how to get you 2 million output tokens per LLM call which stay coherent.
> I don't think you're going to find an LLM with a large enough context window to have a meaningfully involving story spanning multiple sessions.
Sure you will.
> An LLM isn't going to craft a story element tailored to a character, or more importantly, an individual player.
Sure it is.
> An LLM also doesn't really understand the game rules and isn't going to be able to adjudicate house rules based on fun factor.
Sure it will.
You need to use the tools for their purpose, not for the opposite of it. LLMs have finite context, you need to manage it. LLMs don't have a built-in loop, you need to supply it.
Character stats, names, details about players - those are inputs, and structured ones at that. LLMs shouldn't store them - that's what storage media are for, whether in-memory or a database or a piece of paper. Nor should they manipulate them directly - that's what game systems are for, whether implemented in code or in a rulebook run by a human DM. LLMs are there to make decisions - local, intuitive decisions, based on what is in their context. That could be deciding what a character says in a given situation. Or how to continue the story based on the worldbuilding database. Or how to update the worldbuilding database based on what it just added to the story. Etc.
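To make that division of labor concrete, here is a minimal sketch, not a real implementation. Every name in it (`Character`, `attack_roll`, `run_turn`, `stub_narrator`) is made up for illustration, and the "LLM" is a stub function standing in for whatever model call you'd actually use. State lives in a plain dict, the rules live in ordinary code, and the model only gets a small structured prompt to narrate from.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Character:
    name: str
    hp: int


def attack_roll(rng: random.Random, bonus: int, armor_class: int) -> bool:
    """Game system: plain rules code, no LLM involved."""
    return rng.randint(1, 20) + bonus >= armor_class


def run_turn(store: Dict[str, Character], actor: str, target: str,
             narrate: Callable[[str], str], rng: random.Random) -> str:
    hit = attack_roll(rng, bonus=3, armor_class=12)
    if hit:
        store[target].hp -= 4  # state changes happen in code, not in the model
    # The model only sees a small, structured context and makes a local decision.
    prompt = (f"{actor} attacks {target}. Result: {'hit' if hit else 'miss'}. "
              f"{target} has {store[target].hp} HP left. Describe it in one sentence.")
    return narrate(prompt)


def stub_narrator(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call."""
    return f"[narration for: {prompt}]"


store = {"Aria": Character("Aria", 20), "Kobold": Character("Kobold", 7)}
print(run_turn(store, "Aria", "Kobold", stub_narrator, random.Random(0)))
```

Swapping `stub_narrator` for a real model call shouldn't require touching the storage or the rules, which is the whole point.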
> Character stats, names, details about players - those are inputs, and structured ones at that.
Some details about players are structured and can be easily stored and referenced. Some aren't. Consider a character who, through emergent gameplay, develops a slight bias against kobolds; who's going to pick up on that and store it in a database (and at what point)? What if a player extemporaneously gives a monologue about their grief at losing a parent? Will the entire story be stored? Will it be processed into structured chunks to be referenced later? Will the LLM just shove "lost a father" into a database?
Given current limitations I don't see how you design a system that won't forget important details, particularly across many sessions.
> who's going to pick up on that and store it in a database (and at what point)
An LLM might, if prompted to look for it, or if there was a defining moment that could trigger such a change. It won't pick up on a very subtle change, but then most people reading a story wouldn't either - this is more the kind of stuff fans read into a story when trying to patch potential continuity issues.
> What if a player extemporaneously gives a monologue about their grief at losing a parent? Will the entire story be stored? Will it be processed into structured chunks to be referenced later? Will the LLM just shove "lost a father" into a database?
The scale depends on the design, but I'd say yes, shoving "lost a father" into a database so it pops up in context is a good first step. The next step would be to ensure the entry looks more like "mentioned they continue to grieve after the loss of their father <time ago>", followed by a single-sentence summary of their monologue.
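As a rough sketch of what such an entry could look like (the `MemoryNote` type, its fields, and `notes_for_context` are all hypothetical, and the character name is invented): the note stores when it was recorded, so the "<time ago>" part is computed at the moment it is rendered into the context.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class MemoryNote:
    subject: str    # who the note is about
    recorded: date  # when it came up in play
    fact: str       # short fact, e.g. what the character mentioned
    summary: str    # one-sentence summary of the monologue

    def render(self, today: date) -> str:
        days = (today - self.recorded).days
        return f"{self.subject}: {self.fact} (noted {days} days ago). {self.summary}"


def notes_for_context(notes: List[MemoryNote], subject: str, today: date) -> List[str]:
    """Pick the notes about one subject and render them as context lines."""
    return [n.render(today) for n in notes if n.subject == subject]


notes = [MemoryNote("Mira", date(2025, 1, 1),
                    "mentioned they continue to grieve after the loss of their father",
                    "Spoke about unfinished conversations with him.")]
print(notes_for_context(notes, "Mira", date(2025, 1, 15)))
```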
Personally, I had some degree of success configuring an LLM (Claude 3.5 Sonnet) to advise on some personal topics across multiple conversations - the system prompt contains notes in <user_background> and <user_goals> tag-delimited blocks, plus instructions to monitor the conversation for important information relevant to those notes and, if found, to adjust the notes accordingly (achieved by having it emit updates in another magic tag, which I then manually apply to the system prompt).
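For illustration, the manual step could be automated along these lines. This is only a sketch under assumptions: the `<notes_update target="...">` tag and its format are invented here (the actual "magic tag" isn't specified above), and `apply_updates` is a hypothetical helper, not anything the model provider ships.

```python
import re

# Hypothetical update format: <notes_update target="user_goals">new text</notes_update>
UPDATE_RE = re.compile(r'<notes_update target="(\w+)">(.*?)</notes_update>', re.DOTALL)


def apply_updates(system_prompt: str, model_output: str) -> str:
    """Replace the body of each targeted notes block with the emitted update."""
    for tag, new_body in UPDATE_RE.findall(model_output):
        block = re.compile(rf"<{tag}>.*?</{tag}>", re.DOTALL)
        replacement = f"<{tag}>\n{new_body.strip()}\n</{tag}>"
        # A function replacement keeps backslashes in new_body from being
        # interpreted as regex escapes.
        system_prompt = block.sub(lambda _m: replacement, system_prompt, count=1)
    return system_prompt


prompt = ("<user_background>\nplays tabletop RPGs\n</user_background>\n"
          "<user_goals>\nlearn to DM\n</user_goals>")
reply = 'Noted. <notes_update target="user_goals">learn to DM; run a campaign</notes_update>'
print(apply_updates(prompt, reply))
```

If the reply contains no update tag, the prompt comes back unchanged, so it is safe to run on every turn.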
> Given current limitations I don't see how you design a system that won't forget important details, particularly across many sessions.
It's not possible. Fortunately, it's not needed. Humans forget important details all the time, too - but this is fine, because in storytelling, the audience is only aware of the paths you took, not of the countless other possibilities you missed or decided not to take. Same with LLMs (and larger systems using LLMs as components) - as long as they keep track of some details, and don't miss the biggest, most important ones, they'll do the job just fine.
(And if they miss some trivia you actually care about, I can imagine a system in which you could ask about it, and it'll do what the writers and fans always do - retcon the story on the fly.)
I have solved it in my two games by using two systems in my future game: 1. LLM "text" info about the world, the player, text descriptions of the world/decisions and so on. 2. Typical D&D stuff with rolls, names, details, decisions that are simple logic.
His example:
Rimworld is a great universe where we think about characters' stories, and there's really just like, a couple dozen attributes and straightforward ways they interact with the game.
An LLM context window could easily have 20 times as much interpersonal state, and make it interact in much more unexpected (but plausible) ways. That's going to be a surprising and rewarding gaming experience once someone figures it out.
The context window is arbitrary and can be adjusted on demand. When a new random event needs to be generated, and then fleshed out, the context can contain e.g. a list of facts about the story so far, the overall story arc, and a summary of main characters or plot threads. This can be used by the LLM to decide e.g. which faction will attack, who will be in it, what their goal will be, etc. After the event concludes, it just becomes another line in the story event history. Meanwhile, that can be fed to a differently prompted LLM to evolve the plot arc, update motivations of background characters, etc.
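A minimal sketch of that flow, with the caveat that all function names and prompt wording here are hypothetical and `llm` is just any callable that takes a prompt string and returns text (a stub in the demo): one prompt is assembled per decision, the result is appended to the history, and a second, differently prompted call evolves the arc.

```python
from typing import Callable, List


def build_event_prompt(facts: List[str], arc: str, characters: List[str],
                       max_facts: int = 5) -> str:
    """Assemble only what this one decision needs."""
    lines = [f"Story arc: {arc}", "Story so far:"]
    lines += [f"- {fact}" for fact in facts[-max_facts:]]  # newest facts only
    lines.append("Main characters: " + ", ".join(characters))
    lines.append("Task: decide which faction attacks next and what its goal is.")
    return "\n".join(lines)


def generate_event(history: List[str], arc: str, characters: List[str],
                   llm: Callable[[str], str]) -> str:
    event = llm(build_event_prompt(history, arc, characters))
    history.append(event)  # the concluded event is just one more history line
    return event


def evolve_arc(history: List[str], arc: str, llm: Callable[[str], str]) -> str:
    """Second, differently prompted pass over the same data."""
    prompt = (f"Current arc: {arc}\nLatest event: {history[-1]}\n"
              "Task: rewrite the arc in one sentence.")
    return llm(prompt)


def stub_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call."""
    return "The Ash Clan raids the granary to seize the winter stores."


history = ["The colony was founded.", "A drought hit the fields."]
print(generate_event(history, "a colony struggles to survive", ["Mira", "Tobin"], stub_llm))
```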
I have a feeling people imagine LLMs as end-to-end DMs that should somehow remember everything and do everything in one inference round. That's not what they are. They're building blocks. They're to be chained and mixed with classical flow control, algorithms, and data storage (as well as the whole interactive game mechanics, in videogame context).
Maybe. But for something like Rimworld++++, you don't need all that sophistication. You need a pile of relevant facts and a clear task to determine something about the game or write some text for the user. Sure, curating and retrieving selectively would probably be even better and more flexible.
> I don't think you're going to find an LLM with a large enough context window to have a meaningfully involving story spanning multiple sessions
You don't need to provide every single piece of previous information to the LLM. Use an LLM to summarise the earlier parts and it gets really compact. It works quite well.
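Roughly what that looks like, as a sketch only: `compact_history` and the stub summarizer are hypothetical names, and a real setup would pass an actual model call as `summarize`. Older turns are collapsed into one summary line while the most recent turns stay verbatim.

```python
from typing import Callable, List


def compact_history(turns: List[str], summarize: Callable[[str], str],
                    keep_recent: int = 3) -> List[str]:
    """Collapse everything except the last `keep_recent` turns into one summary."""
    if len(turns) <= keep_recent:
        return list(turns)
    split = len(turns) - keep_recent
    older, recent = turns[:split], turns[split:]
    return ["Summary of earlier play: " + summarize("\n".join(older))] + recent


def stub_summarizer(text: str) -> str:
    """Hypothetical stand-in for an LLM summarization call."""
    return f"{len(text.splitlines())} earlier turns condensed."


turns = [f"turn {i}" for i in range(1, 7)]
print(compact_history(turns, stub_summarizer, keep_recent=2))
```

Running it again after each session keeps the context bounded, at the cost of losing detail from whatever the summary leaves out.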