Comment by viraptor
6 hours ago
If you want to really test this, search/replace the names with your own random ones and see if it lists those.
Otherwise, LLMs have most of the books memorised anyway: https://arstechnica.com/features/2025/06/study-metas-llama-3...
Being that it has the books memorized (huh, just learned another US/UK spelling quirk), I would suppose feeding it the books with altered spells would get you a confused mishmash of data in the context and data in the weights.
Couldn't you just ask the LLM which 50 (or 49) spells appear in the first four Harry Potter books without the data for comparison?
It's not going to be as consistent. It may get bored of listing them (you know how you can ask for many examples and get 10 in response?), or omit some minor ones for other reasons.
By replacing the names with something unique, you'll get much more certainty.
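A minimal sketch of the search/replace idea, assuming you already have a list of the spell names you want to hide. The function names, the nine-letter random tokens, and the example spells below are illustrative choices, not anything from the original experiment:

```python
import random
import re
import string


def make_fake_name(rng, length=9):
    # Random letters, capitalized: unlikely to collide with any real word.
    letters = (rng.choice(string.ascii_lowercase) for _ in range(length))
    return "".join(letters).capitalize()


def rename_spells(text, spells, seed=0):
    """Replace each known spell name with a unique random token.

    Returns the rewritten text and the original -> replacement mapping.
    """
    rng = random.Random(seed)
    mapping = {}
    used = set()
    for spell in spells:
        fake = make_fake_name(rng)
        while fake in used:  # keep the replacements unique
            fake = make_fake_name(rng)
        used.add(fake)
        mapping[spell] = fake
    if not mapping:
        return text, mapping

    # Longest names first so multi-word spells win over their parts.
    names = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(n) for n in names) + r")\b",
        flags=re.IGNORECASE,
    )
    lookup = {original.lower(): fake for original, fake in mapping.items()}
    new_text = pattern.sub(lambda m: lookup[m.group(0).lower()], text)
    return new_text, mapping
```

Keep the returned mapping around: it is what lets you check afterwards whether the model listed your made-up names or the originals. Note that a multi-word spell split across a line break would not be matched by this simple pattern.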
That might not work well, but by navigating to a very Harry Potter-dominant part of latent space by preconditioning on the books, you make it more likely to get good results. An example would be taking a base model and prompting "what follows is the book 'X'": it may or may not regurgitate the book correctly. Give it a chunk of the first chapter and let it regurgitate from there, and you tend to get fairly faithful recovery, especially for things on Gutenberg.
So it might be there: by preconditioning latent space to the area of the Harry Potter world, you make it much more probable that the full spell list is regurgitated from online resources that were also read, while asking naively might get it sometimes and sometimes not.
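One rough way to measure "faithful recovery" is to feed a prefix of a text and compare the continuation against what actually comes next. This is only a sketch: `generate` is a hypothetical callable standing in for whatever base model you use (prompt in, continuation out), and the character counts are arbitrary defaults.

```python
from difflib import SequenceMatcher


def continuation_similarity(generate, book_text, prefix_chars=2000, target_chars=500):
    """Score how closely a model continues a text it may have memorized.

    `generate` is assumed to take a prompt string and return the
    model's continuation as a string. Returns a ratio in [0, 1].
    """
    prefix = book_text[:prefix_chars]
    target = book_text[prefix_chars:prefix_chars + target_chars]
    # Compare only as much output as there is reference text.
    produced = generate(prefix)[:len(target)]
    return SequenceMatcher(None, produced, target, autojunk=False).ratio()
```

A score near 1 with a chapter-sized prefix and a much lower score with only a "what follows is the book 'X'" prompt would fit the preconditioning story, though it says nothing about why.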
The books act like a hypnotic trigger, and may not represent a generalized skill. Hence why replacing with random words would help clarify: if you still get the original spells, regurgitation is confirmed; if it finds the replaced spells, it could be doing what we think. An even better test would be to replace all spell references AND jumble the chapters around. This way it can't even "know" where to "look" for the spell names from training.
Btw, it recalled 42 when I asked (without web search).
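A sketch of that stricter test, assuming the book is already split into a list of chapter strings and that `mapping` is a dict from each original spell name to the random replacement you substituted for it. Both helper names are made up for illustration:

```python
import random
import re


def shuffle_chapters(chapters, seed=0):
    # Return a shuffled copy so the original chapter order is untouched.
    rng = random.Random(seed)
    shuffled = list(chapters)
    rng.shuffle(shuffled)
    return shuffled


def _mentions(answer, name):
    # Whole-word, case-insensitive match (so "Nox" does not hit "obnoxious").
    return re.search(r"\b" + re.escape(name) + r"\b", answer, re.IGNORECASE) is not None


def classify_answer(answer, mapping):
    """Split the names in a model's answer by where they must have come from.

    Original names are not in the altered text, so they point at the
    weights; replacement names can only have come from the context.
    """
    regurgitated = sorted(o for o in mapping if _mentions(answer, o))
    found_in_context = sorted(f for f in mapping.values() if _mentions(answer, f))
    return {"regurgitated": regurgitated, "found_in_context": found_in_context}
```

An answer that is mostly `regurgitated` suggests recall from training; one that is mostly `found_in_context` suggests the model actually read the altered text.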
full transcript: pastebin.com/sMcVkuwd
Not sure how they're being counted, but that adds up to 46 with the pair spells counted separately. But then nox is counted twice, so maybe 45.
No, because you don't know the magic spell (forgive me) of context that can be used to "unlock" that information if it's stored in the NN.
I mean, you can try, but it won't be a definitive answer as to whether that knowledge truly exists or doesn't exist as it is encoded into the NN. It could take a lot of context from the books themselves to get to it.