Comment by xiomrze

5 hours ago

Honest question, how do you know if it's pulling from context vs from memory?

If I use Opus 4.6 with Extended Thinking (Web Search disabled, no books attached), it answers with 130 spells.

One possible trick could be to search and replace all the spell names with nonsense alternatives, then see if it extracts those.

  • That might actually boost performance since attention pays attention to stuff that stands out. If I make a typo, the models often hyperfixate on it.
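A rough sketch of that search-and-replace check, in case it helps. Everything here is an assumption for illustration: the helper names `swap_names` and `classify_answer` are made up, the list of spell names has to come from you, and the model call itself is left out. The idea is just that a nonsense token in the answer can only have come from the context, while an original name in the answer can only have come from memory.

```python
import random
import re
import string


def make_nonsense(rng, length=8):
    """Return a random capitalized token made of lowercase letters."""
    letters = (rng.choice(string.ascii_lowercase) for _ in range(length))
    return "".join(letters).capitalize()


def swap_names(text, names, seed=0):
    """Replace each known name in `text` with a unique nonsense token.

    Returns the rewritten text and the mapping original -> nonsense.
    Hypothetical helper: matching is case-sensitive and whole-word only.
    """
    if not names:
        return text, {}
    rng = random.Random(seed)  # fixed seed so the mapping is reproducible
    mapping, used = {}, set()
    for name in names:
        token = make_nonsense(rng)
        while token in used:
            token = make_nonsense(rng)
        used.add(token)
        mapping[name] = token
    # Longest names first, so multi-word names win over shorter overlaps.
    ordered = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, ordered)) + r")\b")
    return pattern.sub(lambda m: mapping[m.group(0)], text), mapping


def classify_answer(answer, mapping):
    """Split names by where the model apparently got them.

    from_context: originals whose nonsense replacement shows up in the answer.
    from_memory:  originals that show up verbatim, although they were removed.
    """
    from_context = [orig for orig, fake in mapping.items() if fake in answer]
    from_memory = [
        orig for orig in mapping
        if re.search(r"\b" + re.escape(orig) + r"\b", answer)
    ]
    return from_context, from_memory
```

You would feed the swapped text to the model, ask for the list of spells, and pass its reply to `classify_answer`. Anything that lands in `from_memory` was recalled rather than read, since that name no longer appears in the context.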

Exactly, there was a study where they tried to make an LLM reproduce an HP book word for word, e.g. by giving it the first sentences and letting it cook.

Basically, with some tricks they managed to get it 99% word for word. The tricks were needed to bypass the safeguards that are in place precisely to stop people from retrieving training material.

When I tried it without web search, so only internal knowledge, it missed ~15 spells.