Comment by tehjoker
18 days ago
The part about Claude's wellbeing is interesting but a little confusing. They say they interview models about their experiences during deployment, but models currently do not have long-term memory. A model can summarize what happened based on logs (to a degree), but that's still quite hazy compared to what they intend to achieve.
you can snapshot layer activations any time you want...