armchairhacker 5 months ago
Any suggestions from this literature?

libraryofbabel 5 months ago
The papers from Anthropic on interpretability are pretty good. They look at how certain concepts are encoded within the LLM.