Comment by khafra
2 months ago
LLM interpretability also uses sparse autoencoders (SAEs) to find concept representations (https://openai.com/index/extracting-concepts-from-gpt-4/), and, more recently, linear probes.
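To make the linear-probe idea concrete, here is a minimal sketch: a logistic-regression probe trained on frozen "activations" to detect a binary concept. Everything here is synthetic and illustrative (the activations are random vectors with the concept injected along one direction); a real probe would use hidden states extracted from an LLM layer instead.

```python
import numpy as np

# Synthetic setup (assumption for illustration): activations are random
# vectors, and the concept is encoded as a shift along one fixed direction.
rng = np.random.default_rng(0)
d = 64                      # activation dimensionality
n = 2000                    # number of examples
concept_dir = rng.normal(size=d)
concept_dir /= np.linalg.norm(concept_dir)

labels = rng.integers(0, 2, size=n)                 # 1 = concept present
acts = rng.normal(size=(n, d))
acts += np.outer(labels * 2.0 - 1.0, concept_dir)   # shift along concept axis

# The probe itself: logistic regression fit by plain gradient descent.
# The base model's weights are never touched; only w and b are trained.
w, b = np.zeros(d), 0.0
lr = 0.5
for _ in range(200):
    z = acts @ w + b
    p = 1.0 / (1.0 + np.exp(-z))
    grad_w = acts.T @ (p - labels) / n
    grad_b = np.mean(p - labels)
    w -= lr * grad_w
    b -= lr * grad_b

acc = np.mean(((acts @ w + b) > 0) == labels)
print(f"probe accuracy: {acc:.3f}")
```

High probe accuracy is evidence that the concept is linearly readable from that layer's activations, which is the basic claim linear probing is used to test.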