Comment by frabcus

3 days ago

They do have something of an internal model of arithmetic, with lookup tables and separate treatment of digits. I'm conscious you might have seen this already and not interpret it that way, but in case you haven't, section 6 on addition in this Anthropic interpretability paper goes into it.

https://transformer-circuits.pub/2025/attribution-graphs/bio...

Keep in mind that is a basic level of understanding of what is going on in quite a small model (Claude 3.5 Haiku). We don't know what is happening inside larger models.
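
To make "lookup tables and separate treatment of digits" concrete, here is a toy sketch in Python. It is only an analogy for the phrase, not a reconstruction of the circuits the paper describes, and every name in it (TABLE, add_by_digits) is made up for illustration.

```python
# Toy analogy only: this is NOT the model's actual mechanism.
# All names here are hypothetical and chosen for illustration.

# Memorised table: (digit, digit) -> (carry, resulting digit).
TABLE = {(x, y): divmod(x + y, 10) for x in range(10) for y in range(10)}


def add_by_digits(a: int, b: int) -> int:
    """Add two non-negative integers using only per-digit table lookups."""
    da = [int(c) for c in reversed(str(a))]  # least significant digit first
    db = [int(c) for c in reversed(str(b))]
    carry = 0
    out = []
    for i in range(max(len(da), len(db))):
        x = da[i] if i < len(da) else 0
        y = db[i] if i < len(db) else 0
        c1, d = TABLE[(x, y)]      # look up the digit pair
        c2, d = TABLE[(d, carry)]  # fold in the incoming carry (0 or 1)
        carry = c1 + c2            # at most one of c1, c2 can be 1
        out.append(d)
    if carry:
        out.append(carry)
    return int("".join(str(d) for d in reversed(out)))


print(TABLE[(6, 9)])          # (1, 5): "something ending in 6 plus something ending in 9 ends in 5"
print(add_by_digits(36, 59))  # 95
```

The point of the sketch is just that the ones digit can come from a memorised table rather than from "doing the sum", with each digit position handled separately.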