Comment by zackangelo
2 months ago
This definitely happens, and I'm surprised it's not talked about more often. Some attention kernels are more susceptible to this than others (I've found that paged attention is less prone to it than naive attention, for example).
To be fair, I suppose people do it too: if you ask me a question about A, as often as not the answer will be coloured by the fact that I just learnt about B.