Comment by redcobra762

2 years ago

That’s not right: many vectors between unrelated tokens are never built at all. Creating a ton of empty relationships would obviously generate an immense amount of useless data.

Your links aren’t actually about orthogonal vectors, so they’re not relevant. Also, that’s not how superposition is defined in your own links:

> In this paper, we use toy models — small ReLU networks trained on synthetic data with sparse input features — to investigate how and when models represent more features than they have dimensions. We call this phenomenon superposition
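
To make the "more features than dimensions" part concrete, here is a rough sketch. It is not the paper's toy model, just random directions, and the sizes and the names `random_unit_vector` and `max_abs_cosine` are made up for illustration. A 64-dimensional space has room for at most 64 mutually orthogonal directions, so 256 feature directions can only ever be nearly orthogonal to each other, never exactly:

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a random direction in `dim` dimensions and scale it to length 1."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def max_abs_cosine(vectors):
    """Largest |cosine similarity| over all distinct pairs of unit vectors."""
    worst = 0.0
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            # For unit vectors the dot product is the cosine of the angle.
            dot = sum(a * b for a, b in zip(vectors[i], vectors[j]))
            worst = max(worst, abs(dot))
    return worst


# Illustrative sizes only (assumed, not taken from the paper).
rng = random.Random(0)
dim, n_features = 64, 256

features = [random_unit_vector(dim, rng) for _ in range(n_features)]
worst = max_abs_cosine(features)

print(f"{n_features} feature directions in {dim} dimensions")
print(f"largest |cosine| between any two of them: {worst:.3f}")
```

A value of 0 would mean every pair is exactly orthogonal, which is impossible once there are more directions than dimensions. What you get instead is some overlap between every pair, and that overlap is the interference the quoted definition is pointing at.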