Comment by redcobra762
2 years ago
There's got to be a probability cut-off, though. LLMs don't connect every single token with every other token; some aren't connected at all, even if some association is taught, right?
The weights have finite precision, which means they effectively represent value ranges / have error bars. So even if a weight is exactly 0, that does not represent complete confidence that the association will never occur.
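A minimal sketch of the related point about there being no hard cut-off (my own illustration, not from the thread): with a standard softmax over logits, even a wildly unlikely next token gets a small nonzero probability, unless the sampler itself imposes a cut-off (e.g. top-k or top-p). The logit values below are made up for illustration.

    # Illustrative sketch: softmax never assigns exactly zero probability,
    # so "very unlikely" is not the same as "impossible".
    import numpy as np

    logits = np.array([10.0, 2.0, -15.0])   # hypothetical scores for 3 tokens
    probs = np.exp(logits - logits.max())   # subtract max for numerical stability
    probs /= probs.sum()

    print(probs)            # roughly [9.997e-01, 3.35e-04, 1.4e-11]
    print(probs.min() > 0)  # True: tiny, but never exactly zero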
A weight necessitates a relationship, but I’m arguing LLMs don’t create all relationships. So a connection wouldn’t even exist.
When relationships are represented implicitly by the magnitude of the dot product between two vectors, there's no particular advantage to not "creating" all relationships (i.e. enforcing orthogonality for "uncreated" relationships).
On the contrary, by allowing vectors for unrelated concepts to be only almost orthogonal, it's possible to represent a much larger number of unrelated concepts. https://terrytao.wordpress.com/2013/07/18/a-cheap-version-of...
In machine learning, this phenomenon is known as polysemanticity or superposition https://transformer-circuits.pub/2022/toy_model/index.html
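A quick numerical sketch of the near-orthogonality point (my own, not taken from the linked posts; the dimension and count are arbitrary): random unit vectors in a high-dimensional space have pairwise dot products that are small but essentially never exactly zero, and you can pack far more of them than the dimension allows for strictly orthogonal vectors.

    # Illustrative sketch: 4096 random "concept" vectors in only 512 dimensions
    # are all almost orthogonal to each other, but none are exactly orthogonal.
    import numpy as np

    rng = np.random.default_rng(0)
    d, n = 512, 4096
    vecs = rng.standard_normal((n, d))
    vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)  # unit vectors

    dots = vecs @ vecs.T
    np.fill_diagonal(dots, 0.0)          # ignore self-similarity

    print(np.abs(dots).max())    # typically around 0.25: small overlap, not 0
    print(np.abs(dots).mean())   # typically around 0.035 (~1/sqrt(d))

So with 8x more vectors than dimensions, the worst-case overlap is still modest, which is the kind of capacity gain the Tao post and the superposition paper are describing.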