
Comment by Intrinisical-AI

2 days ago

Wow! You brought up several deep ideas that deserve unpacking step by step (as if we were LLMs):

- On the manifold being “high-dimensional” (e.g., 2999): I get your intuition; the set of valid linguistic sequences is tiny relative to the space of all possible strings, yet still enormously rich and varied. So the valid set doesn’t fill the whole space, but it also can’t live on a low-dimensional manifold like 20D. Still, I’m not so sure about that: how many ways are there to give an accurate response? Hard to argue there are many more than one; hard to argue even one of them is completely correct. _There must be some sort of “clustering”_ (see the sketch below).
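
Just to poke at the clustering intuition: here’s a minimal sketch (Python; the `embed` function and its data are made up, standing in for a real sentence encoder) of how one could test whether paraphrases of the “same” accurate response form a tight cluster, by comparing their pairwise cosine similarity against random directions in the same space.

```python
# Sketch: do paraphrases of the "same" answer cluster tightly in embedding space?
# Assumption: `embed` is a placeholder for any sentence encoder; here it's faked
# as a shared "answer" direction plus small noise, just to show the measurement.
import numpy as np

rng = np.random.default_rng(0)
dim = 768

def embed(texts, center):
    # Hypothetical encoder: shared center + small per-paraphrase noise.
    return center + 0.1 * rng.normal(size=(len(texts), dim))

answers = [f"paraphrase {i} of the accurate response" for i in range(50)]
E = embed(answers, rng.normal(size=dim))
E /= np.linalg.norm(E, axis=1, keepdims=True)

# Mean pairwise cosine similarity inside the would-be cluster...
intra = (E @ E.T)[np.triu_indices(len(E), k=1)].mean()

# ...versus random unit vectors in the same ambient space (near 0 in high dim).
R = rng.normal(size=(50, dim))
R /= np.linalg.norm(R, axis=1, keepdims=True)
inter = (R @ R.T)[np.triu_indices(len(R), k=1)].mean()

print(f"intra-cluster cosine: {intra:.3f}  vs  random baseline: {inter:.3f}")
```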

- On domain-specific manifolds and semantic transitions: 100% agree with your idea that different domains induce distinct geometric structures in embedding space, and even that the notion of a “simple manifold” seems too optimistic. But what about “regions” with common (geometric/topological) properties? E.g., physics should (?) form a dense, structured region, and I’d guess there are common patterns between the implicit structure of its subspace and that of math, for example. The semantic trajectories inside each domain will follow domain-specific rules, but patterns must exist, and there should also be transitional zones or “bridges” between them. I relate the emergent abilities of LLMs to this (what are LLMs but transformers of vectorial representations, taken as “views / parts / projections”, e.g., multi-attention heads?).

What if we hypothesize a chart atlas: multiple local coordinate systems with smooth transition maps? Maybe a patchwork of overlapping manifolds, each shaped by domain-specific usage, linked by pathways of analogy or shared vocabulary. Even if that’s the case (we’re only guessing), the problem is that neither the computational costs nor the interpretation are trivial (rough sketch below).
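
To make the atlas idea a bit less hand-wavy, here’s a minimal sketch of the structure, assuming embeddings are just a NumPy array (synthetic data here; scikit-learn for clustering and local PCA): one local linear chart per domain cluster, with a chart-to-chart map through the shared ambient space standing in for the transition maps. Real transition maps would need overlap and smoothness checks; this only shows the bookkeeping.

```python
# Sketch of a "chart atlas": one local PCA chart per domain cluster,
# with transition maps defined by passing through the shared ambient space.
# Synthetic data; real input would be sentence/token embeddings.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc=c, scale=1.0, size=(200, 50))
               for c in (0.0, 5.0, 10.0)])  # three fake "domains" in 50D

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
charts = {}
for k in range(3):
    pts = X[kmeans.labels_ == k]
    charts[k] = PCA(n_components=5).fit(pts)  # local coordinate system

def transition(x_local, src, dst):
    # Map local coords in chart `src` back to ambient space, then into `dst`.
    ambient = charts[src].inverse_transform(x_local.reshape(1, -1))
    return charts[dst].transform(ambient)[0]

local = charts[0].transform(X[:1])[0]  # a point in chart-0 coordinates
print("chart 0 -> chart 1:", transition(local, 0, 1))
```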

- On GloVe and the projection fallacy: I take your point; you can always “cherry-pick” the best-looking examples to tell your story, haha.
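
A quantitative version of the projection fallacy (a sketch; isotropic random vectors stand in for real GloVe embeddings, which would load from a `glove.*.txt` file): check how little of the cloud a 2D picture actually preserves.

```python
# Sketch: how much of a 300D embedding cloud does a 2D plot actually show?
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
X = rng.normal(size=(10_000, 300))  # stand-in for GloVe-300d vectors
pca = PCA(n_components=2).fit(X)
print(f"variance visible in 2D: {pca.explained_variance_ratio_.sum():.2%}")
# ~0.7% for isotropic data; whatever story the 2D plot tells is a choice.
```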

- On symplectic spaces: I don’t know enough about symplectic geometry :( One thought, though: you got me thinking about hyperbolic spaces, where volume grows exponentially with radius; counter-intuitive from a Euclidean point of view.
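
That intuition can be checked numerically: the volume of a geodesic ball of radius r in n-dimensional hyperbolic space grows like the integral of sinh(t)^(n-1), i.e., roughly e^((n-1)r), versus r^n in Euclidean space. A quick sketch (plain NumPy, constants dropped):

```python
# Volume growth of a radius-r ball: hyperbolic H^n vs Euclidean R^n.
# V_hyp(r) ~ integral_0^r sinh(t)^(n-1) dt,  V_euc(r) ~ r^n  (constants dropped).
import numpy as np

n = 3
for r in (1.0, 2.0, 4.0, 8.0):
    dt = r / 100_000
    t = np.arange(dt, r, dt)
    v_hyp = np.sum(np.sinh(t) ** (n - 1)) * dt  # simple Riemann sum
    print(f"r={r:4}: hyperbolic/Euclidean volume ratio = {v_hyp / r**n:10.2f}")
```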

- “maybe the flat embedding space doesn’t devote volume to gibberish because it was never trained to model gibberish.”

I initially thought of this as a kind of “contraction”, but that term might be misleading. Thinking about it, I prefer the idea of density redistribution: like a fluid adapting to an invisible container. Maybe it’s manifold emergence through optimization pressure, indirectly sculpted by the model’s training dynamics.
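
One cheap way to poke at this (a sketch; it assumes the Hugging Face `transformers` package and uses GPT-2 as an arbitrary small LM, not anything Paul specified): compare per-token negative log-likelihood of ordinary text against random gibberish. The probability mass the model withholds from gibberish is the “density redistribution” made measurable.

```python
# Sketch: does a trained LM put probability "volume" on gibberish?
# Requires: pip install torch transformers (GPT-2 as an arbitrary small LM).
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def nll_per_token(text):
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        return model(ids, labels=ids).loss.item()  # mean NLL per token

print("valid    :", nll_per_token("The cat sat on the mat."))
print("gibberish:", nll_per_token("xq zvk jjw qpl rrt bnm"))
```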

- Wheeler Superspace: Again, I can’t follow you :( I guess you’re suggesting that semantic relationships could be formulated as discrete. BUT, as a non-physicist, I honestly can’t tell the (any?) difference between being *modeled* as discrete and *being* discrete. (xD)

Thanks for the deep response, Paul! It’s a pleasure having this conversation with you.