Comment by tekne
3 months ago
Weird anecdote, but one of the reasons I have always struggled with writing is precisely that my process seems highly nonlinear. I start with a disjoint mind map of ideas I want to get out, often just single words, and need to somehow turn that into coherent text, which often happens out of order. The original notes are often completely unordered, diffusion-like scrawl, the difference being that I had less idea what the final positions of the words were going to be when I wrote them.
I can believe that your abstract thoughts in latent space are diffusing/forming progressively when you are thinking.
But I can't believe the actual literal words are diffusing when you're thinking.
When asked "How are you today?", there is no way that your thoughts are literally "Alpha zulu banana" => "I banana coco" => "I banana good" => "I am good". The diffusion does not happen at the output token layer; it happens much earlier, at a higher level of abstraction.
Or like this:
"I ____ ______ ______ ______ and _____ _____ ______ ____ the ____ _____ _____ _____."
If the images in the article are to be considered an accurate representation, the model is putting down meaningless bits of connective tissue way before the actual ideas. Maybe it's not working like that. But the "token-at-a-time" model is obviously not literally looking at only one word at a time either.
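To make the two pictures in this thread concrete, here is a toy sketch. It involves no model at all: it only replays a fixed sentence in two reveal orders, one left to right and one by filling in blanks. The function names and the reveal order `[0, 2, 1]` are made up for illustration and are not taken from the article or from any real diffusion model.

```python
# Toy illustration only: no model, just two reveal orders over a fixed sentence.
TARGET = "I am good".split()
MASK = "____"


def left_to_right(tokens):
    """Yield the partial text a strictly left-to-right decoder would show."""
    for i in range(1, len(tokens) + 1):
        yield " ".join(tokens[:i])


def unmask_in_order(tokens, order):
    """Yield the partial text when masked positions are revealed in `order`."""
    state = [MASK] * len(tokens)
    yield " ".join(state)
    for pos in order:
        state[pos] = tokens[pos]
        yield " ".join(state)


ar_steps = list(left_to_right(TARGET))
# Hypothetical order: reveal "I", then "good", then the connective "am".
md_steps = list(unmask_in_order(TARGET, order=[0, 2, 1]))

for step in ar_steps:
    print("left-to-right:", step)
for step in md_steps:
    print("unmasking:    ", step)
```

Changing `order` is the whole argument in miniature: the sentence is the same either way, and the only question is which positions get committed first.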