Comment by imtringued

10 hours ago

The only way the author could have arrived at that rationale is by not understanding what a token is, what attention is, and how coding agents work.

Tokens combine multiple characters into a single vector, and attention computes similarity scores between those vectors. You therefore want each variable name to be a single distinctive token, so the LLM can instantly tell that two occurrences refer to the same variable. If every variable is just a number, attention will relate every first parameter to every first parameter in every function, regardless of whether they have anything to do with each other. To avoid that, the numbering scheme would have to be randomized rather than starting at zero.
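A toy sketch of the point (hypothetical hash-based embeddings, not a real tokenizer or model): if the same token always maps to the same vector, the cosine similarity at the heart of an attention score is maximal between two occurrences of a numbered name like `param0`, even when they sit in unrelated functions.

```python
import hashlib
import math

def embed(token: str, dim: int = 8) -> list[float]:
    # Deterministic pseudo-embedding: same token -> same unit vector.
    h = hashlib.sha256(token.encode()).digest()
    v = [b / 255.0 - 0.5 for b in h[:dim]]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def similarity(a: str, b: str) -> float:
    # Cosine similarity between embeddings, the core of an attention score.
    return sum(x * y for x, y in zip(embed(a), embed(b)))

# "param0" in one function vs "param0" in an unrelated function:
# identical tokens, so the score is 1.0 -> spurious attention.
print(similarity("param0", "param0"))

# Distinct descriptive names get distinct vectors and a lower score.
print(similarity("user_count", "retry_limit"))
```

A real transformer adds positional and contextual information on top of this, but the raw token identity is still the signal that lets the model cheaply link two mentions of the same name.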

Coding agents can now use tools, including text search, so the ability to look up a specific variable name is extremely helpful. By using numbering, the language's author has saddled himself with relying entirely on LSPs rather than on innate model capabilities that operate at the text level.
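The text-search problem can be shown concretely (hypothetical snippets, not the language in question): a search for a descriptive name lands only on the relevant definition, while a search for a numbered name matches every function in the file.

```python
import re

descriptive = """
def charge(invoice_total, tax_rate):
    return invoice_total * (1 + tax_rate)

def retry(max_attempts, backoff_seconds):
    return max_attempts * backoff_seconds
"""

numbered = """
def charge(v0, v1):
    return v0 * (1 + v1)

def retry(v0, v1):
    return v0 * v1
"""

# A descriptive name pins down one definition and its uses:
print(len(re.findall(r"\binvoice_total\b", descriptive)))  # 2 hits

# "v0" hits every function, so the search result is useless on its own:
print(len(re.findall(r"\bv0\b", numbered)))  # 4 hits
```

This is exactly the situation where an agent would be forced to fall back on an LSP's go-to-definition instead of a cheap grep.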

So yeah, on a textual level, the language is designed for an era of LLMs that has been obsolete for a long time.