Comment by jerf
3 months ago
Textual language is really, really amazing if you sit down and think about what it does versus the resources it consumes to do it.
It's a common pastime for programmers to claim that our textual programming languages are just terrible and need to be replaced somehow with something visual, but I think this very often comes from a place of not understanding just how amazing textual languages are. Not that they couldn't possibly be improved on in at least some domains, and there are after all some successful niches for visual languages, but I think if you set out to wholesale replace textual languages without an understanding of and appreciation for the impressive competition they offer, you're setting yourself up to fail.
This also touches on the contrast between how human beings and LLMs trade compression for nuance. Human beings devote enormous resources to the long tail of the information distribution, for example in lexical items. Word frequencies follow Zipf's law, so in the million-word FROWN corpus, roughly half of the distinct words occur only once. When's the last time you used the word chrysanthemum, or corpulent? But did you have any difficulty recognizing them? So while human beings have limited scale compared to machines, we do have an enormous capacity for nuanced communication and conception.
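If you want to check the "half the words occur once" claim on a corpus of your own, here's a rough sketch. It counts the share of distinct words that appear exactly once (hapax legomena). The function name `hapax_fraction` and the crude regex tokenizer are my own assumptions for illustration, and the sample sentence is a toy, not the FROWN corpus.

```python
import re
from collections import Counter


def hapax_fraction(text: str) -> float:
    """Return the fraction of distinct words that occur exactly once."""
    # Crude tokenizer (an assumption): lowercase runs of letters/apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    if not counts:
        return 0.0
    hapaxes = sum(1 for c in counts.values() if c == 1)
    return hapaxes / len(counts)


# Toy input only; a real check would read a large corpus from disk.
sample = "the cat sat on the mat while the corpulent dog sniffed a chrysanthemum"
print(f"{hapax_fraction(sample):.2f}")
```

On a tiny sample nearly every word is a hapax, so the number only starts to mean something on a large corpus, where the frequent words have had room to repeat.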
LLMs, by contrast, make the opposite trade-off. There are information-theoretic limits on the amount of information LLMs can store (roughly 3.6 bits per parameter), so they aggressively compress information and trade away nuance (https://arxiv.org/abs/2505.17117).
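To get a feel for what roughly 3.6 bits per parameter means, here's some back-of-the-envelope arithmetic. It takes the 3.6 figure above at face value (an assumption, not something this code verifies), and the parameter counts are just hypothetical round numbers.

```python
# Figure quoted above; treated here as an assumption, not a measured value.
BITS_PER_PARAM = 3.6


def capacity_megabytes(n_params: int, bits_per_param: float = BITS_PER_PARAM) -> float:
    """Rough storage capacity in megabytes (10**6 bytes) for a given parameter count."""
    total_bits = n_params * bits_per_param
    return total_bits / 8 / 1e6


# Hypothetical model sizes, chosen only for illustration.
for n in (1_000_000_000, 7_000_000_000):
    print(f"{n:,} params -> ~{capacity_megabytes(n):,.0f} MB")
```

So under that assumption a billion-parameter model gets something like 450 MB of capacity, which is not much next to the size of the text it was trained on. Hence the aggressive compression.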