Comment by andrey-p
11 years ago
> I don't think my brain interprets letters individually, but my eyes mostly catch word by word, hence a picture.
That's a normal thing. You read words based on their outline rather than individual letters, what with the bits that stick upwards (ascenders) and downwards (descenders). Which is why ALL CAPS TEXT IS A LOT HARDER TO READ QUICKLY, and why if you jmuble some lettres in a few wrods you can still read them with relative ease.
So the way humans read text is still kind of image based - you match a word against a preexisting idea of what it should look like. It's just that words also happen, coincidentally, to be easy for computers to read.
To expand on that perspective, words signify small, almost atomic concepts which are relatively easy to learn - optimal elements with which to express less universal, more complex concepts. Each word can be viewed as a shorthand for a previously defined expression.
In that context, letters would be to words what pixels are to bitmaps.
Make words more complex (and also longer) and the set of those elements will be able to encompass a wider, more versatile set of "elementary" concepts (akin to a wide tree structure), but it will also be harder to learn within a reasonable amount of time. In the case of pixels this is comparable to higher bit-depth.
Yes, nobody knows the full vocabulary of any natural language, but we still understand each other thanks to knowing a common subset of words, to redundancy within and between sentences and passages, etc.
Make words less complex and they will encompass a narrower set of concepts, but the whole set of them is easier to learn. You will usually have to use more of them to express a given concept (a deeper tree structure), though. Similarly, one would need a larger number of low bit-depth pixels to express intermediate colors.
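To make that pixel analogy concrete, here is a minimal sketch (in Python, with the names and values invented purely for illustration): a mid-grey that a single 8-bit pixel encodes directly has to be approximated by several 1-bit pixels via dithering.

```python
def mean(values):
    return sum(values) / len(values)

# One 8-bit pixel: the intermediate color fits in a single, "complex" element.
eight_bit_pixel = 128          # 50% grey on a 0..255 scale

# 1-bit pixels: each element is only 0 or 1, so several of them
# (a checkerboard dither) are needed to approximate the same grey on average.
one_bit_block = [0, 1,
                 1, 0]         # 2x2 pattern, flattened

print(eight_bit_pixel / 255)   # ~0.502 -> one element with high bit-depth
print(mean(one_bit_block))     # 0.5    -> four elements with low bit-depth
```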
This trade-off is comparable to having well-named functions in program code - partition the program into functions well enough and you will have created a set of relatively universal concepts, which another person can understand without delving into the body of the implementing function each time.
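As a rough sketch of what that looks like in practice (the domain, file format and function names here are made up for the example), the top-level flow can be read like a sentence without opening any of the helper bodies, because each name is a shorthand for its implementation:

```python
def load_orders(path):
    """Read one order per line in the (hypothetical) 'customer,amount' format."""
    with open(path) as f:
        return [line.strip().split(",") for line in f if line.strip()]

def total_per_customer(orders):
    """Sum order amounts per customer."""
    totals = {}
    for customer, amount in orders:
        totals[customer] = totals.get(customer, 0.0) + float(amount)
    return totals

def report(totals):
    """Print one 'customer: total' line per customer."""
    for customer, total in sorted(totals.items()):
        print(f"{customer}: {total:.2f}")

def main(path):
    # Reads almost like prose; each "word" (function) stands for a
    # previously defined concept, just as described above.
    report(total_per_customer(load_orders(path)))
```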
Similar relationships can of course be perceived at the word-sentence and sentence-paragraph levels. The difference between pictures and text therefore has more to do with how information is partitioned across abstraction levels than with anything fundamental.
The optimal way of partitioning data varies depending on content.
For example, the concepts of left and right are inherently connected with our visual and spatial perception of the world and are therefore better expressed by invoking our spatial recognition (a two-dimensional image). That is because the definitions of these atomic concepts are practically hardwired and require no learning.
Therefore, finding the best way to express information for humans and computers alike is akin to finding the optimal point between the two extremes of a reduced set of simple concepts and a larger set of complex ones, such that both can parse the result easily enough.
In the case of humans, the physical medium will most likely always be the eyes for read mode, due to their built-in parallel processing and high bandwidth, and a subset of our muscles (currently the fingers) for write mode. I'm not sure how fast we can successfully parse audio signals, and brain-to-computer interfaces are still too slow.
As for the partitioning of information - who knows? Physically we have colors, brightness, shapes, sounds, temperature, touch and more at our disposal. But the best way depends on our brain, and on what size of information unit it is best equipped to process.
And that is definitely not 8 bytes.