Comment by cryptonector
7 hours ago
Variable width encodings like UTF-8 and UTF-16 cannot be indexed in O(1), only in O(N). But this is not really a problem! Instead of indexing strings we need to slice them, and generally we read them forwards, so if slices (and slices of slices) are cheap, then you can parse textual data without a problem. Basically just keep the indices small and there's no problem.
Or just use immutsble strings and look-up-tales. Say, every 32 characters, combined with cursors. This is going to make indexing fast enough for randomly jumping into a striong and the using cursors.