Comment by pornel

16 hours ago

Unicode wanted ability to losslessly roundtrip every other encoding, in order to be easy to partially adopt in a world where other encodings were still in use. It merged a bunch of different incomplete encodings that used competing approaches. That's why there are multiple ways of encoding the same characters, and there's no overall consistency to it. It's hard to say whether that was a mistake. This level of interoperability may have been necessary for Unicode to actually win, and not be another episode of https://xkcd.com/927

Why did Unicode want codepointwise round-tripping? One codepoint in a legacy encoding becoming two in Unicode doesn't seem like it should have been a problem. In other words, why include precomposed characters in Unicode?