Wikipedia tells me that Windows NT was started in 1988 and shipped in 1993, while Linux's first release was in 1991, giving them approximately the same age. Furthermore, NT shipped with UCS-2 support, not UTF-16, as UTF-16 did not exist until 1996. UTF-16 was a migration that they completed in Windows 2000. UTF-8 was first presented in 1993 and is, therefore, older than UTF-16.
Assuming all of these facts are true (I hope), the situation today is that the Windows world mostly implements UTF-16 correctly (but not completely -- code sometimes assumes two-byte characters), while the Linux world correctly implements UTF-8.
Before 1996 (when Unicode 2.0 was released), Unicode was a 16-bit fixed-width encoding. And the first astral character allocations didn't happen until 3.1 was released in 2001.
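A quick way to see why the two-byte assumption breaks once astral characters exist: anything outside the Basic Multilingual Plane takes a surrogate pair in UTF-16. A minimal TypeScript sketch (the specific characters are just illustrative):

```typescript
// "A" sits in the BMP: one 16-bit code unit.
// U+1F600 is an astral character: two 16-bit code units (a surrogate pair).
const bmp = "A";
const astral = "\u{1F600}";

console.log(bmp.length);    // 1 UTF-16 code unit
console.log(astral.length); // 2 UTF-16 code units, but only one code point

// The surrogate pair that encodes U+1F600 in UTF-16:
console.log(astral.charCodeAt(0).toString(16)); // "d83d" (high surrogate)
console.log(astral.charCodeAt(1).toString(16)); // "de00" (low surrogate)
console.log(astral.codePointAt(0)?.toString(16)); // "1f600"
```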
Yeah, which makes it even more irritating... MS is still off in the double-wide sticks 11 years later and still promoting UTF-16LE in userland. From http://msdn.microsoft.com/en-us/library/dd374081%28VS.85%29....
"Unicode-enabled functions are described in Conventions for Function Prototypes. These functions use UTF-16 (wide character) encoding, which is the most common encoding of Unicode and the one used for native Unicode encoding on Windows operating systems. Each code value is 16 bits wide .. New Windows applications should use UTF-16 as their internal data representation."
"Most common" encoding is best encoding!
ugh
I seriously doubt that UTF-16 still is the most common encoding. The web is mostly UTF-8 and so are most smartphones.
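For what it's worth, size is part of why the web leans UTF-8: ASCII-heavy text is one byte per character in UTF-8 but two in UTF-16. A rough TypeScript sketch (assumes a runtime with TextEncoder, such as a browser or Node):

```typescript
// UTF-8 uses 1 byte per ASCII character; UTF-16 always uses at least 2.
const ascii = "Hello, world!";
const utf8Bytes = new TextEncoder().encode(ascii).length; // 13
const utf16Bytes = ascii.length * 2;                      // 26 (one 16-bit unit per BMP char)
console.log(utf8Bytes, utf16Bytes);

// For non-Latin text the balance shifts: most CJK characters are
// 3 bytes in UTF-8 but only 2 in UTF-16.
const cjk = "日本語";
console.log(new TextEncoder().encode(cjk).length, cjk.length * 2); // 9 vs 6
```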
JavaScript strings are UCS-2 or UTF-16.
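Concretely, JavaScript's string API is defined in terms of UTF-16 code units, so length and indexing count code units rather than code points, while iteration and codePointAt are surrogate-aware. A small TypeScript sketch:

```typescript
const s = "a\u{1F600}b"; // 3 code points, 4 UTF-16 code units

console.log(s.length);                        // 4 -- counts UTF-16 code units
console.log(s[1]);                            // a lone high surrogate, not a full character
console.log(s.codePointAt(1)?.toString(16));  // "1f600" -- surrogate-aware
console.log([...s].length);                   // 3 -- string iteration walks code points
```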