← Back to context

Comment by DarkWiiPlayer

4 years ago

A friend had the username "Rubén" and jfc it broke everything other than windows itself xD

The problem isn't the Cyrillic or the é but the fact that Windows lets you put those characters in file names in non-Unicode encodings which will create sequences of bytes which are invalid UTF-8. It's 2021, FFS, stop using legacy encodings.

  • All win32 functions that accept or return strings come in two varieties, with A and W suffixes, MessageBoxA/MessageBoxW. The A works with the system default 8-bit encoding (cp1251 in case of Cyrillic), the W works with unicode in wide chars. There shouldn't be much of a problem with string handling if you stick exclusively with W functions.

    • Using the W functions has been the advice from Microsoft's documentation for ages. But people still use the A functions because they're easier, especially when writing cross-platform software since Windows is the only major OS that made the unfortunate choice of having the base character type 16 bits wide.

      Fortunately the future of the Windows API does look better since Microsoft has now added proper UTF-8 support since Win 10 1904. All you have to do is request it in the application manifest and the A functions will accept and return UTF-8.

      5 replies →