Comment by rmunn
8 hours ago
I was just mentioning the Japanese word mojibake on the plain-text thread (https://news.ycombinator.com/item?id=47897681), and here you give an example. In fact, UTF-8 misinterpreted as Windows-1252 is the mojibake I personally encounter most often. Curly quotes (usually a right single quote used as an apostrophe inside a word like can't or it's or didn't) are the most common victims, with em dashes only slightly less common. The other direction (Windows-1252 text being read as UTF-8) produces � (U+FFFD) everywhere instead. Either way, I still see both from time to time today, but far, FAR less frequently than I used to back in the late 2000s or early 2010s. I used to see â€” and similar sequences all the time 15-20 years ago, and now it's rare enough that I actually notice when it happens.
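For anyone who wants to reproduce both directions, here is a minimal Python sketch. The function names are my own, chosen for illustration, and it assumes Python's built-in `cp1252` codec is a fair stand-in for what the misbehaving software does. Real software may handle the five bytes Windows-1252 leaves undefined differently.

```python
def utf8_read_as_cp1252(text: str) -> str:
    """Simulate UTF-8 bytes being decoded as Windows-1252."""
    # Each multi-byte UTF-8 character turns into two or three cp1252 characters.
    return text.encode("utf-8").decode("cp1252")


def cp1252_read_as_utf8(text: str) -> str:
    """Simulate Windows-1252 bytes being decoded as UTF-8."""
    # Bytes 0x80-0xFF on their own are not valid UTF-8, so a lenient
    # decoder substitutes U+FFFD for each one.
    return text.encode("cp1252").decode("utf-8", errors="replace")


def repair_utf8_read_as_cp1252(garbled: str) -> str:
    """Undo the first kind of mojibake, assuming nothing was lost on the way."""
    return garbled.encode("cp1252").decode("utf-8")


if __name__ == "__main__":
    word = "can\u2019t"          # "can't" with a right single quotation mark
    garbled = utf8_read_as_cp1252(word)
    print(garbled)                               # curly quote becomes 3 characters
    print(utf8_read_as_cp1252("\u2014"))         # same for an em dash
    print(cp1252_read_as_utf8(word))             # curly quote becomes U+FFFD
    print(repair_utf8_read_as_cp1252(garbled) == word)
```

The first direction is usually reversible because no bytes are thrown away. The second is not, since the original byte is gone once it has been replaced by U+FFFD.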