Comment by psionides
3 days ago
Sorry, I keep mixing these up: I meant bytes, not scalars. I think scalars would be more natural to iterate over in most languages (at least the ones I use).
OK, checked and Ruby does seem to use scalars. Well, unless you mess with encodings. Then it’s messy. So it’s probably better and worse than Python 3.
You may not have seen this interesting article before: https://hsivonen.fi/string-length/. I agree with its assessment that scalars are really pretty useless as a measure, and Python and Ruby are foolish to have chased it at such expense.
But seriously, I can’t think of any other popular languages that count by scalars or code points—it’s definitely not most languages, it’s a minority, all a very specific sort of language. “Most” encompasses well-formed UTF-8 (e.g. Rust), recommended UTF-8 but it doesn’t actually care (e.g. Go), potentially ill-formed UTF-16 (e.g. JavaScript, Java, .NET), and total-mess (e.g. C, C++).
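To make the difference between those units concrete, here is a rough Python sketch that measures one string three ways. The helper name `lengths` and the dictionary keys are made up for illustration; the sample string is the facepalm emoji sequence that the linked article uses. Note that Python's `len` counts code points, which only equals the scalar count when the string contains no lone surrogates, as is the case here.

```python
# Facepalm emoji sequence, written with escapes so the code points are explicit:
# U+1F926, U+1F3FC, U+200D, U+2642, U+FE0F
FACEPALM = "\U0001F926\U0001F3FC\u200D\u2642\uFE0F"


def lengths(s: str) -> dict:
    """Return the length of s in three different units (hypothetical helper)."""
    return {
        # What Rust's str::len or Go's len(s) would report.
        "utf8_bytes": len(s.encode("utf-8")),
        # What JavaScript, Java, and .NET report; each code unit is 2 bytes.
        "utf16_code_units": len(s.encode("utf-16-le")) // 2,
        # What Python 3's len reports (code points; scalars for this string).
        "scalars": len(s),
    }


print(lengths(FACEPALM))
# {'utf8_bytes': 17, 'utf16_code_units': 7, 'scalars': 5}
```

None of the three matches what a reader sees on screen, which is a single emoji.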
Thanks, will have a read :)