Comment by dhosek

18 hours ago

The complexities of mixed LR and RL text are quite astonishing. It’s not just a matter of switching modes when the script changes: doubly nested (or deeper) texts can change the semantics of line breaks. This article provides a good overview: https://tug.org/TUGboat/tb08-1/tb17knutmix.pdf [1]

In college [2], when I wanted to quote some texts from Exodus in Hebrew in a paper that I wrote, I ended up avoiding the issue by hand-reversing the letter order and manually breaking lines. 8 bits is insufficient to cover all the possible combinations of letters and vowel markings so the font didn’t include any vowel markings and only did dageshim for בּ and פּ if I recall correctly.

⸻

1. As an aside, it would have been really nice if Unicode provided an R-L mirrored Latin alphabet to make it easier for monolingual developers to grasp the complexities surrounding mixed directional typesetting. I suppose it could still be added, although Unicode tends towards conservatism on adding characters.

2. This was 1990, well before Unicode, in the era of a hundred or so 8-bit character encodings, most of which were not implemented widely. I also had to type the text using the arbitrary ASCII-Hebrew mapping of the font I was using, which, among other things, led me to discover that letter frequency in Hebrew is much more uniform than it is in English.
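The hand-reversal workaround described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `visual_lines` is made up, the renderer is assumed to lay glyphs out strictly left to right, and the text is assumed to contain no combining marks (as with the font described, which had no vowel markings), since reversing code points would detach a mark from its base letter. The key point is that lines are broken in logical order first and each line is reversed afterwards; reversing the whole text before breaking would put the end of the passage on the first line.

```python
import textwrap


def visual_lines(logical_text: str, width: int) -> list[str]:
    """Break RL text in logical order, then reverse each line for an LR-only renderer.

    Assumes no combining marks and no embedded LR runs (numbers, Latin words).
    """
    # Step 1: break lines while the text is still in reading (logical) order.
    lines = textwrap.wrap(logical_text, width=width)
    # Step 2: reverse each line separately so it reads right to left on the page.
    return [line[::-1] for line in lines]


# Latin stand-in text, so the effect is visible without bidi rendering.
print(visual_lines("abc def ghi", 7))  # ['fed cba', 'ihg']
```

Each output line would also need to be set flush right to look correct. Embedded numbers or Latin words would have to be re-reversed by hand, which is where the nesting problem starts.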

4 comments

teddyh 8 hours ago

RFC 3986 (STD 66) recommends (in appendix C) delimiting URLs in angle brackets to avoid the problem which your link now has. I.e. if you’d written <https://tug.org/TUGboat/tb08-1/tb17knutmix.pdf>¹ there would have been no problem.
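A minimal sketch of why the delimiters help, assuming a hypothetical linkifier that treats every non-whitespace character after `https://` as part of the URL. This is not HN's actual formatter, whose rules are not shown here; the names `NAIVE`, `DELIMITED`, `naive_links`, and `delimited_links` are invented for illustration.

```python
import re

# Hypothetical naive linkifier: the URL runs until the next whitespace.
NAIVE = re.compile(r"https?://\S+")
# Angle-bracket form: the URL is exactly what sits between "<" and ">".
DELIMITED = re.compile(r"<(https?://[^<>\s]+)>")


def naive_links(text: str) -> list[str]:
    return NAIVE.findall(text)


def delimited_links(text: str) -> list[str]:
    return DELIMITED.findall(text)


bare = "overview: https://tug.org/TUGboat/tb08-1/tb17knutmix.pdf\u00b9"
wrapped = "overview: <https://tug.org/TUGboat/tb08-1/tb17knutmix.pdf>\u00b9"

print(naive_links(bare))         # URL with the superscript one glued on
print(delimited_links(wrapped))  # clean URL, footnote mark left outside
```

With the brackets in place, the footnote mark can sit directly against the closing `>` without ambiguity.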

kstrauser 14 hours ago

That link’s a 404.

gus_massa 9 hours ago

The link says
https://tug.org/TUGboat/tb08-1/tb17knutmix.pdf ¹
There is no space between pdf and ¹, so the HN server assumes incorrectly that the ¹ is part of the link.

mschuster91 9 hours ago

Strip the superscript-1 character at the end. I'm surprised HN's link formatter regex detects it as part of the link: https://tug.org/TUGboat/tb08-1/tb17knutmix.pdf
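For comparison, a sketch of a stricter matcher that accepts only the ASCII characters RFC 3986 permits in a URI (unreserved, reserved, and `%`), so a superscript one ends the match instead of joining it. This is an assumption about how a formatter could behave, not a description of HN's regex; `URI_CHARS`, `STRICT`, and `strict_links` are hypothetical names.

```python
import re

# ASCII characters RFC 3986 allows in a URI: unreserved, gen-delims, sub-delims, "%".
URI_CHARS = r"A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=%"
# Match stops at the first character outside that set, e.g. U+00B9 (superscript one).
STRICT = re.compile(rf"https?://[{URI_CHARS}]+")


def strict_links(text: str) -> list[str]:
    return STRICT.findall(text)


comment = "overview: https://tug.org/TUGboat/tb08-1/tb17knutmix.pdf\u00b9 and more"
print(strict_links(comment))
```

This still swallows trailing ASCII punctuation such as a sentence-ending period or a closing parenthesis, since those are legal URI characters, so a real formatter would need extra heuristics on top.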