Comment by jjmarr
1 day ago
> Just don't allow casting to u24, as it makes no sense unless you define u24 to be u32 sized as I think c standard does.
The reason u32->u24 casting must be well defined is because some hardware (e.g. many GPUs, microcontrollers) only have floating point multipliers. A 24 bit unsigned integer (stored in a 32 bit register) can be losslessly converted to a 32 bit float by the hardware, multiplied, then converted back.
This is much faster than doing 32 bit multiplication in software. However, you still need to tell the compiler about this constraint.
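A minimal sketch of why 24 bits is the magic width, assuming IEEE 754 single precision (24-bit significand: 23 stored bits plus one implicit). The helper name through_float32 is made up for illustration; it only shows that every u24 value survives a round trip through a 32 bit float, while a 25-bit value may not. It does not model how any particular GPU implements the multiply.

```python
import struct

def through_float32(n: int) -> int:
    """Round-trip an integer through an IEEE 754 single-precision float."""
    packed = struct.pack("<f", float(n))  # rounds to the nearest float32
    return int(struct.unpack("<f", packed)[0])

U24_MAX = (1 << 24) - 1  # 16_777_215, the largest 24-bit unsigned value

# Every 24-bit value fits exactly in float32's 24-bit significand.
assert through_float32(U24_MAX) == U24_MAX

# One more bit and exactness is lost: 2**24 + 1 has no float32 representation.
assert through_float32((1 << 24) + 1) != (1 << 24) + 1
```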
I am criticizing the part where they allowed [3]u8 to u24 bitCast in the first place. It doesn't make sense logically, as u24 is likely not 24 bits on any target, let alone portably on every target.
Interpreting u24 like it is actually 24 bits sounds like programming in crazy land since it is not 24 bits in any relevant architecture afaik.
They didn't allow []u24 with a similar rationale, as far as I can remember. I agree with this, as someone programming at this level should be able to understand there is no real u24 layout and they should use []u32. Going with the same magical rationale they went with here, the compiler should generate unaligned u24 loading code when you use []u24 since it is "logically 24 bits".
The ease of dealing with arbitrary bit-width integers and packed structs is actually one of the 'killer features' for me in zig.
Zig natively supports arbitrary bit-width integers, the ABI is defined, and you can simply think of it as a slice of the next larger backing integer.
The [3]u8 to u24 bitCast will simply be backed by a 32-bit int, using the same ABI. As you have u1 - u65535, sometimes it can be multiple words.
The 24 Bits (3 Bytes) [3]u8 to u24 example is exactly related to utf-8 that covers all the languages but excludes the emojis.
There are very valid use cases when you want to limit utf-8 to U+0000-U+FFFF, and it is valuable if your language allows you to make those decisions.
Remember, in zig packed structs are just integers and integers are just a group of logically consecutive bits.
Arrays like []u24 do not have the same ABI: arrays are not bit/byte packed, are not universally LSB across archs, etc.
The compiler isn't producing unaligned code; don't confuse the abstraction with the concrete implementation. And yes, [8]u1 and [8]u8 are exactly the same size and shape, even though they are arrays.
My current project is parsing ELF/Mach-O files. I can easily have zero allocations in my hot path with zig; the same is far more challenging in C, so I am biased, especially with zig allowing methods on structs.
And yes, I do use that crazy casting to 0xdeadbeef and other ascii metadata that is in those files.
To be clear here, I am not trying to prove you wrong, this is one of the places zig is very different and (IMHO) useful. Especially with streaming data or where you have network ordering etc... It is so nice to only cast what you need to but it does take a little while to wrap your head around how this interacts with buffers which are not your native endianness. At least for me, once I figured out to separate the shape of those data streams from their values it was super useful.
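The point about separating the shape of a byte stream from its value can be sketched at the value level. This is not Zig and says nothing about Zig's actual ABI or codegen; the helper u24_from_bytes is hypothetical and just shows that the same three bytes give different 24-bit values depending on the byte order you assume.

```python
U24_MASK = (1 << 24) - 1

def u24_from_bytes(b: bytes, byteorder: str = "little") -> int:
    """Interpret exactly three bytes as one 24-bit unsigned value."""
    if len(b) != 3:
        raise ValueError("expected exactly 3 bytes")
    # The mask is redundant for 3 bytes; it documents the 24-bit value range.
    return int.from_bytes(b, byteorder) & U24_MASK

raw = bytes([0xEF, 0xBE, 0xAD])

# Same shape (3 bytes), different value depending on the assumed byte order.
assert u24_from_bytes(raw, "little") == 0xADBEEF
assert u24_from_bytes(raw, "big") == 0xEFBEAD
```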
> The 24 Bits (3 Bytes) [3]u8 to u24 example is exactly related to utf-8 that covers all the languages but excludes the emojis.
I'm not familiar with Zig, so maybe it's doing something weird here, but that doesn't really make sense with Unicode in general.
First, the largest Unicode codepoint that will ever be allocated is U+10FFFF [0], which is less than 2^21, so all Unicode characters will fit in a 24-bit integer. Perhaps you're thinking of UCS-2 or UTF-16 without surrogates, which are both 16 bits wide and are limited to the BMP [1] [2] (and therefore don't include most emojis).
Second, while the characters needed for most languages lie within the BMP, not all of them do [3], so it isn't really possible to support all languages while excluding emoji, aside from using the Unicode character database to exclude certain categories [4] [5].
[0]: https://www.unicode.org/faq/utf_bom.html#gen0
[1]: https://www.unicode.org/faq/utf_bom.html#utf16-11
[2]: https://en.wikipedia.org/wiki/Universal_Coded_Character_Set
[3]: https://en.wikipedia.org/wiki/Plane_(Unicode)#Supplementary_...
[4]: https://www.unicode.org/reports/tr44/tr44-34.html#General_Ca...
[5]: https://en.wikipedia.org/wiki/Unicode_character_property#Gen...
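The size argument above can be checked directly. A small sketch, with in_bmp as an illustrative helper name: the largest code point needs 21 bits, so it fits a 24-bit integer with room to spare, while a 16-bit limit cuts off everything outside the BMP.

```python
import sys

MAX_CODE_POINT = 0x10FFFF  # largest code point Unicode will ever assign
BMP_LAST = 0xFFFF          # last code point of the Basic Multilingual Plane

def in_bmp(ch: str) -> bool:
    """True if the single character ch lies in U+0000..U+FFFF."""
    return ord(ch) <= BMP_LAST

# Python's own limit matches the Unicode limit.
assert sys.maxunicode == MAX_CODE_POINT

# 21 bits are enough for any code point, so 24 bits certainly are.
assert MAX_CODE_POINT.bit_length() == 21

assert in_bmp("é")
assert in_bmp("中")
assert not in_bmp("\U0001F600")  # an emoji, outside the BMP
```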
> ... utf-8 that covers all the languages but excludes the emojis ...
Ah, but the U+0000 to U+FFFF plane does not cover all the languages. You might think that only historical and archaic languages are found in Unicode's astral planes (e.g., U+20000 to U+2A6DF is used for historical Chinese characters no longer used today), but in fact there are modern languages found in the U+10000 plane.
You might not care about Osage (the language of the Osage Nation of northern Oklahoma) since its last native speaker passed away in 2005, but there is a revival program trying to teach Osage to people. Osage's script was developed quite recently as part of the revival program, so it couldn't fit into the U+0000 to U+FFFF block and it was assigned U+104B0 to U+104FF.
The Toto language of Bengal, on the other hand, is still active: over 1000 speakers, all living in the village of Totopara. It also never had an alphabet until recently, so its Unicode block is U+1E290 to U+1E2BF.
Then there's Wancho, spoken by about 60,000 people in India. Its alphabet was created between 2001 and 2012, and added to Unicode in 2019. It was assigned the U+1E2C0 to U+1E2FF block (immediately after the Toto language, you might notice).
Then there's the Ho language, spoken by over a million people in India. Wikipedia cites a 2001 census as having 2.2 million speakers, and a 2011 census as having 1.4 million speakers. I very much doubt that both of those are accurate (you don't lose more than half a million people from an ethnic group in just ten years without some kind of war or genocide, and the Wikipedia article would have at least mentioned that if such a thing had happened), but to be safe, let's go with the lower estimate and say that about 1.4 million people speak Ho. It can be written with the Latin alphabet, but its own alphabet is Warang Chiti (sometimes spelled Warang Citi), which was added to Unicode in 2014 and assigned the U+118A0 to U+118FF block.
And then there's the Adlam script for writing Fulani, the language of the Fulfulde people of western Africa. Fulani is spoken natively by 37 million people, and as a second language by another 2.7 million. Adlam's Unicode block is U+1E900 to U+1E95F.
So if you restrict your program to only working with the basic multilingual plane, it's not just emoji you'll be leaving out. It's also modern languages, spoken by anywhere from 1000 people to 37 million. How many speakers of a language are enough to draw the line and say "No, I won't ever translate my software into your language"?
Now, if your software is only targeting one language and you never intend to translate it, then yes, you'll only lose out on emoji if you stick to the U+0000 to U+FFFF range of the basic multilingual plane.
But realize that the higher planes are not just for dead languages. Living languages have ended up there too, and there are likely to be more in the future. It's quite possible that right now, someone somewhere is saying "Hey, why doesn't my language have its own alphabet instead of using Latin characters to write it? The Latin characters don't express the sounds of my language very well." And when they do get that alphabet worked out and manage to get it accepted into Unicode, it'll certainly land in one of the higher planes. Most likely the U+10000 to U+1FFFF plane which isn't at all full yet, but who knows. If you want to be able to handle every language spoken (and written) in the world today, you must be able to accept the full range of Unicode, not just the 16-bit range.
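To make that concrete, here is a rough check using only the block ranges quoted in this comment (the dictionary and helper names are just for illustration). Every one of these scripts starts above U+FFFF, so a 16-bit code unit cannot hold them, and each character takes four bytes in UTF-8.

```python
# Block ranges as quoted above: (first, last) code point, inclusive.
BLOCKS = {
    "Osage": (0x104B0, 0x104FF),
    "Warang Citi": (0x118A0, 0x118FF),
    "Toto": (0x1E290, 0x1E2BF),
    "Wancho": (0x1E2C0, 0x1E2FF),
    "Adlam": (0x1E900, 0x1E95F),
}

def fits_in_16_bits(code_point: int) -> bool:
    """True if the code point is inside the Basic Multilingual Plane."""
    return code_point <= 0xFFFF

for name, (first, last) in BLOCKS.items():
    # None of these living scripts fit in the 16-bit range.
    assert not fits_in_16_bits(first), name
    # Supplementary-plane characters need 4 bytes in UTF-8 ...
    assert len(chr(first).encode("utf-8")) == 4, name
    # ... and a surrogate pair (2 x 16 bits) in UTF-16.
    assert len(chr(last).encode("utf-16-le")) == 4, name
```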
> many GPUs
Citation please - every single GPU in the literal world supports integer arithmetic for operating on tid, gid, etc.
From page 175 of the AMD CDNA4 ISA:
https://www.amd.com/content/dam/amd/en/documents/instinct-te...
> V_MUL_U32_U24
> Multiply two unsigned 24-bit integer inputs and store the result as an unsigned 32-bit integer into a vector register. D0.u32 = 32'U(S0.u24) * 32'U(S1.u24)
> Notes
> This opcode is expected to be as efficient as basic single-precision opcodes since it utilizes the single-precision floating point multiplier. See also V_MUL_HI_U32_U24.
Nvidia GPUs used to do the same thing and there's a __umul24 intrinsic if you care to use it.
https://stackoverflow.com/questions/5544355/cuda-umul24-func...
This is super-super-niche since it basically only applies to 32-bit integer multiplication.
You likely won't run into it unless you're doing high performance embedded systems or GPU programming on non-NVIDIA cards, and for some unknowable reason, your workload does a 32-bit integer multiplication in the hot path.
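For reference, my reading of the quoted pseudocode D0.u32 = 32'U(S0.u24) * 32'U(S1.u24) is: take the low 24 bits of each input, multiply, and keep the low 32 bits of the product. A software model of that reading, with the function name mul_u32_u24 made up here (it is not a vendor API, and the masking of the upper input bits is an assumption):

```python
U24_MASK = (1 << 24) - 1
U32_MASK = (1 << 32) - 1

def mul_u32_u24(s0: int, s1: int) -> int:
    """Model of a 24x24 -> low-32-bit unsigned multiply (assumed semantics)."""
    a = s0 & U24_MASK  # only the low 24 bits of each operand take part
    b = s1 & U24_MASK
    return (a * b) & U32_MASK  # the full product is up to 48 bits; keep 32

assert mul_u32_u24(3, 5) == 15
# Bits above bit 23 of an operand are ignored under this reading.
assert mul_u32_u24(0x01000003, 5) == 15
# The largest inputs overflow 32 bits, so only the low word survives.
assert mul_u32_u24(0xFFFFFF, 0xFFFFFF) == 0xFE000001
```

The high bits of the 48-bit product are what the V_MUL_HI_U32_U24 opcode mentioned in the notes is for; it is not modelled here.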
That's literally only for 32bx24b (I don't remember why we did that specifically for CDNA - I'll ask someone) but as you see from V_MUL_HI_I32 and V_MUL_LO_U32, there is very much vector arithmetic hardware (never mind that we're not talking about VALU but conventional scalar ALU).
While the GP might be technically wrong in a narrow sense, GPUs are built for FP, and that's what you want to be doing if you're using them as accelerators.
You don't know what you're talking about: an enormous amount of TOPS now runs through quantized (read: integer) kernels. Many GPUs don't even have FP64 support, or even FP32 support.