Comment by fithisux
2 days ago
Null terminated strings have some merits but they should be a completely different data type like in Freebasic.
Are there other merits than availability of literals in C?
It seems like one of the worst data structures ever: the lookup complexity of a linked list and the expansion complexity of an array list, with security problems added as a bonus.
One I can think of is simplicity. No need to worry about what type the length should be (size_t?) or where it should be stored. Just pass around a pointer. Pointers fit the size of a CPU register most of the time. In my opinion the drawbacks (O(N) length computation, NUL forbidden in the data, etc.) outweigh this benefit, but we are stuck with them. Many kernel interfaces like open, getdents etc. assume NUL-terminated strings, therefore any low-level language or library has to support them.
But (i32 length, byte[] data) is as complex as (byte[] data, '\0'); it's two parts either way. Of course, the terminated form potentially allows very long strings at the cost of just a single byte spent as a terminator. Besides the rarity of such a case, the "space savings" might play a role on a PDP-11 or a Z80, but not on any modern architecture that needs structures aligned to a 32- or even 64-bit boundary. The efficiency and security costs far outweigh any savings in space or simplicity (heh) of processing.
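To make the comparison concrete, here is a minimal C sketch of the (length, data) pair. The names lstr, lstr_from and lstr_len are made up for illustration and are not from any real library; size_t is used for the length instead of i32 purely as an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical length-carrying string view: a (length, data) pair. */
typedef struct {
    size_t len;        /* number of bytes in data, no terminator needed */
    const char *data;  /* may contain embedded NUL bytes */
} lstr;

/* Build a view over an explicit byte range. */
static lstr lstr_from(const char *data, size_t len) {
    assert(data != NULL || len == 0);
    lstr s = { len, data };
    return s;
}

/* O(1): the length is stored, not searched for as strlen() does. */
static size_t lstr_len(lstr s) {
    return s.len;
}
```

With the five bytes of "ab\0cd", lstr_len reports 5 in constant time, while strlen stops at the embedded NUL and reports 2 after scanning.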
Null-terminated strings are the other billion-dollar mistake, along with the original NULL.
It's fine as a serialization/deserialization primitive for on-disk files, as long as the NUL character is invalid in the data.
String tables in most object file formats work like that: a concatenated series of ASCIIZ strings. There is one byte of overhead (the NUL) per string, only an offset into the table is needed to address a string, and strings with common suffixes can share storage. It's a very compact layout.
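A rough C sketch of that layout, with an invented two-entry table and a hypothetical accessor strtab_get (not the API of any real object-file library):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative table: "rename", NUL, "open", plus the implicit final NUL. */
static const char strtab[] = "rename\0open";

/* Resolve an offset to a C string; returns NULL if the offset is out of range.
   Assumes the table itself ends with a NUL byte, as above. */
static const char *strtab_get(const char *tab, size_t tab_size, size_t off) {
    assert(tab != NULL);
    if (off >= tab_size) {
        return NULL;
    }
    return tab + off;
}
```

Offset 0 yields "rename" and offset 7 yields "open". Offset 2 yields "name" with no extra storage, because it is a suffix of "rename" and reuses the same terminator.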
Nothing prevents you from using a shared pool of strings that don't have a null terminator. It can even be more efficient, since you don't have the null byte to handle at the end of each string. Depending on the maximum string length you want to support, it doesn't even have to take more space.
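For comparison, a sketch of such a pool in C, where each reference carries an offset and a length. The pool contents, the pool_ref type and pool_eq are all hypothetical, and the 32-bit field widths are just one possible choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative pool: bytes packed back to back, no terminators between them.
   (The array still gets C's implicit final NUL, but nothing relies on it.) */
static const char pool[] = "renameopen";

/* Hypothetical reference into the pool. */
typedef struct {
    uint32_t off;  /* start of the string within the pool */
    uint32_t len;  /* number of bytes */
} pool_ref;

/* Compare a pooled string with an explicit byte range of length n. */
static int pool_eq(const char *p, pool_ref r, const char *s, size_t n) {
    assert(p != NULL && s != NULL);
    return r.len == n && memcmp(p + r.off, s, n) == 0;
}
```

Here {0, 6} is "rename", {2, 4} is "name", and {0, 3} is "ren". The last one is a shared prefix, which a terminator-based table cannot express. The trade-off is that each reference is larger unless the length is packed into spare bits of the offset.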
When using null terminated strings, parsing can be branchless because you don't need bounds checks and can use a jump table indexed by the byte.
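A small C sketch of the idea, under the assumption that the input is a valid NUL-terminated string. It uses a 256-entry classification table instead of a full jump table, and parse_uint is an invented name. The loop still has one branch on the table result, but no separate bounds check: the terminator is simply a byte whose table entry is 0.

```c
#include <assert.h>
#include <stddef.h>

/* Byte classification table: 1 for ASCII digits, 0 for everything else,
   including the NUL terminator (entries not listed default to 0). */
static const unsigned char is_digit[256] = {
    ['0'] = 1, ['1'] = 1, ['2'] = 1, ['3'] = 1, ['4'] = 1,
    ['5'] = 1, ['6'] = 1, ['7'] = 1, ['8'] = 1, ['9'] = 1,
};

/* Parse a leading unsigned decimal number. The loop never compares against
   an end pointer; it stops at the first non-digit, and NUL is a non-digit.
   Overflow is not handled in this sketch. */
static unsigned parse_uint(const char *p, const char **end) {
    assert(p != NULL && end != NULL);
    unsigned v = 0;
    while (is_digit[(unsigned char)*p]) {
        v = v * 10u + (unsigned)(*p - '0');
        p++;
    }
    *end = p;
    return v;
}
```

With a length-carrying string the same loop needs an extra `p < limit` comparison per byte, or a sentinel byte appended to the buffer by hand.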
Hearing someone mention FreeBASIC really brings me back. It was the first language I ever used pointers in.