Comment by zzo38computer

2 days ago

I agree with most of the criticisms they make.

I agree that pointer and length is better than null-terminated strings, although it is awkward in C: as they mention, you will need a macro (or some additional functions) to make it work.
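
To make the macro idea concrete, here is a minimal sketch of a pointer-and-length string in C. The names `str`, `STR`, `str_eq` and `str_sub` are hypothetical and chosen for illustration; they do not come from the article or from any standard library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical slice type: a pointer plus an explicit length, no terminator needed. */
typedef struct {
    const char *ptr;
    size_t len;
} str;

/* Build a str from a string literal. sizeof counts the trailing NUL, so subtract 1.
   The "" concatenation makes the macro fail to compile for non-literal arguments. */
#define STR(lit) ((str){ "" lit, sizeof(lit) - 1 })

/* Compare lengths first, then bytes. Embedded NUL bytes are handled like any other byte. */
static int str_eq(str a, str b) {
    return a.len == b.len && (a.len == 0 || memcmp(a.ptr, b.ptr, a.len) == 0);
}

/* Take a sub-slice without copying and without writing a terminator. */
static str str_sub(str s, size_t start, size_t len) {
    assert(start <= s.len && len <= s.len - start);
    return (str){ s.ptr + start, len };
}
```

With this, `str_sub(STR("hello world"), 6, 5)` refers to `world` inside the original literal, which a null-terminated string could only do for a suffix.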

Implementing the C standard library directly against syscalls is also a good idea. In some cases an implementation might need to avoid this for some reason, but generally it is better for the standard library to sit directly on syscalls.

The FILE object is sometimes useful, especially if you have functions such as fopencookie and open_memstream. It might also be useful (although probably not in C) to be able to optimize parts of a program that only use a single implementation of the FILE interface, or only a subset of its functions (e.g. one that does not use seeking).
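
As an illustration of why the FILE interface is handy with these functions, here is a small sketch using POSIX `open_memstream` to build a string with ordinary `fprintf` calls. The helper name `format_point` is made up for this example, and the code assumes a POSIX.1-2008 system.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Format into a heap buffer through the ordinary FILE interface.
   Returns a malloc'd, NUL-terminated string (caller frees) or NULL on error.
   *out_len receives the byte count, excluding the terminator. */
static char *format_point(int x, int y, size_t *out_len) {
    assert(out_len != NULL);
    char *buf = NULL;
    size_t len = 0;
    FILE *f = open_memstream(&buf, &len);
    if (f == NULL)
        return NULL;
    fprintf(f, "(%d, ", x);
    fprintf(f, "%d)", y);
    if (fclose(f) != 0) {   /* fclose flushes and finalizes buf and len */
        free(buf);
        return NULL;
    }
    *out_len = len;
    return buf;
}
```

Any code that already writes to a `FILE *` can be pointed at memory this way without being rewritten.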

Making every C call a system call is not a good idea at all. Think about malloc() etc.: the OS shouldn’t care about individual allocations and should only worry about providing brk() etc. Otherwise, performance will die if you’re doing a thousand system calls per second!

  • No modern libc uses (or should use) brk() as the heap. Allocate virtual memory using mmap, VirtualAlloc, etc., and manage your set of heaps.
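
    A rough sketch of the mmap approach described in this reply, assuming a POSIX system with `MAP_ANONYMOUS`. The `arena` type and its functions are hypothetical names for illustration, and a real allocator would add free lists, growth and thread safety.

    ```c
    #define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc */
    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <sys/mman.h>

    /* Hypothetical bump arena backed by one anonymous mapping. */
    typedef struct {
        unsigned char *base;
        size_t cap;
        size_t used;
    } arena;

    /* One system call reserves the whole region. Returns 0 on success, -1 on failure. */
    static int arena_init(arena *a, size_t cap) {
        assert(a != NULL && cap > 0);
        void *p = mmap(NULL, cap, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            return -1;
        a->base = p;
        a->cap = cap;
        a->used = 0;
        return 0;
    }

    /* Allocation is plain user-space arithmetic: no system call per object. */
    static void *arena_alloc(arena *a, size_t n) {
        size_t start = (a->used + 15) & ~(size_t)15;   /* keep 16-byte alignment */
        if (start > a->cap || n > a->cap - start)
            return NULL;
        a->used = start + n;
        return a->base + start;
    }

    /* Return the whole region to the OS with one more system call. */
    static void arena_destroy(arena *a) {
        munmap(a->base, a->cap);
        a->base = NULL;
        a->cap = a->used = 0;
    }
    ```

    The point for the parent comment is the same either way: the kernel hands out large regions, and the per-allocation bookkeeping stays in the library.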

Null-terminated strings have some merits, but they should be a completely different data type, like in FreeBASIC.

  • Are there merits other than the availability of literals in C?

    It seems like one of the worst data structures ever: the lookup complexity of a linked list and the expansion complexity of an array list, with security problems added as a bonus.

    • One I can think of is simplicity. No need to worry about what the type of the length should be (size_t?) or where it should be stored. Just pass around a pointer. Pointers fit the size of a CPU register most of the time. In my opinion the drawbacks (O(N) performance, NUL forbidden, etc.) outweigh this benefit, but we are stuck: many kernel interfaces like open, getdents etc. assume NUL-terminated strings, so any low-level language or library has to support them.

    • It's fine as a serialization/deserialization primitive for on-disk files, as long as the NUL character is invalid in the data.

      String tables in most object file formats work like that: a concatenated series of ASCIIZ strings. There is one byte of overhead per string (the NUL), a string is addressed by a single offset into the table, and strings with common suffixes can share storage. It's a very compact layout.
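
      A small sketch of such a table, loosely modeled on the ELF string table layout. The table contents and the helper names `strtab_get` and `strtab_find` are made up for this example, and the code assumes the table ends with a NUL byte.

      ```c
      #include <assert.h>
      #include <stddef.h>
      #include <string.h>

      /* Hypothetical table: offset 0 is the empty string, "fprintf" starts at 1,
         "main" starts at 9. "printf" needs no entry of its own: it is offset 2. */
      static const char strtab[] = "\0fprintf\0main";

      /* A string is addressed by a single offset into the table. */
      static const char *strtab_get(const char *tab, size_t tab_size, size_t off) {
          assert(off < tab_size);
          return tab + off;
      }

      /* Find an offset for name, reusing the tail of a longer entry when possible.
         Returns (size_t)-1 if no entry ends with name. */
      static size_t strtab_find(const char *tab, size_t tab_size, const char *name) {
          size_t n = strlen(name);
          for (size_t off = 0; off < tab_size; ) {
              size_t len = strlen(tab + off);
              if (len >= n && memcmp(tab + off + len - n, name, n) == 0)
                  return off + len - n;
              off += len + 1;   /* skip the entry and its terminator */
          }
          return (size_t)-1;
      }
      ```

      Suffix sharing only works because the terminator sits at the end: a length-prefixed entry could not start in the middle of another one.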

    • When using null-terminated strings, parsing can be branchless because you don't need bounds checks and can use a jump table indexed by the byte.
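
      A simplified sketch of the sentinel idea, using a 256-entry class table indexed by the byte instead of a real jump table. The loops here still branch on the class, so this only shows how the terminator removes the separate bounds check. The class names and `parse_uint` are hypothetical, and numeric overflow is ignored.

      ```c
      #include <assert.h>
      #include <stddef.h>

      enum { C_OTHER, C_SPACE, C_DIGIT, C_END };   /* hypothetical byte classes */

      static unsigned char cls[256];   /* zero-initialized: every byte starts as C_OTHER */

      static void cls_init(void) {
          cls[' '] = cls['\t'] = cls['\n'] = C_SPACE;
          for (int c = '0'; c <= '9'; c++)
              cls[c] = C_DIGIT;
          cls[0] = C_END;   /* the terminator is just another class */
      }

      /* Skip leading blanks, then read an unsigned decimal number.
         Neither loop compares against a length: the NUL byte is never
         C_SPACE or C_DIGIT, so it stops both loops by itself.
         Returns a pointer to the first byte that was not consumed. */
      static const char *parse_uint(const char *p, unsigned *out) {
          assert(cls[0] == C_END);   /* cls_init must have been called */
          unsigned v = 0;
          while (cls[(unsigned char)*p] == C_SPACE)
              p++;
          while (cls[(unsigned char)*p] == C_DIGIT)
              v = v * 10 + (unsigned)(*p++ - '0');
          *out = v;
          return p;
      }
      ```

      With a pointer-and-length string, each loop condition would also need a check that `p` has not reached the end.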

  • Hearing someone mention FreeBASIC really brings me back. It was the first language I ever used pointers in.