Comment by chrishill89
1 year ago
I recently wanted to split and iterate on a String (`char *`) in the Git project. I was not feeling good that it had come to this point since I'm not a C programmer. ChatGPT told me that I wanted `strtok`. So I tried that but then the compiler complained that it was banned (via a macro).[1]
Looking at the commit log I found out that `string_list.h` in the project was something that I could use. There are functions and initializers for both duplicating and not duplicating the strings. So I chose `STRING_LIST_INIT_DUP` and used `string_list_split` on `\n`.[2] Then I could use a regular for-loop since the struct has `nr` which is the count. And then it just worked.
I was pleasantly surprised that there was such a nice library for "list of strings" in C.[3] And the price you pay compared to more modern languages is an extra allocation (or one per sub-string?). And although I have only used it for that one thing I didn't run into any footguns.
Languages designed way after C (like in this century) should keep in mind what they want to give the user the ability to do: just make simple immutable types (like `String` in Java, though of course there's `StringBuilder`), give immutable/mutable pairs, and whether they also want to give access to no-allocation slices (like Rust). And if they go all the way with this they really need to give a large enough API surface so that all of these different concerns don't get mixed up. And so that users don't have to think about the underlying arrays all the time in order to not do something unintentional.
[1] Apparently `strtok` is not nice to use because it can change the string you pass to it or something. I didn't read a lot about it.
[2] The documentation tells you when you should be using dup/no-dup. And it makes sense to duplicate/allocate in this case, I think. Intuitively to me it seems that I'm paying for an allocation in order to not have to mess up per-char iteration at all. And that was fine for my use of it.
[3] And that the macro stopped me from going down the wrong path with `strtok`!
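To make the "duplicate, split, then loop up to `nr`" idea concrete, here is a minimal self-contained sketch. It is not Git's `string_list` API: `struct str_list`, `str_list_append_dup`, `str_list_split_dup` and `str_list_clear` are hypothetical names made up for illustration, and the only thing borrowed from the description above is a count field called `nr`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a "list of strings"; NOT Git's struct string_list. */
struct str_list {
	char **items;  /* each entry is a separately allocated copy */
	size_t nr;     /* number of entries in use */
	size_t alloc;  /* capacity of items */
};

/* Append a copy of the len bytes at s. Returns 0 on success, -1 if allocation fails. */
static int str_list_append_dup(struct str_list *list, const char *s, size_t len)
{
	char *copy;

	if (list->nr == list->alloc) {
		size_t new_alloc = list->alloc ? list->alloc * 2 : 8;
		char **grown = realloc(list->items, new_alloc * sizeof(*grown));
		if (!grown)
			return -1;
		list->items = grown;
		list->alloc = new_alloc;
	}
	copy = malloc(len + 1);
	if (!copy)
		return -1;
	memcpy(copy, s, len);
	copy[len] = '\0';
	list->items[list->nr++] = copy;
	return 0;
}

/* Split s on delim without modifying s; every piece is copied into list. */
static int str_list_split_dup(struct str_list *list, const char *s, char delim)
{
	assert(list != NULL && s != NULL && delim != '\0');
	for (;;) {
		const char *end = strchr(s, delim);
		size_t len = end ? (size_t)(end - s) : strlen(s);

		if (str_list_append_dup(list, s, len) < 0)
			return -1;
		if (!end)
			return 0;
		s = end + 1;  /* continue after the delimiter */
	}
}

/* Free every copied piece and the array itself, leaving an empty list. */
static void str_list_clear(struct str_list *list)
{
	for (size_t i = 0; i < list->nr; i++)
		free(list->items[i]);
	free(list->items);
	list->items = NULL;
	list->nr = list->alloc = 0;
}
```

With a zero-initialized `struct str_list list = {0}`, a caller would run `str_list_split_dup(&list, text, '\n')`, loop with `for (size_t i = 0; i < list.nr; i++)` over `list.items[i]`, and finish with `str_list_clear(&list)`. In this sketch the cost is one allocation per sub-string plus the growing array, and in exchange the input is never modified.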
I haven’t used strtok in a long time but my recollection is that it mutates the original string by writing a NUL byte over the next delimiter, so “hello world” would become “hello<0x00>world” if splitting on spaces. You pass the string on the first call and then keep calling strtok with NULL in its place until it returns NULL.
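A small sketch of that behavior using only standard C. The helper name `count_tokens_destructively` is made up for illustration, and the buffer has to be a writable array, not a string literal:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Tokenize buf in place on spaces and return the number of tokens found.
 * buf must be writable: strtok overwrites each delimiter that ends a
 * token with '\0'.
 */
static size_t count_tokens_destructively(char *buf)
{
	size_t n = 0;

	assert(buf != NULL);
	/* The first call passes the buffer; later calls pass NULL to continue. */
	for (char *tok = strtok(buf, " "); tok; tok = strtok(NULL, " "))
		n++;
	return n;
}
```

Calling this on `char buf[] = "hello world";` leaves a `'\0'` at `buf[5]`, so printing `buf` afterwards shows only `hello`. Since strtok keeps its position in hidden internal state, two interleaved tokenizing loops would also interfere with each other.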
It’s ugly. Would not recommend. 0/10