Comment by muvlon

3 hours ago

Sure, you need a tiny bit of asm to do the actual syscall. That's not what I'm talking about. Most syscalls are easy to wrap, clone is slightly harder but doable (as evidenced by glibc). clone3 is for all intents and purposes impossible to write a general C wrapper for. It allows you to create situations such as threads that share virtual memory but not file descriptors, or vice-versa. That is, it can leave the caller in a situation that violates core assumptions by libc.

You're mixing things up. C the language doesn't know about virtual memory or file descriptors. Those are OS features.

  • The C library maintains its own set of file handles (the stdio FILE streams), which are mapped to the OS file descriptors (because the stdio streams and the OS file descriptors have different types and different behaviors).

    I do not know whether this is true, but perhaps the previous poster means that using clone3 with certain arguments may break this file descriptor mapping, so that calling stdio functions afterwards may have unexpected results.

    Also the state kept by the libc malloc may get confused after certain invocations of clone3, because it has memory pages that have been obtained through mmap or sbrk and which may sometimes be returned to the OS.

    So libc certainly cares about the OS file descriptors and virtual memory mappings, because it maintains its own internal state, which holds references to the corresponding OS state. I have not checked which clone3 invocations can leave that state inconsistent, but it is plausible that such cases exist. That would explain why glibc allows calling clone3 only with a restricted combination of arguments and does not provide a wrapper that would allow other combinations.
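To make the "libc keeps its own state on top of OS descriptors" point concrete, here is a minimal sketch in C. It does not call clone3 at all; it only shows that a stdio stream holds data in a user-space buffer that the OS descriptor behind it has not seen yet. The helper name stdio_vs_fd is hypothetical, and the sketch assumes the message is shorter than the stream's buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper (not part of any libc): writes msg to a temporary
 * stdio stream and reports the size of the underlying file, as seen through
 * the OS file descriptor, before and after fflush().
 * Returns 0 on success, -1 on error. */
int stdio_vs_fd(const char *msg, long *size_before, long *size_after)
{
    assert(msg && size_before && size_after);

    FILE *fp = tmpfile();          /* stdio stream backed by a regular file */
    if (fp == NULL)
        return -1;

    int fd = fileno(fp);           /* the OS descriptor behind the stream */
    struct stat st;
    int rc = -1;

    if (fputs(msg, fp) == EOF)     /* data goes into libc's user-space buffer */
        goto out;
    if (fstat(fd, &st) != 0)
        goto out;
    *size_before = (long)st.st_size;

    if (fflush(fp) != 0)           /* libc now writes the buffer to fd */
        goto out;
    if (fstat(fd, &st) != 0)
        goto out;
    *size_after = (long)st.st_size;
    rc = 0;
out:
    fclose(fp);
    return rc;
}
```

Calling stdio_vs_fd("hello", &before, &after) would typically report a size of 0 before the flush and 5 after it. That buffer lives in the process's memory, so whether two tasks share memory but not descriptor tables (or the reverse) determines whether they agree on what the stream and the descriptor mean.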

    • Yes; this is why QEMU's user-space-emulation clone syscall handling restricts the caller to only those combinations of clone flags which match either "looks like fork()" or "looks like creating a new pthread", because QEMU itself is linked with the host libc and weird clone flag combinations will put the new process/thread into a state the libc isn't expecting.
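The kind of check described above can be sketched as follows. This is not QEMU's code: the function classify_clone_flags, the enum, and the two masks are hypothetical names chosen for illustration, and the sketch assumes the exit signal has already been separated from the flags (as it is in clone3's argument struct). Only the CLONE_* constants come from the Linux headers.

```c
#include <assert.h>
#include <stdint.h>
#include <linux/sched.h>   /* CLONE_* flag constants */

/* Flags that decide what the new task shares with its creator. */
#define SHARING_FLAGS  (CLONE_VM | CLONE_FS | CLONE_FILES | \
                        CLONE_SIGHAND | CLONE_THREAD | CLONE_SYSVSEM)
/* Flags that only affect TLS/TID bookkeeping; allowed in both cases here. */
#define OPTIONAL_FLAGS (CLONE_SETTLS | CLONE_PARENT_SETTID | \
                        CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID)

enum clone_kind { CLONE_KIND_FORK, CLONE_KIND_THREAD, CLONE_KIND_UNSUPPORTED };

/* Accept only "shares everything" (thread-like) or "shares nothing"
 * (fork-like); reject every partial-sharing or unknown combination. */
enum clone_kind classify_clone_flags(uint64_t flags)
{
    assert((SHARING_FLAGS & OPTIONAL_FLAGS) == 0);  /* masks must not overlap */

    uint64_t sharing = flags & (uint64_t)SHARING_FLAGS;
    uint64_t other = flags & ~(uint64_t)(SHARING_FLAGS | OPTIONAL_FLAGS);

    if (other != 0)
        return CLONE_KIND_UNSUPPORTED;   /* namespaces, vfork, etc. */
    if (sharing == (uint64_t)SHARING_FLAGS)
        return CLONE_KIND_THREAD;        /* looks like creating a pthread */
    if (sharing == 0)
        return CLONE_KIND_FORK;          /* looks like fork() */
    return CLONE_KIND_UNSUPPORTED;       /* e.g. CLONE_VM without CLONE_FILES */
}
```

With this sketch, a request such as CLONE_VM alone (shared memory, separate descriptor table) is classified as unsupported, which is exactly the "shares virtual memory but not file descriptors" situation raised at the top of the thread. A real emulator may accept more combinations than this, for example vfork-style flags.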