Comment by oguz-ismail

5 days ago

system() involves fork()ing, setting up signal handlers, exec()ing and wait()ing. You won't be replacing it with exec; most of the time you'll be reimplementing it for absolutely no reason.
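To make that list concrete, here is a rough Python sketch of the steps, assuming a POSIX system with /bin/sh. The name system_sketch is made up for illustration, and this is not how glibc implements system(); it only mirrors the fork/signal/exec/wait sequence described above.

```python
import os
import signal

def system_sketch(command):
    """Roughly the steps system(3) performs; returns the shell's exit code."""
    # The caller ignores SIGINT and SIGQUIT and blocks SIGCHLD while it waits.
    # (signal.signal only works when called from the main thread.)
    old_int = signal.signal(signal.SIGINT, signal.SIG_IGN)
    old_quit = signal.signal(signal.SIGQUIT, signal.SIG_IGN)
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGCHLD})
    saved = ((signal.SIGINT, old_int), (signal.SIGQUIT, old_quit))
    try:
        pid = os.fork()
        if pid == 0:
            # Child: undo the parent's signal setup, then become the shell.
            try:
                for signum, old in saved:
                    was_ignored = old == signal.SIG_IGN
                    signal.signal(signum, signal.SIG_IGN if was_ignored else signal.SIG_DFL)
                signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
                os.execv("/bin/sh", ["sh", "-c", command])
            finally:
                os._exit(127)  # only reached if something above failed
        _, status = os.waitpid(pid, 0)
    finally:
        # Parent: put the signal mask and handlers back the way they were.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
        for signum, old in saved:
            if old is not None:
                signal.signal(signum, old)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)
```

Something like system_sketch("exit 3") should hand back the shell's exit code. Every line of it is bookkeeping that system() already does for you.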

Python has os.spawnl, os.spawnv, etc., which fork()s, wait4()s, etc., without involving a shell. This is much better; this is the library function you should be using instead of system() most of the time. Unfortunately I don't think glibc has an equivalent!

    strace -o tmp.spawnlp -ff python3 -c 'import os; os.spawnlp(os.P_WAIT, "true", "true")' 

In parent:

    clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fdc03233310) = 225954
    wait4(225954, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 225954
    --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=225954, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---

In child:

    set_robust_list(0x7fdc03233320, 24)     = 0
    gettid()                                = 225954
    clock_gettime(CLOCK_MONOTONIC, {tv_sec=2458614, tv_nsec=322829153}) = 0
    clock_gettime(CLOCK_MONOTONIC, {tv_sec=2458614, tv_nsec=323030718}) = 0
    execve("/usr/local/bin/true", ["true"], 0x7ffdc5008458 /* 44 vars */) = -1 ENOENT (No such file or directory)
    execve("/usr/bin/true", ["true"], 0x7ffdc5008458 /* 44 vars */) = 0

Here, I think strace shows clone() rather than fork() because glibc's fork() is a library function that invokes clone(), rather than a real system call.
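A small sketch of why skipping the shell matters, assuming a Unix system (the os.spawn* functions with a "p" in the name are not available on Windows). The helper name run_without_shell and the child program are made up for the example. The argument containing a space and a semicolon reaches the child as a single argv entry because nothing ever parses it as shell syntax.

```python
import os
import sys

def run_without_shell(argv):
    """Fork, exec argv[0] (looked up via PATH) and wait; return the exit code."""
    return os.spawnvp(os.P_WAIT, argv[0], argv)

# Hypothetical child: exits with the length of its first argument.
CHILD = "import sys; sys.exit(len(sys.argv[1]))"

# "a b; c" is passed through untouched: no word splitting, no second command.
status = run_without_shell([sys.executable, "-c", CHILD, "a b; c"])
```

With os.system() you would have to quote that argument correctly yourself; here there is nothing to quote.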

  • > Python has os.spawnl, os.spawnv, etc., which fork()s, wait4()s, etc., without involving a shell.

    Good. How do you pipeline commands with these?

    • These functions can't do it. In Python you have to use the subprocess module if you want to pipeline commands without the bugs introduced by the shell. From https://docs.python.org/3.7/library/subprocess.html#replacin...:

          p1 = Popen(["dmesg"], stdout=PIPE)
          p2 = Popen(["grep", "hda"], stdin=p1.stdout, stdout=PIPE)
          p1.stdout.close()  # Allow p1 to receive a SIGPIPE if p2 exits.
          output = p2.communicate()[0]
      

      Of course, now, nobody has an hda, and dmesg is root-only on many systems. A more modern example is in http://canonical.org/~kragen/sw/dev3/whereroot.py:

          p1 = subprocess.Popen(["df"], stdout=subprocess.PIPE)
          p2 = subprocess.Popen(["grep", "/$"], stdin=p1.stdout, stdout=subprocess.PIPE)
          p1.stdout.close()
          return p2.communicate()[0]
      

      Note that the result here is a byte string, so if you want to print it out safely without the shell-like bugginess induced by Python's default character handling (what happens if the device name isn't valid UTF-8?), you have to do backflips with sys.stdout.buffer or UTF-8B.
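A minimal sketch of the two backflips, with made-up helper names and a made-up sample line. I'm assuming here that Python's "surrogateescape" error handler (PEP 383) is the UTF-8B scheme meant above; as far as I know it is, but treat that as my reading.

```python
import io
import sys

def to_text_lossless(raw):
    """Decode bytes so that invalid UTF-8 survives as lone surrogates."""
    return raw.decode("utf-8", errors="surrogateescape")

def to_bytes_lossless(text):
    """Inverse of to_text_lossless: recover the exact original bytes."""
    return text.encode("utf-8", errors="surrogateescape")

def write_raw(raw, stream=None):
    """Write bytes straight to a binary stream, bypassing text encoding."""
    if stream is None:
        stream = sys.stdout.buffer
    stream.write(raw)
    stream.flush()

# Hypothetical df line whose device name contains a byte that is not UTF-8.
sample = b"/dev/sd\xff1 on /\n"
round_tripped = to_bytes_lossless(to_text_lossless(sample))

sink = io.BytesIO()  # stands in for sys.stdout.buffer in this example
write_raw(sample, sink)
```

In a real script you would call write_raw(output) and let it default to sys.stdout.buffer. The decoded string is fine for slicing and matching, but print() on it can still fail if stdout uses a strict encoder, which is why the bytes path exists.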

      Python got a lot of things wrong, and it gets worse all the time, but for now spawning subprocesses is one of the things it got right. Although, unlike IIRC Tcl, it doesn't raise an exception by default if one of the commands fails.
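If you want the exception, you can bolt it on yourself. This is only a sketch: checked_pipeline is a made-up name, and treating every nonzero status as an error is my assumption about what "fails" should mean.

```python
import subprocess

def checked_pipeline(first, second):
    """Run `first | second` without a shell; raise if either command fails."""
    p1 = subprocess.Popen(first, stdout=subprocess.PIPE)
    p2 = subprocess.Popen(second, stdin=p1.stdout, stdout=subprocess.PIPE)
    p1.stdout.close()  # Allow p1 to receive a SIGPIPE if p2 exits.
    output = p2.communicate()[0]
    p1.wait()  # communicate() only reaps p2, so collect p1's status too
    for proc in (p1, p2):
        if proc.returncode != 0:
            raise subprocess.CalledProcessError(proc.returncode, proc.args, output)
    return output
```

Two caveats: grep exits with status 1 when nothing matches, so checked_pipeline(["df"], ["grep", "/$"]) would raise on a machine where no line matches, and a first command killed by SIGPIPE shows up with a negative returncode and is also treated as a failure here.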

      Apart from the semantics of the operations, you could of course desire a better notation for them. In Python you could maybe achieve something like

          (cmd(["df"]) | ["grep", "/$"]).output()
      

      but that is secondary to being able to safely handle arguments containing spaces and pipes and whatnot.
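For what it's worth, that notation takes very little code. A rough sketch, where cmd is the hypothetical class from the line above and not anything in the standard library; it ignores exit statuses and stderr entirely.

```python
import subprocess

class cmd:
    """A pipeline of argv lists; `|` appends a stage, output() runs it."""

    def __init__(self, argv):
        self.stages = [list(argv)]

    def __or__(self, other):
        # Accept either another cmd or a plain argv list on the right.
        more = other.stages if isinstance(other, cmd) else [list(other)]
        combined = cmd(self.stages[0])
        combined.stages = self.stages + more
        return combined

    def output(self):
        """Run every stage without a shell and return the last stage's stdout."""
        procs = []
        for argv in self.stages:
            stdin = procs[-1].stdout if procs else None
            procs.append(subprocess.Popen(argv, stdin=stdin, stdout=subprocess.PIPE))
            if stdin is not None:
                stdin.close()  # the child holds its own copy of the pipe end
        out = procs[-1].communicate()[0]
        for proc in procs[:-1]:
            proc.wait()
        return out
```

With that, (cmd(["df"]) | ["grep", "/$"]).output() runs the same two processes as the whereroot.py snippet and returns the same byte string.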


There is posix_spawn(). Some operating systems (not Linux) even implement it as a system call. That has the advantage that spawning a new process from a process with huge memory mappings is fast, because the mappings don't need to be copied (yes, I know the memory is copy-on-write, but the mappings themselves still have to be copied, along with the information needed for copy-on-write).
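Python exposes this too, as os.posix_spawn (Python 3.8 and later, on platforms that provide it). A minimal sketch, with posix_spawn_and_wait as a made-up helper name:

```python
import os
import sys

def posix_spawn_and_wait(path, argv):
    """Start path with argv via posix_spawn(), wait, and return the exit code."""
    pid = os.posix_spawn(path, argv, os.environ)
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)

# No shell and no explicit fork() in our own code: just spawn and wait.
code = posix_spawn_and_wait(sys.executable, [sys.executable, "-c", "import sys; sys.exit(3)"])
```

Unlike os.spawnvp, os.posix_spawn does not search PATH, so it needs a full path; os.posix_spawnp is the variant that does the lookup.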