Comment by amelius

19 hours ago

Yes. There are many reasons why one shouldn't use sh/bash for scripting.

But my main reason is that most scripts break when you call them with filenames that contain spaces. And they break spectacularly.

The counterargument in its favor is that you can always count on it being there and working the same way. Perl is too out of fashion and Python has too many versioning/library complexities.

You have to write the crappy sh script once but then you get simple, easy usage every time. (If you're revising the script frequently enough that sh/bash are the bottleneck, then what you have is a dev project and not a script; use a programming language.)

You're not wrong, but there are fairly easy ways to deal with filenames containing spaces: usually just enclosing any variable use within double quotes will be sufficient. It's trickier to deal with filenames that contain things such as line breaks, as that usually involves using NUL-terminated filenames (NUL being the only byte that cannot appear in a path), e.g. find . -type f -print0
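For the NUL-terminated approach, here is a minimal Python sketch of the consuming side. The function name `split_nul_terminated` is made up for illustration; it assumes the input is the raw byte output of something like `find . -type f -print0`.

```python
import os
from typing import List


def split_nul_terminated(data: bytes) -> List[str]:
    """Split NUL-terminated names (as produced by `find -print0`) into strings."""
    # Every name ends with b"\0", so splitting leaves one empty piece after
    # the final terminator; the `if chunk` filter drops it.
    return [os.fsdecode(chunk) for chunk in data.split(b"\0") if chunk]
```

Spaces and line breaks inside a name come through untouched, because only the NUL byte is treated as a separator.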

  • You're not wrong, but at my place, our main repository does not permit cloning into a directory with spaces in it.

    Three factors conspire to make a bug:

      1. Someone decides to use a space
      2. We use Python
      3. macOS
    

    Say you clone into a directory with a space in it. We use Python, so our scripts are scripts in the Unix sense. (So, Python here is replaceable with any scripting language that uses a shebang, so long as the rest of what comes after holds.) Some of our Python dependencies install executables; those necessarily start with a shebang:

      #!/usr/bin/env python3
    

    Note that space.

    Since we use Python virtualenvs, the shebang points at the venv's interpreter instead:

      #!/home/bob/src/repo/.venv/bin/python3
    

    But … now what if the dir has a space?

      #!/home/bob/src/repo with a space/.venv/bin/python3
    

    Those look like arguments, now, to a shebang. Shebangs have no escaping mechanism.
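To make the failure concrete, here is a rough Python sketch of how a Linux kernel reads a shebang line: the interpreter path ends at the first space or tab, and everything after it is passed as one single argument. The function name `parse_shebang` is hypothetical, and other systems (macOS among them) may split the remainder differently.

```python
from typing import Optional, Tuple


def parse_shebang(line: str) -> Tuple[str, Optional[str]]:
    """Approximate Linux shebang parsing: (interpreter, single optional argument)."""
    if not line.startswith("#!"):
        raise ValueError("not a shebang line")
    # No quoting or escaping is recognized; whitespace simply ends the path.
    parts = line[2:].strip(" \t\r\n").split(None, 1)
    if not parts:
        raise ValueError("shebang has no interpreter")
    interpreter = parts[0]
    argument = parts[1] if len(parts) > 1 else None
    return interpreter, argument
```

Fed the venv shebang above, this yields `/home/bob/src/repo` as the interpreter, which is why the script cannot start.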

    As I also found out while debugging this, the Python tooling checks for this! It will instead emit a polyglot!

      #!/bin/bash
    
      # <what follows in a bash/python polyglot>
      # the bash will find the right Python interpreter, and then re-exec this
      # script using that interpreter. The Python will skip the bash portion,
      # b/c of cleverness in the polyglot.
    

    Which is really quite clever, IMO. But, … it hits (3.). It execs bash, and worse, it is macOS's bash, and macOS's bash will corrupt^W remove for your safety! certain environment variables from the environment.

    Took me forever to figure out what was going on. So yeah … spaces in paths. Can't recommend them. Stuff breaks, and it breaks in weird and hard to debug ways.

    • If all of your scripts run in the same venv (for a given user), can you inject that into the PATH and rely on env just finding the right interpreter?

      I suppose it would also need env to be able to handle paths that have spaces in them.

    • What a headache!

      My practical view is to avoid spaces in directories and filenames, but to write scripts that handle them just fine (using BASH - I'm guilty of using it when more sane people would be using a proper language).

      My ideological view is that unix/POSIX filenames are allowed to use any character except for NUL (and / within a single name), so tools should respect that and handle files/dirs correctly.

      I suppose for your usage, it'd be better to put the virtualenv directory into your path and then use #!/usr/bin/env python
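A small sketch of why the PATH route sidesteps the problem: PATH entries are separated by `:` (`os.pathsep`), not by spaces, so a directory with spaces is looked up like any other. The helper names `resolve_like_env` and `demo` are invented for this example, and the fake `python3` is only a stand-in executable created in a temporary directory.

```python
import os
import shutil
import tempfile
from typing import List, Optional


def resolve_like_env(name: str, path_dirs: List[str]) -> Optional[str]:
    """Look up `name` on a PATH-style list of directories, roughly as env would."""
    return shutil.which(name, path=os.pathsep.join(path_dirs))


def demo() -> bool:
    """Create a venv-like bin directory under a path with spaces and resolve in it."""
    with tempfile.TemporaryDirectory(prefix="repo with a space ") as root:
        bin_dir = os.path.join(root, ".venv", "bin")
        os.makedirs(bin_dir)
        fake = os.path.join(bin_dir, "python3")
        with open(fake, "w") as fh:
            fh.write("#!/bin/sh\nexit 0\n")
        os.chmod(fake, 0o755)  # must be executable to be found
        return resolve_like_env("python3", [bin_dir]) == fake
```

The shebang itself stays `#!/usr/bin/env python3`, which contains no problematic path at all; the space only ever appears inside PATH.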

    • These are part of the rituals of learning how a system works, in the same way interns get tripped up at first when they discover ^S will hang an xterm, until ^Q frees it. If you're aware of the history of it, it makes perfect sense. Unix has a personality, and in this case the kernel needs to decide what executable to run before any shell is involved, so it deliberately avoids the complexity of quoting rules.

      I'd give this a try, works with any language:

        #!/usr/bin/env -S "/path/with spaces/my interpreter" --flag1 --flag2
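As a rough illustration of what `-S` buys you: on Linux, env receives everything after its own path as one argument, and `-S` tells it to split that string itself while honoring quotes. The sketch below approximates that step with Python's `shlex`; env has its own quoting and escape rules, so treat this as an approximation, and note that `env_split_string` is a made-up name.

```python
import shlex
from typing import List


def env_split_string(shebang_line: str) -> List[str]:
    """Approximate the argv that `env -S` would build from a shebang line."""
    if not shebang_line.startswith("#!"):
        raise ValueError("not a shebang line")
    # The kernel splits off the interpreter path; the rest arrives as ONE argument.
    _interp, _, single_arg = shebang_line[2:].strip().partition(" ")
    single_arg = single_arg.strip()
    if not single_arg.startswith("-S"):
        raise ValueError("shebang does not use env -S")
    # shlex only approximates env's own splitting rules.
    return shlex.split(single_arg[2:])
```

The quoted interpreter path survives as a single element, followed by each flag as its own element.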
      

      If my env didn't have -S support, I might consider a separate launch script like:

        #!/bin/sh
        exec "/path/with spaces/my interpreter" "$0" "$@"
      

      But most decent languages seem to have some way around the issue.

      Python

        #!/bin/sh
        """:"
        exec "/path/with spaces/my interpreter" "$0" "$@"
        ":"""
        # Python starts here
        print("ok")
      

      Ruby

        #!/bin/sh
        exec "/path/with spaces/ruby" -x "$0" "$@"
        #!ruby
        puts "ok"
      

      Node.js

        #!/bin/sh
        /* 2>/dev/null
        exec "/path/with spaces/node" "$0" "$@"
        */
        console.log("ok");
      

      Perl

        #!/bin/sh
        exec "/path/with spaces/perl" -x "$0" "$@"
        #!perl
        print "ok\n";
      

      Common Lisp (SBCL) / Scheme (e.g. Guile)

        #!/bin/sh
        #|
        exec "/path/with spaces/sbcl" --script "$0" "$@"
        |#
        (format t "ok~%")
      

      C

        #!/bin/sh
        #if 0
        exec "/path/with spaces/tcc" -run "$0" "$@"
        #endif
        
        #include <stdio.h>
        
        int main(int argc, char **argv)
        {
            puts("ok");
            return 0;
        }
      

      Racket

        #!/bin/sh
        #|
        exec "/path/with spaces/racket" "$0" "$@"
        |#
        #lang racket
        (displayln "ok")
      

      Haskell

        #!/bin/sh
        #if 0
        exec "/path/with spaces/runghc" -cpp "$0" "$@"
        #endif
        
        main :: IO ()
        main = putStrLn "ok"
      

      OCaml (needs bash process substitution)

        #!/usr/bin/env bash
        exec "/path/with spaces/ocaml" -no-version /dev/fd/3 "$@" 3< <(tail -n +3 "$0")
        print_endline "ok";;