I'm “still afraid to use spaces in file names” years old

4 years ago (twitter.com)

I work on a complex desktop application, and it's been astounding the number of bugs that have appeared over the years triggered by spaces and other unusual characters in file names. If you do anything with subprocesses or path processing, it's absurdly easy to hit in a thousand different ways, over and over again.

Pro tip: rename your development directory (or even better: the workspace path in CI) to put a space and/or special characters in it.

Forces you to deal with this properly, and immediately ensures that every automated test checks this case without you having to remember every time. Hasn't been particularly inconvenient, since I'm autocompleting it 99% of the time anyway, and I haven't shipped a single path parsing bug since.

  • Seems like MS had the same idea according to an answer in the link:

    > Microsoft intentionally made programs install to C:\Program Files on Windows 95+ to force programmers to deal with spaces in filenames.

    • I wish they did "User Files" instead of "Users" too, because so much software breaks on the home area having a space in it.

      Not least, it makes writing scripts for various shells and getting the quoting rules right an absolute pain as well...

    • It doesn't just keep people on their toes with the whitespace: the folder name is also localized. With German settings, for example, you get C:\Programme and C:\Programme (x86).

    • I wonder how much global work could have been saved if Microsoft also provided a covered interface for all paths in the system. Not sure if there is any, but one good implementation might save thousands of poor implementations required to handle it.

    • On the other hand, their case-insensitivity means that “cross-platform” Java applications can break when run on a non-Windows platform where opening files is case sensitive (unlike on Windows).

    • I just wish they had a decent way to execute programs with arguments that might include spaces. But no, every program can do argument delineation differently.

    • I know that at least 3-5 years ago, when I got a new Windows laptop (Windows 7 or 8, I think), setting the main account to have the name "" (without the quotes) caused some problems with basic functioning, including, I think, with some pre-installed programs.

      So, some things were still being handled not quite right (whether that's because it shouldn't be allowed to be the username, or because programs should handle it being in the path, I'm not sure, but probably one of those.)

    • And then to really mess you up and ensure you handle parens properly, threw “(x86)” into the mix. (A real pain on some REPLs as well as dealing with environment variables).

    • Except for programs that were too old / obscure to fix I guess. I think at least the Symbian Development Kit was such that builds would fail with strange errors unless you installed it in any other path than the default immediate subdirectory of C:\, let alone under "Program Files".

    • Funny, in the Italian Win9x it is C:\Programmi, which I always thought was more convenient because of the lack of spaces :)

    • Shame it wasn't

      > C:\P̷̧̽r̸̬͘ŏ̵̮g̷̜͘r̸̦̋a̴͎̒m̶̲̈́ ̷̠̉F̵͇̈ĩ̴̫l̶̨͗ë̵̦s̸͚͆\

  • > Pro tip: rename your development directory (or even better: the workspace path in CI) to put a space and/or special characters in it.

    A former co-worker changed his name in our auth system to include an apostrophe, so that whenever we handled names wrong he'd find it.

    • I set my nickname to U+FFFD at one point in one work system, resulting in a variety of bug reports and concerned emails. I think I dropped it since it was generating false reports from people who didn't check what character the page contained before reporting it.

    • To have such thoughtful coworkers. On an old team I had two coworkers named Chris, and once in a blue moon, when they reviewed each other's code, master would start crashing because one of them had accidentally left in an absolute path starting with "/home/chris/".

    • A related tip for CI: set the system time zone to one that, during your work hours, is already on a different day than UTC. Really helped surface failures earlier than 4pm PST.

    • One of the systems I built is being used by a group of younger people. I included an emoji in the superuser account name, just to make sure it would work. And to remind me to think more broadly about user input.

    • I used to have a space in my user name and even contemplated adding a bit of non-1252 Unicode. You find a lot of issues, but unfortunately they are often in tools you have little control over, and at times you end up unable to work effectively. It ended up being more frustrating than helpful.

    • I add a Japanese character into any .py, .js and .html file to ensure that Unicode is working properly through the entire chain. Mostly in form of a variable which gets passed along, even in URL parameters.

    • my test accounts always have emojis + accents + other weird characters.

      it keeps everybody on their toes lol.

    • the proper name of the glorious sultan of slack, j. r. "bob" dobbs, has the quotation marks and therefore is a great subject for this

  • > it's been astounding the number of bugs that have appeared over the years triggered by spaces and other unusual characters in file names

    If you consider spaces “unusual” I would say you haven’t encountered a single average user in your lifetime. Spaces in file-names is the single most common thing people have, outside programming environments.

    As a cross-platform developer, the only platforms where I (still) regularly encounter this kind of bug are those where solving problems through scripting is common, like Linux, where the primary means of operation is stringly-typed statements getting parsed and processed in an untyped fashion. It's not very reliable.

    On Windows people more often use “real APIs” (because scripting doesn't really work as well), but then these problems just go away.

    Pros and cons, I guess.

    • It's especially funny that it affects Linux so much. Most file systems allow everything except `/` and NULL in file names. Early AT&T UNIX even allowed NULLs! POSIX shells use the IFS variable to perform field splitting, and it defaults to <space>, <tab>, and <newline>. The choice to perform field splitting by default (particularly with spaces in the default IFS set) has caused no end of headaches for developers and users.

  • It doesn't even have to be complex, often basic automation tasks fail with spaces and special characters. Honestly, treating a file system like a natural language processor is a bad idea. Besides at this point with how digital we have all become who can't understand...

    thisismyconfig.txt vs this is my config.txt or this_is_my_config.txt

    ...I've forced myself to stop using spaces, special characters, and even capitals. They are all constructs that provide minimal value for the extra complexity.

    • I'm similar, but I would like to support labels intended for humans, along with various translations, as metadata on top of e.g. filesystem path components.

    • Why stop there. A computer works more efficiently with numbers rather than strings, so let’s just give each file a number instead of a string. Besides, at this point with how digital we have all become who can’t understand… But wait, that already exists and is called an inode.

      A file system has a human interface and a computer interface. Don’t mix them. Let users give file names in whichever way they please.

    • > treating a file system like a natural language processor is a bad idea

      could you please explain what you mean by that?

  • My favorite filename special character bug was when I implemented CD ripping in 2005, and one of our beta testers ripped a CD with a song called "Have You Ever?". My code wasn't prepared to filter out the question mark on Windows.

    • I just hit the one where an album folder ends in a period. Rsync copies every time because the period is dropped by the filesystem silently. :-/
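Both of these bugs come from Windows naming rules: a handful of characters are rejected outright, and trailing dots and spaces are dropped silently. A minimal sketch of a sanitizer, assuming a hypothetical helper named `windows_safe_name`; it does not handle reserved device names such as CON or NUL, and the replacement character is an arbitrary choice:

```python
import re

# Characters Windows rejects in file names, plus ASCII control characters.
_WINDOWS_FORBIDDEN = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def windows_safe_name(name: str, replacement: str = "_") -> str:
    """Return a variant of `name` that avoids the two pitfalls described above."""
    cleaned = _WINDOWS_FORBIDDEN.sub(replacement, name)
    # Trailing dots and spaces are dropped silently on Windows, which makes
    # tools like rsync think the destination name differs; strip them up front.
    cleaned = cleaned.rstrip(". ")
    # Never return an empty name.
    return cleaned or replacement


song = windows_safe_name("Have You Ever?")
album = windows_safe_name("Greatest Hits Vol.")
```

Sanitizing once, at the point where the name is chosen, keeps the source and destination names identical on every later sync.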

  • > Pro tip: rename your development directory

    I changed my username to not contain a space because it was too annoying to deal with all the random dev tools breaking. The worst offender was probably npx on Windows [1] (resolved after four years by deprecating npx), but it was far from the only one (though the JS ecosystem was somehow the worst in this regard of all languages I worked with).

    1: https://github.com/zkat/npx/issues/100

    • Same, even I had to rename my user folder to not have a space because so many tools were breaking.

  • > other unusual characters in file names

    Saw a few hacks where malware authors used the RTL feature (which is baked into Windows) to obfuscate file extensions. It looked like .exe.innocuous-document.docx, but was actually .docx.innocuous-document.exe

    • This exact vulnerability in most modern code editors just made the rounds, allowing smuggling malicious code right through review.
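One way to catch this trick is to flag any file name that contains a bidirectional control character. A rough sketch, assuming a hypothetical helper `find_bidi_controls`; the set of code points below covers the explicit embeddings, overrides, isolates and directional marks, and is an assumption about what a reviewer would want flagged rather than an exhaustive security check:

```python
from typing import List, Tuple

# Explicit bidirectional formatting characters (assumed "suspicious" in file names).
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",  # embeddings and overrides
    "\u2066", "\u2067", "\u2068", "\u2069",            # isolates
    "\u200e", "\u200f",                                # left-to-right / right-to-left marks
}


def find_bidi_controls(name: str) -> List[Tuple[int, str]]:
    """Return (index, code point) pairs for every bidi control in `name`."""
    return [
        (index, f"U+{ord(char):04X}")
        for index, char in enumerate(name)
        if char in BIDI_CONTROLS
    ]


# The real extension is ".exe"; the override makes the tail render reversed.
suspicious = "innocuous-document\u202excod.exe"
hits = find_bidi_controls(suspicious)
```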

  • My Mac is formatted case sensitive when the default is case insensitive. This will also catch a ton of import related bugs.

    League of legends doesn’t run until I sed files for instance.

    • I once returned a printer because the Mac driver and support software expected and enforced case insensitive access and basically couldn't install properly on my case-sensitive HFS+ volume. It half installed and blatantly just didn't work in any way when installed.

    • I have coworkers on Mac that write node/JS code. Every once in awhile I'd pull down the latest code and it wouldn't run. I'm on Linux.

      Sure enough, they had SomeFile and were importing Somefile and it works fine on Mac but not on Linux (which, of course, is what our production servers use). It amazes me that "works fine on my machine" is still a thing when I definitely worked at companies that solved this back in the 2000s. It was solved. It was done. Then devs became enamored with running everything locally. Even dozens of microservices or databases. Even though JS is fairly isolated, you still have NPM packages that need built against the local OS and C/C++ library and compilers, etc. Which also has caused issues in the past.
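A SomeFile/Somefile mismatch can be caught on a case-insensitive machine by comparing the requested name against what the directory listing actually reports. A minimal sketch, assuming a hypothetical helper `exists_with_exact_case`; the file names are made up for illustration:

```python
import os
import tempfile


def exists_with_exact_case(directory: str, name: str) -> bool:
    """True only if `directory` holds an entry spelled exactly like `name`."""
    # os.listdir returns names as stored on disk, so this comparison is
    # case sensitive even when the filesystem itself is not.
    return name in os.listdir(directory)


with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "SomeFile.js"), "w").close()
    exact = exists_with_exact_case(tmp, "SomeFile.js")
    wrong_case = exists_with_exact_case(tmp, "Somefile.js")
```

A check like this could run in a lint step so the mismatch is reported before the code reaches a case-sensitive server.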

    • I also enjoyed doing that, but had to make a DMG just for Steam because it straight-up refuses to run on a case-sensitive FS (that's true on Windows, also, which I suspect is how we all got here). I think the most recent Steam versions either caught wind of my trickery or, more likely, run something from $HOME/Library/SomethingOrOther, and thus the workaround no longer works.

      When I got a new Mac, I just gave up and acquiesced to the case-retentive world :-(

    • Circa Y2k, I learned that the OSX Palm Pilot software didn't work with case sensitive. I've since given up and stuck with the default. (I'm anti-case folding in general, because of the ambiguity.)

  • I maintain a similar system, where a variety of companies submit files that get processed through multiple services - it is astounding how ridiculous people’s naming of files can be; spaces are the least concerning!

  • > anything with subprocesses

    I'm begging software developers to stop using subprocess APIs that take a string argument (system(), child_process.exec(), Process.Start(string)) and start using subprocess APIs that take an array of arguments (execvp(), child_process.execFile(), Process.Start(string, IEnumerable<string>).)
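The same distinction exists in Python's `subprocess` module: passing a list hands each element to the child as one argument, with no shell parsing in between. A small sketch, assuming a hypothetical helper `count_bytes` and a deliberately awkward file name:

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def count_bytes(path: Path) -> int:
    """Ask a child Python process for the size of `path`."""
    # Each list element becomes exactly one argv entry in the child. No shell
    # is involved, so spaces, quotes and parentheses need no escaping.
    result = subprocess.run(
        [
            sys.executable,
            "-c",
            "import os, sys; print(os.path.getsize(sys.argv[1]))",
            str(path),
        ],
        check=True,
        capture_output=True,
        text=True,
    )
    return int(result.stdout.strip())


with tempfile.TemporaryDirectory() as tmp:
    awkward = Path(tmp) / "my file (v2) 'final'.txt"
    awkward.write_bytes(b"hello")
    size = count_bytes(awkward)
```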

  • While I agree that we should do this in the ideal world, doing so will inevitably break other necessary tools so it is unworkable for me :(

  • And add an emoji, a character in a right-to-left language (א) and perhaps 太. Maybe italicize one of those too...

  • Spaces are a pain in the ass when you're using CLI so I'd rather enforce a no space policy

    • Most shells will behave just fine if you put a quote (single or double) before anything that has a space.

      A small extra step but something you get used to if you spend a lot of time in the cli.
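When a command line has to be built programmatically rather than typed, the quoting can be delegated instead of done by hand. A minimal sketch using Python's `shlex.quote`, which targets POSIX shells; the helper name `shell_command` and the file names are made up for illustration:

```python
import shlex


def shell_command(program: str, *args: str) -> str:
    """Join a program and its arguments into one POSIX-shell-safe string."""
    return " ".join(shlex.quote(part) for part in (program, *args))


cmd = shell_command("cp", "this is my config.txt", "it's a backup.txt")

# shlex.split parses the string the way a POSIX shell would split it.
round_trip = shlex.split(cmd)
```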

  • I don't know if it's still a problem, but it used to break Python virtualenv badly. If your working directory had a space anywhere in the path, it would throw a huge fit and not work. Which is problematic when the expected name for a Mac's boot drive is "Macintosh HD" (if you ever had a reason to run a virtualenv outside of your home directory).

  • Sometimes / works as a path separator in Windows, sometimes it doesn't. It's not predictable.

    I never use / on Windows as a result.

    • The only common place where it doesn't work is in CMD for executing programs and as arguments for built-in commands. Everything else goes directly to the relevant APIs which don't care about / or \.

      These days using CMD instead of PowerShell should be rare enough and PowerShell certainly doesn't mind the slashes.
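In code, one way to sidestep the separator question is to parse the path once and let a path type decide how to render it. A small sketch with Python's `pathlib.PureWindowsPath`, which accepts either separator and runs on any OS because it never touches the filesystem; the path itself is made up:

```python
from pathlib import PureWindowsPath

forward = PureWindowsPath("C:/Program Files (x86)/Some App/tool.exe")
backward = PureWindowsPath(r"C:\Program Files (x86)\Some App\tool.exe")

# Both spellings parse to the same sequence of components.
same_path = forward == backward
components = forward.parts

# str() renders with backslashes; as_posix() renders with forward slashes.
native = str(forward)
portable = forward.as_posix()
```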

  • It's easy to tell users to make a folder with no spaces if you're setting up a global path, however if you have an application that runs in user directories things can become painful fast. Changing your user name is a pain and can leave things inconsistent, but having to handle all the variations in people's names with spaces, punctuation, international characters, can just be mind boggling.

  • I did something similar on accident. I used to keep all my development work synced with Dropbox and I had a work and a personal account. So any of my own projects would have /Dropbox (Personal)/ in the path which did catch some bugs. Dropbox renamed my folder to "Dropbox (Personal)" automatically when connecting a work account.

  • More importantly than your source files, put your testing data on such a path as well. Nobody uses absolute paths in testing so it doesn't matter how many spaces your absolute path has if your input is "./tests/file1". Put those files in a folder with spaces too and throw in a unicode character for good measure.
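A sketch of what that could look like in a test helper: the fixture directory itself gets a space, parentheses and non-ASCII characters, so even relative test inputs exercise the awkward cases. The directory name and the helper `make_awkward_fixture` are hypothetical, and the example assumes the filesystem accepts UTF-8 names:

```python
import tempfile
from pathlib import Path

# Hypothetical fixture directory name: space, parentheses, CJK and an accent.
AWKWARD_DIR_NAME = "test data (copy) 太 é"


def make_awkward_fixture(root: Path, filename: str, content: str) -> Path:
    """Write `content` to `filename` inside an awkwardly named directory."""
    directory = root / AWKWARD_DIR_NAME
    directory.mkdir(parents=True, exist_ok=True)
    target = directory / filename
    target.write_text(content, encoding="utf-8")
    return target


with tempfile.TemporaryDirectory() as tmp:
    fixture = make_awkward_fixture(Path(tmp), "file 1.txt", "payload")
    round_trip = fixture.read_text(encoding="utf-8")
    parent_name = fixture.parent.name
```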

  • > Pro tip: rename your development directory (or even better: the workspace path in CI) to put a space and/or special characters in it.

    The problem with that is that YOUR code may handle it, but your tooling may not. If my code formatter breaks on spaces, I'm not going to change the formatter.

  • Somewhat related to injecting unusual characters, in my experience in localization efforts:

    Inject a Turkish 'I'. I don't know how to type or paste it here, but picture an English lower case 'i' that is upper case. It is a splendid way among many to shake out some loc bugs.
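The characters in question are U+0130 (dotted capital İ) and U+0131 (dotless small ı). A short sketch of how they trip up a naive case-insensitive comparison in Python, whose `str.lower()` and `str.upper()` are locale-independent; the helper name is hypothetical:

```python
DOTTED_CAPITAL_I = "\u0130"  # İ
DOTLESS_SMALL_I = "\u0131"   # ı


def naive_case_insensitive_equal(left: str, right: str) -> bool:
    """The comparison many programs use; it is not Turkish-aware."""
    return left.lower() == right.lower()


# Lowercasing İ yields "i" followed by a combining dot above: two code points.
lowered = DOTTED_CAPITAL_I.lower()

# Uppercasing ı yields a plain ASCII "I", so the round trip loses information.
uppered = DOTLESS_SMALL_I.upper()

matches = naive_case_insensitive_equal("İstanbul", "istanbul")
```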

  • Late '90s I worked on Java software that got installed on several Unix platforms, including Linux for IBM mainframes. When you deal with the default en/de-coding of Unicode to EBCDIC you never have trouble with Java byte encodings ever again.

  • Someone should provide the OneDrive/SharePoint people some of this religion.

    Mysterious character requirements that do not conform with Microsoft’s OS limits, limits on the fully qualified pathname length, etc.

  • Even capitalization is a pain in the ass thanks to how OSes treat file names. I pretty much stick with either `file-name.ext` or `file_name.ext` exclusively now.

  • Today I learned that you cannot install Tailscale on Windows if the installer is inside a path with non-Latin characters.

  • In that case, be thorough and insert a Chinese and an Arabic character to enforce a Unicode check.

  • See the recent article about unicode invisible glyphs in JavaScript or bash.

    Naming freedom needs a stdlib module

  • Better solution: only allow ASCII, maybe dashes, and up to twelve characters. Problem solved.

    Enforce this in LDAP.

    Strict convention is better than flexibility and predicting obscure edge cases that can fail.

    • In my case, and for many people writing desktop software, and for absolutely everybody writing open-source tools or libraries, unfortunately you can't control the environment.

      Non-ASCII paths are extremely common (e.g. the user's home directory on Windows, for the large majority of users outside the English-speaking world) and spaces, punctuation and weirder characters will definitely happen when you least expect it.

      Yes if you can avoid it then absolutely that's great, but I don't think most people can.

      It's also not usually very difficult to deal with, as long as you actually spot the issue in the first place.

    • > only allow ASCII, maybe dashes, and up to twelve characters. Problem solved

      ...and only hire people from the exact same background as you, who will never have unusual characters or accents in their name. And also make sure not to have any users who aren't exactly like you, and conform to this very narrow requirement. Surely, excluding 90% of the world won't hurt revenue in any way.

    • Ugh, we have the 15 character Active Directory limit now with hostnames, and a previous IT administration has imposed a convention that every name had to follow [prod|dev]-[ph|vm]-[service]-[nn]. So basically every production service is prod-vm-owtf-01— you get exactly four characters to actually describe what the machine does. Works great when the service is "jira" or "wiki", but there are a lot that are pretty mystical-sounding, like jkns, jwrk, cntr, hrbr, etc, where you kind of just have to know.

    • Ah, that's the enterprise edition.

      But then your program will crash hard and unexpectedly when a user decides to save under "~/house plans" or ~/Téléchargements.

      I think it's better to exercise this in CI, that's what CI is for.

  • There are things you can't do in .NET that you need the old Registry commands for, and those don't accept spaces.

  • And yet OneDrive won't allow for spaces before or after a file name.

    • I spent hours trying to figure out why an entire folder suddenly stopped syncing. Turns out I accidentally added a hidden space to the end of a folder name.

  • > Pro tip: rename your development directory (or even better: the workspace path in CI) to put a space and/or special characters in it.

    This will also break any code in external tools that are called during the builds of your application and do not handle spaces correctly for whatever reason, thus making it so that you won't be able to successfully finish the build.

    Then again, you probably shouldn't be relying on technologies like that, but when you're struggling to keep an old enterprise system alive, causing yourself more problems is not necessarily what you should do.

    Still a good idea in most cases, though.

I have an overly-aggressive function in my .bashrc to rename all files in the current directory:

  # Rename all files in a directory
  rn() {
    rename "s/ /-/g" *
    rename "s/_/-/g" *
    rename "s/–/-/g" *
    rename "s/://g" *
    rename "s/\(//g" *
    rename "s/\)//g" *
    rename "s/\[//g" *
    rename "s/\]//g" *
    rename 's/"//g' *
    rename "s/'//g" *
    rename "s/,//g" *
    rename "y/A-Z/a-z/" *
    rename "s/---/--/g" *
    rename "s/---/--/g" *
  }

I use this all the time, especially when I download files.

  • Word of warning from hard experience: rn is a really dangerous thing to name a function because it is one char away from rm.

  • https://github.com/dharple/detox is a nice tool for this. Sane defaults but configurable.

    In addition to CLI I use it from emacs dired-mode too:

        (defun my-dired-detox ()
          (interactive)
          (dired-do-shell-command "detox" nil (dired-get-marked-files))
          (revert-buffer))
    

    I bind it to "_" in dired-mode.

  • Overly aggressive is right! I don't know if this is genius or deranged! I'm leaning towards genius and stealing the idea.

    By the way: what's your beef with en dashes? I mean, if it was "everything should be 'HYPHEN-MINUS' (U+002D)", then fine, but why specifically en dashes and not em dashes?

    • > By the way: what's your beef with en dashes?

      Of all the changes in that list, removing the character that doesn't appear on a standard keyboard seems like the least controversial...

    • I totally agree that for some people, this could be a terrible command to have around. However, I know that it has been working for me for 8+ years or so. I almost always run it in my ~/Downloads folder on files that I don't really care about. I download a lot of academic papers and books, and this just saves me a lot of time to put files in the format I like: author--paper-title.pdf. And that's part of the reason why I make all of the dashes the same, so if I'm opening something by an author, I can easily autocomplete and not have to remember how to make other sorts of dashes on the command line.

  • I use this snippet, to change spaces to underscore for directories and files in the current directory and below. Haven't made it a function yet, but should. I got it from stack overflow or somewhere, but no attribution. Thanks to whoever did it first:

       find . -depth -name '* *' | while IFS= read -r f ; do mv -i "$f" "$(dirname "$f")/$(basename "$f"|tr ' ' _)" ; done

  • Thanks to all the comments in this thread, I now have "sudo apt install rename detox" in my install script, and:

        normalize_names() {
            rename "s/-/_/g" *
            detox -s lower *
        }
    

    in my .bashrc.

    I've thrown some edge cases at it, and it handles them super well. It deals with consecutive "_", removes leading garbage, normalizes Unicode, and even prevents naming conflicts by opting out early.

    Thank you.

  • If you're a developer you're doing yourself a big disservice by not learning how to deal with special characters.

    • I agree. I am a developer and I know how to deal with special characters. But this isn't something I use professionally. I just prefer not to have to deal with special characters in the pdfs, m4as, txts, and other files that I use on a daily basis. When I write papers, I'll write ū or Ñ or ç or whatever (incidentally, I have a lot of shortcuts in my .vimrc for those). I would not say I am "afraid" to use spaces in filenames, but I get a certain satisfaction storing academic papers in the author--paper-title.pdf format and my notes in author--paper-title.md because it helps me find things.

  • Nice but how do you prevent overwrites? What about directories/folders and the files in that directory/folder?

    I have:

      Movie Bla (2020)
        Movie Bla (2020).mp4
    

    But also:

      Movie_Bla_(2020)
        Movie_Bla_(2020).mp4
        Movie_Bla_(2020).srt
    

    Would not like to lose files like the srt.

    • Yeah, sometimes I end up renaming things I don't want to, but it really doesn't happen all that often. And sometimes I throw caution to the wind, add some excitement to my life, and rename a bunch of files (not for anything professional) in some really old directory and hope I don't break anything. But I'm not aiming for perfect with this comment. I just mentioned in another comment, but the vast majority of times I run this is in my ~/Downloads folder on files I don't really worry about breaking.

  • Surely you must run into conflicts now and then?

    • That's the most beautiful part! After running this script there are no more conflicts, because it just silently overwrites all but one version of the "cleaned" filename.

      (Also—that entire function is super inefficient and could be replaced with a single invocation of "rename".)
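The overwrite problem raised in the last few comments can be avoided by planning every rename first and refusing any target that more than one source maps to. A rough sketch, not a reimplementation of the `rn` function above: the cleaning rules in `clean_name` are simplified assumptions, and `plan_renames` is a hypothetical helper that only computes the plan without touching the disk:

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple

# Characters dropped entirely (an illustrative subset of the rules above).
_DROPPED = str.maketrans("", "", "()[]\"',:")


def clean_name(name: str) -> str:
    """Lowercase, turn runs of spaces/underscores into '-', drop punctuation."""
    return re.sub(r"[ _]+", "-", name.lower()).translate(_DROPPED)


def plan_renames(names: List[str]) -> Tuple[Dict[str, str], Dict[str, List[str]]]:
    """Return (safe renames, conflicts) without renaming anything."""
    sources_by_target = defaultdict(list)
    for name in names:
        sources_by_target[clean_name(name)].append(name)

    renames, conflicts = {}, {}
    for target, sources in sources_by_target.items():
        if len(sources) > 1:
            conflicts[target] = sources      # two files want the same name
        elif sources[0] != target:
            renames[sources[0]] = target     # unique target, safe to rename
    return renames, conflicts


renames, conflicts = plan_renames(
    ["Movie Bla (2020).mp4", "Movie_Bla_(2020).mp4", "Movie_Bla_(2020).srt"]
)
```

Only the entries in `renames` would then be applied; anything in `conflicts` is left for a human to sort out.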

  • I wonder if rename has an -e flag like sed. It might be worth baking this into one monolithic regex if you call this often

Define "space". Is the Hangul filler we talked about yesterday a spacing character? Is the zero-width non-breaking space a spacing character? What about the typographic spacing characters?

You had better be very afraid of using spaces in filenames.

You should do everything you can to support them but you have to know you'll invariably encounter countless cases where you'll have this or that tool that won't work properly with them.

I still live in a world where I cannot name a song from the french group L'impératrice with an eacute in the filename or my car's media system will display garbage (it's running QNX and I don't know which filesystem).

FWIW, and it should be food for thought, every single Git repository in the world contains a pre-commit hook sample (disabled by default but it's there) that enforces that every committed file in the repo is named using a subset of ASCII characters.

Every Git repository in the world has that example: let that sink in.

  • > FWIW, and it should be food for thought, every single Git repository in the world contains a pre-commit hook sample (disabled by default but it's there) that enforces that every committed file in the repo is named using a subset of ASCII characters.

    I use Git for documents too, not only code. Why shouldn't I use my native language?

  • > I still live in a world where I cannot name a song from the french group L'impératrice with an eacute in the filename or my car's media system will display garbage (it's running QNX and I don't know which filesystem).

    I have an Android phone and I tell MusicBrainz Picard to save all files with ASCII-only names and Windows-compatible names for the ones that get sent over to the phone. Basically for this reason. Sometimes it's players on Android itself, but even more frequently, whatever bluetooth radio I'm connected to freaking out with non-ASCII characters.

    • What do you mean, display garbage?

      L'imp?ratrice? L'imp�ratrice? L'impératrice? L'imp‚ratrice? L'impÚratrice?

  • You get all those space characters working and then some jerk comes along and uploads a file like this: ŗ̶̧̢͓̳͍͙͔̳̻̥͉̭͓̫̟͍̞̭͉͓͉̮̹͍͚̳̹̬͉͚̰͈̘̐̊̾̈̀̒͒̀͛̓̋̔͊̏͘̚ę̴̨̛̣͙̤̟̬̩̟͙͖̥̹̱̱̊͑͗̇̇͛̆̈́̃͋̓̀̔̍̍̌̐͊̎̓̅̀̕ͅģ̴̹̜̘͍̱̑͐̉̌̐̄̊͛̎́̐̌̅̈́͂͑̈́̋̔͂̊̊̒̒̔͛͆̚͘̕͠e̶̙͕̫̳̘͐̾́̑͆̓͂̿͊̊̍͛͐̌̆͗̌̅̅̔͊̂͛͗̅̕͝͝͝͝x̵̢̧̦̫͖̝̥̹͓̬͖̤̩͚̝̫̋̃̅̈́̆͋̌͑́̎̈́̊̾͒̀̒̎̓͛͊̿̓͊̀̍͐̆̚͝͝-̴̨̮̯͖͖̠̜̲̪͕̘͈͖̮̈́̓̐̃́̅̄̏́̍̉̐͌́̔̓̄͋͗̐̕͜͝ţ̴̢̧̖̗͖̞̮̫̦̼̝̺̼̱̳͓͉̜̟̤̲͖̻͙́̌̈̌̈͆̾̄͊̿̏̓͗̈́̕͜ͅh̶̢̧̨̥̭̼̟̣͖̯̗̤̖̙͉͕̙͎̰̠̝̖͈̻͙̪̮̘̯̻̼͕͓̖̣͈̽́͊̎͐͌̆̍̎̏̿͐̒́͋͑̍̿̎͆̑͆̄͂̀͐̄͑̀͗̿̽̎̾̊̕͝͝͝͝͝ͅi̴͚͈͍̫̮̝̣͖͉͓̯̠̙̭̟̖̘̾̓̄̈́̒̏̽̆̉̿͛̀́̃̋̒̈́͋̂̇̈́͛̕͜͠͠͝ͅs̶͇̖̳̞͉̱̞͓̖͔͔͍̗͇̖̮̹̅͊̔͋͊̈́̎̐̆̋̒̀̍̕͜ͅ.̴̧͎͇̰͉̼̱̰̦̟̑̋̏͌̍͊͑̄̀͌́̆̓͛̒̆̾̉͐̄̂̈́͆̒̃͗̐̂̎̈́̈͛̿́͛̾̚͘͜͝͝ͅȩ̷̡̲̪̱̪̥̳͍̼̰̘̗̹͙͙͓̣̟̩̥̥̖̠̪̮̹̞̥̻͎͖͍̯̂͑̏̑̆̍͋̎͛̅̑̑̏̎̓̀̓̒̈́͊͌̀̈́̒̌͐͂͛̊̍̐͂́̔̌̾͐̈́̋̇̏̚͜͝͝͝͠ͅx̶̧̛͚̗̜̪͍͖̘̙͎͚͇͙̬̱̟̭͓̺̙͍̖̱͚̣̘̪̭͔͔̮͎̬̪̤̹̟͔̩͍̬͕͔̩͐̈́̒̂͛̂̈̀̿̍̔̓̓̀̃̍͆̈́̍̓̌͐̈́̾̇̎̑͌͒̄̆̿̍͆̅͗͆͘͠͝͝ͅͅͅe̷̢̡̡̨̧̛͕͚̬̮̞̥̼͍͔̝̟̝̯͈̟̥͖̱̹̣̩̼̩̅̌͌̑̎̐̀̽̏́͐̋̏̎̎͛͌̀̊͊͒̑͌̎̎̑͊̌̉͆̾̚͘̚͜͠͠͠͝͝ͅͅͅ

    • For anyone who is curious (and acolytes of Zalgo): "In Unicode, character rendering does not use a simple character cell model where each glyph fits into a box with given height. Combining marks may be rendered above, below, or inside a base character. So you can easily construct a character sequence, consisting of a base character and “combining above” marks, of any length, to reach any desired visual height, assuming that the rendering software conforms to the Unicode rendering model."

      [https://stackoverflow.com/questions/6579844/how-does-zalgo-t...]
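The stacking described in that quote is easy to measure, and to undo, with Python's `unicodedata` module. A minimal sketch with hypothetical helper names; note that stripping nonspacing marks also removes legitimate accents (é becomes e), so this is an illustration of the mechanism rather than a recommended filename policy:

```python
import unicodedata


def count_nonspacing_marks(text: str) -> int:
    """Count code points in the 'Mn' (nonspacing mark) category."""
    return sum(1 for char in text if unicodedata.category(char) == "Mn")


def strip_nonspacing_marks(text: str) -> str:
    """Decompose to NFD, then drop every nonspacing mark."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(
        char for char in decomposed if unicodedata.category(char) != "Mn"
    )


# A small "zalgo" sample: 'e' with three stacked marks, then a plain 'x'.
sample = "e\u0301\u0316\u0334x"
mark_count = count_nonspacing_marks(sample)
plain = strip_nonspacing_marks(sample)
```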

    • This legitimately made me laugh out loud in my office.

      The characters reach up off the screen as I reply to this. They overlay the comment above you. Amazing. How?

    • Until now, I haven't actually thought of what would happen if zalgotext occurred anywhere other than a web browser. Looking forward to the five minutes of fun with the file manager and whatnot.

    • 768 characters is too long for macOS it seems. (References online say HFS+ has a limit of 255 UTF-16 characters. Didn't find anything for APFS immediately... edit: same for APFS)
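Limits like this are counted in encoded units rather than in visible characters, which is why a name can be rejected while looking short. A small sketch of measuring a name both ways; the helper names are hypothetical and the default of 255 is taken from the comment above:

```python
def utf16_units(name: str) -> int:
    """Length of `name` in UTF-16 code units."""
    return len(name.encode("utf-16-le")) // 2


def utf8_bytes(name: str) -> int:
    """Length of `name` in UTF-8 bytes."""
    return len(name.encode("utf-8"))


def fits_utf16_limit(name: str, limit: int = 255) -> bool:
    """True if `name` fits a limit expressed in UTF-16 code units."""
    return utf16_units(name) <= limit


# One emoji is one Python character, two UTF-16 units and four UTF-8 bytes.
emoji_units = utf16_units("\U0001F600")
emoji_bytes = utf8_bytes("\U0001F600")
too_long = fits_utf16_limit("\U0001F600" * 128)
```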

I have an uneasy feeling whenever I see a path parameter declared as string. Path is not a string - it's a sequence of path components and should be treated as such by our APIs. A path should be parsed once - on user input - and then used in its "sequence form" throughout the software stack.

And "path component" is not an arbitrary string either - e.g. appending a path component to the path should first require converting/parsing the string into the path component, and only if that's successful appending it to the path.
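A sketch of that idea using Python's `pathlib`, which already stores a path as a tuple of components. The validation rules in `parse_component` are assumptions chosen for illustration (reject empty names, `.`, `..`, separators and NUL), and both helper names are hypothetical:

```python
from pathlib import PurePosixPath


def parse_component(raw: str) -> str:
    """Validate a single path component, raising ValueError if it is not one."""
    if raw in ("", ".", ".."):
        raise ValueError(f"not a plain component: {raw!r}")
    if "/" in raw or "\x00" in raw:
        raise ValueError(f"component contains a separator or NUL: {raw!r}")
    return raw


def append_component(base: PurePosixPath, raw: str) -> PurePosixPath:
    """Append `raw` to `base` only after it parses as one component."""
    return base / parse_component(raw)


plans = append_component(PurePosixPath("/srv/data"), "house plans")
components = plans.parts
```

With this shape, a string such as `"../etc"` fails at the parsing step instead of silently changing which directory the path points into.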

  • "Path is not a string - it's a sequence of path components and should be treated as such by our APIs."

    For maximum correctness, you want to turn it into a file handle as soon as possible, and do all operations through the variations of the file functions that end in "at", like: https://linux.die.net/man/2/openat

    The downside of this approach is that you still technically have to carry the path around with you if you ever want to present it back to the user, because once you have a directory handle, you can get back to the root directory easily enough by following parent links and seeing what directories you end up in, but that may not be what the user "thinks" the path is, and they want to see their path, not a canonicalized one. And they're mostly right. And it's not easy to correctly track changes to their intended path from this basis either.

    Basically, I don't know of a really solid, 100% correct way to handle this with any reasonable degree of effort.
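For reference, the `openat` style is reachable from Python through the `dir_fd` parameter of `os.open`. A minimal sketch, assuming a Unix-like platform where `dir_fd` is supported; `read_relative` is a hypothetical helper and the file name is made up:

```python
import os
import tempfile


def read_relative(dir_fd: int, name: str) -> bytes:
    """Open `name` relative to an already-open directory and read it."""
    # The lookup starts from the directory handle, not from a path string,
    # so renaming the directory afterwards does not change what is opened.
    fd = os.open(name, os.O_RDONLY, dir_fd=dir_fd)
    with os.fdopen(fd, "rb") as handle:
        return handle.read()


with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "my config.txt"), "wb") as config:
        config.write(b"key=value")

    directory = os.open(tmp, os.O_RDONLY)
    try:
        data = read_relative(directory, "my config.txt")
    finally:
        os.close(directory)
```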

    • > For maximum correctness, you want to turn it into a file handle as soon as possible

      That's not right. You want to resolve a file/folder path to a file/folder at the exact point it makes sense.

      It's a problem if you're using a path when you wanted the file. The file can be switched/modified out from underneath you.

      It's also a problem if you've got the file when you only wanted a reference. Now you can't simply switch/modify the file independent of the reference. E.g., maybe you want config file changes to take effect immediately and transparently.

      You can also have the hybrid case, e.g., where you want the folder directly, but have a relative path to a file that is resolved late.

      If you're unsure, I'd err on the side of late resolution.

    • "you want to turn it into a file handle as soon as possible"

      But no sooner.

      For example, I've run into problems where I'm configuring program A server to talk to file location B... but I don't have access to file location B. But the client-side library for talking to the server tries to convert location B into a file handle and then freaks out because I can't access it. When I don't want to access it. I want that program to serve it.

      If it was using simple "path" objects that didn't confirm that I have access to the path, everything would be hunky dory. But because it tried to convert it into a file handle unnecessarily, I get blocked.

    • Another inconvenience with this approach is that you can keep thousands of paths in memory no problem. But thousands of FDs may cause you to exceed per-process limits.

    • This goes for most instances of user input. Timestamps are the other common one people get wrong. I've even seen programs that pass around timestamps as strings in multiple formats and as integers (Unix time).

    • > For maximum correctness, you want to turn it into a file handle as soon as possible

      This is why I get stressed out when I see paths turned into special objects encoding separators and such.

      It tells me the path is living for way too long compared to the file handle.

      I only want to see path-specific objects if we're modifying the path, and even then I want that to happen as late as possible.

    • Why not just hold onto both? The users representation and the file handle. Only ever "display" the representation, while you do all operations on the handle. (Not trying to be sarcastic, just curious).

  • > I have an uneasy feeling whenever I see a path parameter declared as string. Path is not a string

    I guess that depends on what you mean by "string". `open` and `fopen` need a char* path to open a file. Whatever fancy Path abstraction you use eventually becomes a char* string, because that's what the kernel needs.
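
    For what it's worth, here is roughly how that hand-off looks in Python (a sketch, not a claim about any particular library): the path stays structured while you manipulate it and is only flattened to a string or bytes at the boundary. The file name is made up.

```python
import os
from pathlib import PurePosixPath

path = PurePosixPath("/tmp") / "my file.txt"   # structured while you manipulate it
as_text = os.fspath(path)                      # plain str for APIs that want one
as_bytes = os.fsencode(path)                   # bytes, roughly what open(2) receives
print(as_text, as_bytes)
```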

  • Strings following certain rules are entirely valid representations of paths, just like sequences of path components in the chosen language/framework are. Similarly, the sequences of bits that make up the sequences of your language/framework in memory are an entirely valid representation of said sequences of components.

    Yes, paths have structure, but saying "a path is not a string" is the equivalent of saying "C source code is not a string". Both are strings, and both are something else, represented by strings according to rules. Different internal representations have different advantages and disadvantages. I fully agree that for things such as "adding components" an internal sequence/list representation is better, but strings can pass arbitrary IPC or even ABI boundaries much more easily, for example. (And you wouldn't bat an eye, for example, when you see FQDNs like "www.google.com" passed as a string instead of as ["www","google","com"] because the string representation works pretty well.)

    • C source code and paths are both representable by strings, true, but the fact that they're not actually strings is still important, because most people don't know that, and in the case of paths that leads to a lot of edge cases (in the case of source code it leads to a bunch of inefficient and weak tooling, which isn't quite as bad).

      Because neither are strings, their native representation shouldn't be such - it should be something structured, and only when necessary (IPC, FFI, serdes) be serialized into a string representation. This would save people a lot of time and effort.

      1 reply →

  • POSIX file names (as opposed to the much narrower "portable filename character set") allow all bytes except 0x2F (/) and 0x00 (NUL). That means file names can include line feeds, backspaces, other control characters, etc.

    "This is `a

    perfectly vali'd.\010! file name\377, despite the weirdness"
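
    A quick way to convince yourself, sketched in Python and assuming a typical Linux filesystem. The name below is a trimmed version of the one above (the raw \377 byte is left out so the name stays valid text), and the temporary directory is only there to keep the experiment contained.

```python
import os
import tempfile

# A name containing a backtick, a newline, a quote and a backspace (\x08).
weird_name = "This is `a\nperfectly vali'd.\x08! file name"

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, weird_name), "w") as fh:
        fh.write("hello")
    listed = os.listdir(tmp)      # a list, so no line-based parsing is needed
    # What a naive "one name per line" consumer would see instead:
    line_count = len("\n".join(listed).splitlines())
print(listed, line_count)
```

    One file, but two "lines": anything that parses a listing line by line will miscount it.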

  • things like this are why the Unix philosophy is so bad.

    text processing is hard if you must support Unicode, and that means every Unix command line tool must implement or employ a text processor to handle input. it would be much easier if objects were passed back and forth. PowerShell got this right.

I'm hardly afraid but I just think it's poor ergonomics. Same as the move from

   xset m 0 0

to

    xinput --set-prop 'pointer:Logitech USB Receiver' 'libinput Accel Profile Enabled' 0, 1

Everything seems to be going this way in Linux land. Longer names, harder to type names, camelcase names, spaces... I'm looking forward to an OS that treats command line ergonomics as a first class feature and where camelcase & spaces are verboten.

  • Well, if you think that's bad, behold the recent trend in network interface names on Linux.

    We started out with 'eth0', 'eth1', etc. Which adapter was which could change when adding and removing a network card. That was bad, so that prompted the evolution.

    Now we have 'enp1s0', 'enp0s31f6', 'enp13s0' and many similar variations. These are supposedly more stable across device changes. As it turns out, they weren't.

    But wait, there is more! Now we have the "predictable names" scheme that produces interface names that are even longer, and not even slightly easier to remember.

    Read about the whole sorry saga here:

    https://wiki.debian.org/NetworkInterfaceName

    I do get that it is not an easy problem to solve, especially in the face of removable network interfaces (like USB Ethernet / WLAN). But surely this is not the best we can do.

    • I was actually ranting about this on IRC last night (yeah now my laptop has two enp* interfaces and enx[MAC])..

      One thing I like about OpenBSD is that buses are scanned and drivers probe in order and there's no race between drivers coming up. Unless your hardware is physically tampered with or broken, all interfaces come up with the same name across reboots. Linux isn't like that (even if you don't touch your hardware, interfaces could swap across reboots), so you need to do something about it.

      As is typical on Linux, the default is unergonomic and if you want something nice, you're on your own to make it so.

      If you already have userspace daemons responsible for device insertion and naming, it really wouldn't have been so hard for them to e.g. automatically add a config file / database entry for each interface the first time it is seen. So the devices that came up as eth0 and eth1 are still eth0 and eth1 on the next boot; if I unplug eth0 and add a new card, the new one would be eth2 because eth0 is still reserved for the first card I had.
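
      The reservation scheme is simple enough to sketch in a few lines of Python. This is only an illustration of the idea, not how udev or any real daemon works: `assign_name` is a hypothetical helper, the dict stands in for the persisted config file, and keying on the MAC address is my assumption.

```python
def assign_name(mac: str, known: dict[str, str], prefix: str = "eth") -> str:
    """Return the reserved name for this MAC, reserving the next free index if new."""
    if mac not in known:
        used = set(known.values())
        index = 0
        while f"{prefix}{index}" in used:
            index += 1
        known[mac] = f"{prefix}{index}"
    return known[mac]

# 'known' stands in for the persisted config file / database entry.
known: dict[str, str] = {}
first = assign_name("aa:aa:aa:aa:aa:01", known)
second = assign_name("aa:aa:aa:aa:aa:02", known)
# The first card is unplugged and a new one added: its name stays reserved.
third = assign_name("aa:aa:aa:aa:aa:03", known)
print(first, second, third)
```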

      1 reply →

    • > These are supposedly more stable across device changes.

      No. These are stable across reboots. The old eth? weren't. And yes, that had been a PITA.

    • If network interfaces were files we could just have both short names and stable names, like what we have for block devices.

  • I find this attitude misguided. More descriptive names are more ergonomic for things you only use rarely but they need to be combined with much better autocompletion than most shells provide by default.

    • You state that as if that were objective.. but that's not my subjective experience at all. Somehow I have a hard time remembering these long names, (is it --conf or --config or --config-file or --config-path? -c would've done it for me. --set or --set-prop or --set-property or --prop or --property?), and I need to look them up in a man page anyway, and I make more typos typing them, and shell completion rarely works well if at all. I also find it harder to read and edit long lines that wrap.

      Somehow these short letters stick much better for me, and the effort for finding them in the manual is the same, although in case of extra complexity as with xinput, it's even worse with the long names. I don't use either command often, but it's hard to forget xset m. The only thing I remember about xinput is that it's a horribly long litany of things which I need to look up every time, and the syntax still feels weird.

      17 replies →

    • Spaces don't make anything more descriptive, they just cause completely unnecessary quoting and escaping hassle.

      The amount of time that has been wasted by Windows using "C:\Program Files" instead of "C:\Program_Files" far outweighs any highly questionable aesthetic benefit IMO.

      1 reply →

    • Short option for interactive terminal. Long option in automation.

      I’ll be damned if I have to remember or lookup what -n means to some obscure program, when reading someone else’s script. Exception given for super common tools where everybody knows like ls -la.

      With the disclaimer that shell scripts, especially ls, aren’t exactly suitable for reliable automation in the first place.

  • Cue nmcli (CLI for Gnome's NetworkManager) which uses UUIDs for everything and (at least a while ago) did not accept partial-but-unique UUIDs. Basically goes "nmcli connection up 5095665a-d82c-4ae6-8964-283623387941".

    • By this point, I'm pretty sure there are people at gnome who compete to see who will make the stupidest suggestion that gets put in production.

      2 replies →

    • Weird, I haven't had to do this. Most(/all?) connections have nice names you can see with `nmcli c`... and so I can do `nmcli c up id DroidNet` and that's pretty dang nice. Pretty sure this worked with Ubuntu 14.04 (though, nmcli has gotten much more featureful since then)

      (The ability to shorthand connection->c and similar is great, too; obviously not unique to nmcli)

  • I could infer a lot about the second and what those params mean and what they do.

    The first one is some magical incantation.

    • Sure. One could also make "move-down-one-line" be the incantation to move the cursor down a line in vi, but I prefer j.

      Ergonomics isn't all about making everything self-descriptive for someone seeing the thing for the first time. It's about making things comfortable to actually use. If it's so long and complicated that you can't even remember how to do it, it's not very comfortable to use. Even if I could remember, xset m 0 0 is still far more comfortable.

      And fwiw you still don't know what 0, 1 in accel profile do; you need to look that up or take a wild guess, and if you want to use that command, you'll also have to know how to look up the device because chances are yours is not the same as mine. So it's not any less magical in the end, just more verbose.

      The "cool" thing about the xinput command is that you don't even find accel profile in the man page. You gotta look elsewhere if you want to understand what it is and what it does and what the parameters are.

      xset m? Yes, that is documented in the man page.

      5 replies →

    • Another interpretation is:

      On the first, you think you know what it does, but you're not sure. So maybe it gets looked up.

      On the second, you know you don't know what it does. So you know to look it up.

      Personally, I'll take the second. Assumptions during debugging are dangerous things.

    • But which case should software interfaces optimize for? Ergonomics of someone who uses a tool frequently, or interpretability for casual by-standers of some out-of-context shell command?

  • The problem is we're optimizing for "easy to learn" rather than "easy to use".

    • That may be a part of the problem but honestly I don't feel like all these new crazy interfaces are easy to learn either. I mean how do you come up with the litany xinput calls for? You need to understand the syntax for specifying a device. You need to know that you're to set a libinput property, and you need to know the name of that property, and it's not documented in the xinput man page, and of course you need to know the values to pass which again are not documented in the xinput man page. You can play with --list-props and then take your search elsewhere because it is completely opaque and doesn't explain what the properties actually do.

      I suspect the number of people who figured all that out without having to find it by googling / arch wiki / whatever is very very low.

      Now I'm not gonna say xset is the easiest interface to figure out, but the syntax for setting mouse acceleration is right there in the synopsis, and if you search down the man page, you'll learn a little more (and also if you just run xset without arguments, it'll tell you how to set mouse acceleration). It might not be the best designed tool but it's something I learned back in the day as a teenager just by looking at the man page.

      I think the real issue is that people nowadays are designing these interfaces to be consumed by interactive configuration tools, GUI apps, and desktop environments; they're more dynamic, more complex, more flexible, but not easier to figure out, not for you on the command line. The command line is just a last resort. Second class citizen if you will.

      3 replies →

    • In a world of broken promises and tool churn, minimizing tooling investment isn't laziness, it's a defense mechanism.

      This is a lesson I had to learn the hard way, multiple times.

      1 reply →

    • On some level it makes sense. The problem with the command line is familiarity.

      How often do you reach for iptables? If you're like myself, and most home/desktop users, then probably once in a blue moon to set it up and then you leave it alone. But a system admin? Maybe they touch it a few times a week or month. Every time I use iptables I have to relearn how Linux networking works.

      Similarly, the xset/xinput thing. When I need those tools I just create a script or throw it in .bashrc. I adjust the settings once and will not touch them again for a couple years. It makes sense to have long parameters that are readable. I can look at my .bashrc and see exactly what device is getting adjusted.

  • Long option names are more descriptive, more easily distinguished, and easier to remember. Your shell should be intelligent enough to provide tab completion for option names, assuming it is configured to.

    • > Long option names are ... easier to remember ... Your shell should be intelligent enough to provide tab completion

      They are so easy to remember that you need to configure your shell to remember them for you?

      1 reply →

    • >Your shell should be intelligent enough to provide tab completion for option names, assuming it is configured to.

      Wait, are you saying that I need to change my shell or config to make up for another tool's poor design?

      No, thanks.

    • Long option names are more difficult to remember because a long option name can be spelled multiple ways and it is difficult to remember which spelling is correct.

    • IMO, PowerShell got it right. Yeah, its syntax is strange, but it has standard flag usage with proper autocomplete, and you can shorten any flag the way you want (eg. fuzzy match) if it is unambiguous.
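
      Unambiguous shortening isn't unique to PowerShell, for what it's worth. As a sketch: Python's `argparse` accepts any unique prefix of a long option by default (that is prefix matching, not fuzzy matching). The option names below are made up, borrowed from the examples upthread.

```python
import argparse

parser = argparse.ArgumentParser(prog="demo")   # allow_abbrev defaults to True
parser.add_argument("--config-file")
parser.add_argument("--set-prop")

# "--conf" and "--set" are each a prefix of exactly one option, so both resolve.
args = parser.parse_args(["--conf", "app.ini", "--set", "accel=0"])
print(args.config_file, args.set_prop)
```

      An ambiguous prefix is rejected with an error rather than guessed at, which is the behaviour you want in scripts.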

  • These changes are meant to make it easier to read and understand command-line incantations (and to make them more explicit, which is always good), because the command-line paradigm, being text-based, imposes an unavoidable trade-off between ergonomics and understandability/ease-of-use. It sounds like you prefer ergonomics - although I wouldn't be surprised if most users would prefer ease-of-use.

    Of course, if one doesn't write a CLI to begin with, this trade-off doesn't exist - you can have your cake and eat it too.

  • imho, the fundamental problem is using space as a delimiter. Also, case-sensitivity is a disaster for ergonomics.

    If you had comma-delimiting like in an algol-derived language, you wouldn't need to quote things with spaces.

    edit: also, code is read more times than it is written, so optimizing for readability over brevity is generally a good move.

  • I've a feeling you will hate powershell

    • Needlessly long parameter/command names and the bizarre insistence on capital letters are the #1 and #2 reasons I detest PowerShell. Like GP, I resent that Linux tools are moving in that direction.

I am also that age, and kebab-case is the best case for filenames.

2021-01-01-some-important-document.pdf gives me the warm fuzzies. On the off chance that some more differentiation is needed, throw in an underscore and a whole new world opens up
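
If you want to enforce that convention mechanically, a minimal Python sketch could look like the following. `kebab` is a hypothetical helper, and it assumes ASCII-only names: any non-ASCII letter is treated as a separator, which is lossy.

```python
import re

def kebab(name: str) -> str:
    """Lowercase the stem and join runs of non-alphanumerics with single hyphens."""
    stem, dot, ext = name.rpartition(".")
    if not dot:                      # no extension present
        stem, ext = name, ""
    slug = re.sub(r"[^a-z0-9]+", "-", stem.lower()).strip("-")
    return f"{slug}.{ext.lower()}" if ext else slug

print(kebab("2021-01-01 Some Important Document.PDF"))
```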

One of the main reasons why Windows used "Program Files" and "Documents and Settings" was to force the programs (and programmers) to deal with paths with spaces. And you know, for the most part it kinda, more or less worked out although of course even today you will find programs that ask you to install them in a folder without spaces in the path.

  • And that was a good idea, if only Microsoft also fixed the CreateProcess function, Windows would be somewhat sane in this regard. But somehow nobody seemed to think of it. Seriously, look at it:

    https://docs.microsoft.com/en-us/windows/win32/api/processth...

    The arguments are a single string. So you want to pass parameters with spaces in them? You've got to add quotes and stuff all of that into a single string. Instead of doing it in a more sane manner, like oh, the arguments to main().

    • The root cause is that argv isn't a first-class citizen like on linux, but an abstraction. The kernel only cares about a single string argument. If you use main instead of WinMain, the CRT will transform the single string into an argv for you.

      Oh and cmd.exe uses a different escaping scheme than the CRT.
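
      To make the "single string" problem concrete, here is a small Python sketch. It leans on `subprocess.list2cmdline`, an undocumented helper in CPython's `subprocess` module that joins an argv list using the MS C runtime quoting rules; the argument values are made up.

```python
import subprocess

argv = ["notepad.exe", "My file.txt", 'say "hi"']
# list2cmdline applies the MS C runtime quoting rules that Python uses on Windows.
command_line = subprocess.list2cmdline(argv)
print(command_line)
```

      The receiving program only gets its arguments back intact if it splits the string with the same rules, which is exactly where cmd.exe and the CRT disagree.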

      3 replies →

    • maintaining backwards compatibility means maintaining silly decisions, and Microsoft does both.

  • The main culprit for space issues is stuff relying on BAT or CMD files, where escaping variables seems to be a black art.

    Sadly that set includes loads of Java programs. If only Sun had shipped a standard way to generate isolated exe files in 1998... but they worked under the presumption that you'd have a JVM already there, because distributing that monster was difficult in dialup times, so you could just hand people a jar; and the enterprise market did not care, since they had webapp servers. Sadly it's an "optimization" that became obsolete very quickly but wasn't rectified until it was too late (Java 9+).

    • > The main culprit for space issues is stuff relying on BAT or CMD files, where escaping variables seems to be a black art.

      Actually it isn't, just use double quotes and add a '~'. It's just about the only thing batch files handle better than shell scripts. set "VARIABLE=%~PATH"

  • That annoys me every time I use a Windows system. It was a terrible decision, especially since neither the command prompt nor the new PowerShell accepts a backslash before a space the way bash does; you have to quote the whole path! I get that most users on Windows don't use the shell, but as a developer I use it a lot, and every time it's a pain (no wonder they added WSL to Windows after the failure of PowerShell...)

    • Why would they accept a backslash? Backslash is a path separator on Windows. In most Windows programs, you don't even need to escape the space - arguments can contain spaces and it will understand it, like `notepad My file.txt`

      The escape character on PowerShell is backtick, and on cmd it is caret. You don't need to quote everything.

  • VFAT and stuff like that actually provided alternate names like PROGRA~1

    • Yes, I was doing code to quickly read FAT folders (on a micro controller) and got to the bit about filenames more than 8.3. I decided my life was too short (and processing time) to go and sort out what the "real" file name is. Enforced 8.3 as a requirement!

  • They may have thought that would happen but I saw just as much stuff end up in C:\Windows or \Users or (always my favorite) those “Documents” that are really just “whatever random crap every app wants to put there”.

  • Yet in Microsoft's own cmd tool I need to put quotes around my path if I want to refer to any files/folders below those folders.

Still way too many libraries and programs can't handle spaces in filenames.

And shells and other programs still have problems with perfectly legal characters in filenames too, like '!' or ':'.

  • Was recently encoding my Stargate: SG-1 DVDs to move them to plex. I was encoding it on a system other than what was serving it, so I had to copy it. It's surprisingly difficult to "scp" a file with a colon in it directly.

    I also love when you're using bash and you have a file with ! in the name, and you accidentally fail to correctly backslash it, you not only get "bash: !rest_of_filename: event not found", but it also fails to add that command line to the history, so you can't just hit up and fix it. You have to actually go to the mouse and copy and paste.

    • That sounds like... Puzzle time! I had to cheat, sort of, by looking at the man page:

      > Local file names can be made explicit using absolute or relative pathnames to avoid scp treating file names containing ':' as host specifiers.

      So `scp foo:bar user@host:~` fails because it tries to find the host foo. But `scp ./foo:bar user@host:~` works just fine. I feel kind of stupid for not guessing as much.
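
      If you build scp command lines from a script, the same trick can be applied automatically. A minimal sketch in Python, where `scp_local_arg` is a hypothetical helper and the file name is just an example:

```python
def scp_local_arg(name: str) -> str:
    """Prefix './' so scp does not read 'foo:bar' as host 'foo', path 'bar'."""
    if ":" in name and not name.startswith(("/", "./", "../")):
        return "./" + name
    return name

print(scp_local_arg("Stargate: SG-1.mkv"))
```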

    • Can't you usually just put quotes around the filename and/or path to prevent all those issues?

      Edit: nope, just tried it and scp still sees the quoted filename as a host + path

      1 reply →

  • If you suspect that the file might be handed to a bash script at any point, being afraid of spaces is very healthy for sure.

  • > And shells and other programs still have problems with perfectly legal characters in filenames too, like '!' or ':'.

    Without asking you to always quote and escape every file name - what alternative is there? If they tried this you'd probably find you didn't like it.

    • Not exactly - the problem is mostly when doing variable expansion. The fact that bash treats "$x" and $x as different is a bit of a design flaw. Of course there's still an issue with evaluating dynamically generated code, but that problem is partly solved by working with arrays.
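
      The quoted/unquoted difference can be shown without a shell. In this Python sketch, `shlex.split` stands in for the shell's word splitting (an approximation: it does no variable expansion or globbing) and the file name is made up.

```python
import shlex

name = "my file $x.txt"
# Unquoted interpolation: the name falls apart into several words.
unsafe = shlex.split(f"ls {name}")
# Quoted interpolation: the name survives as a single argument.
safe = shlex.split(f"ls {shlex.quote(name)}")
print(unsafe, safe)
```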

      2 replies →

  • Colons are a problem on Windows, so it's reasonable to discourage creating files with colons in the name.

  • > Still way too many libraries and programs can't handle spaces in filenames.

    "It's nothing."

    "What do you mean?"

    "It's nothing... It's empty space. I never taught the computer how to read empty space!"

    "I never taught Virgil how to fly."

I'm not young, but I've been using Macintosh computers regularly since 1990, and even back then file names could be up to 31 characters long, and could include any character except colon.¹ So I'm pretty comfortable using spaces, and sometimes even non-ASCII characters, in file names.

Also back then Mac file names typically did not include an extension, because the file's type was stored as part of the metadata in its resource fork. I remember one time a friend of mine was visiting and was playing around with a paint program on my Mac. Being used to DOS, when she went to save her file, she typed a very short name, and then asked me what the proper file extension should be. I smirked and said, "That's not how you name files on a Mac. THIS is how you name files on a Mac." And then I named her file "Ailsa's Cool Picture". Her mind was blown. :-)

¹This is because the colon was the path separator. But since the classic Mac OS had no command line interface, the typical user would never type or even see a file path written out.

  • All of that was very cool and impressive and extremely user-friendly.

    However, I found the lack of a command-line to be restricting.

    • On the other hand Mac had some great GUI programs.

      Sometimes I think that the command line is a crutch that keeps programmers from learning how to make good UIs.

      1 reply →

Well, you should still be afraid! Be very afraid! Seriously: only a few months ago I was confronted with a video encoding tool that didn't work properly when the file names contained spaces - so yes, even in 2021 it's still safer not to use spaces in file names...

Looks like I'm in the minority. I always use spaces and non-ASCII characters in filenames.

In many languages it's a requirement. For example, in Romanian, there are 8 words that collide with „fata“ if you remove the diacritics (fata, fată, fața, față, făta, făță, fâța, fâță).

Given that we have to use diacritics, spaces don't seem like a big deal.
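
The collision is easy to reproduce. A small Python sketch, assuming "removing the diacritics" means Unicode NFD decomposition followed by dropping the combining marks (`strip_diacritics` is a hypothetical helper):

```python
import unicodedata

def strip_diacritics(text: str) -> str:
    """Decompose to NFD and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

words = ["fata", "fată", "fața", "față", "făta", "făță", "fâța", "fâță"]
collapsed = {strip_diacritics(w) for w in words}
print(len(set(words)), collapsed)
```

Eight distinct file names go in and a single name comes out.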

  • > Given that we have to use diacritics, spaces don't seem like a big deal.

    There is one big difference: CLI utilities don't usually care about diacritics (though encoding issues can throw a wrench in that), but they care a lot about spaces. So putting spaces in filenames requires properly quoting or escaping parameters, whereas diacritics does not. That makes one-off shell snippets and scripts a lot more annoying (though TBH I tend to shy away from those anyway, these days).

  • So do I. I have a language, and I'm not afraid to use it. My computer should speak it just as well as I do.

    • There's a server at work that is named with a non-ASCII character. I've run into compatibility issues lots of times where I can't connect. I prefer to just use English with ASCII and be happy.

      2 replies →

  • We have a few words that depend on diacritics to be unique in Czech as well - though not as bad as this example - but people just manage without. Hell, I don't even bother installing the Czech keyboard, if I REALLY need it (like in names), I just google for words that have the character and copy it

  • So how did you deal with it in the 80s/90s?

    • Not sure about Romanian, but for many other languages people essentially came up with transliteration schemes (multiple, incompatible, ambiguous) to squeeze your language into ascii.

      The resulting text was understandable by the "computer people" but not the general population who did not use the networks back then, perhaps somewhat comparable to when some time ago USA parents encountered the "SMS slang" used by their teenagers.

    • As you would assume: use ASCII and deduce from context. Many people still do that.

      That has led to phantom diacritics: reading letters in unfamiliar words/names based on what you assume they are. For example some pronounce Chirica as Chirică because they assume someone forgot to type the breve in ă.

      1 reply →

    • Back in the day there were dozens of character sets that were alternatives to US-ASCII. Having once worked on an Email client, I needed to bake in a bunch of translation tables to convert stuff sent that way into UTF-8.

  • >In many languages it's a requirement. For example, in Romanian, there are 8 words that collide with „fata“ if you remove the diacritics

    That is what context is for.

I never put spaces, and won't go over 32 characters, preferably less than 16. even when sending a file to my grand mom. that's how deep rooted the trauma is. and yes, it remains an issue with some parsers and what not.

  • I still find files on the internet that my browser can't download because too many characters :(.

    Edit: can't save, downloading works.

    • This is a Windows-only issue AFAIK. It's the same reason why people decide to put their projects in something like C:\dev

      Apparently it's quite easy to reach the 260 chars limit
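
      A rough Python sketch of why a short root like C:\dev helps. It assumes the classic Win32 MAX_PATH of 260 characters including the terminating NUL, and it uses `len()` as an approximation of the length Windows actually measures; both example paths are invented.

```python
MAX_PATH = 260  # classic Win32 limit, counting the terminating NUL

def exceeds_max_path(path: str) -> bool:
    """True if the path would not fit in a MAX_PATH-sized buffer."""
    return len(path) + 1 > MAX_PATH

short_root = "C:\\dev\\project\\" + "a" * 200
long_root = "C:\\Users\\someone\\Documents\\Projects\\client\\" + "a" * 240
print(exceeds_max_path(short_root), exceeds_max_path(long_root))
```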

      6 replies →

If I'm going to use the file in the command line, I won't use spaces, since I don't know what sick bug I might encounter.

  • Same. For documents and stuff that I use in normiespace I give them friendly names with capitalization and spaces and such, but for anything I'm going to be working on via CLI I try to use filenames that will be easily chunked as "words" when doing things like double clicking it in terminal to select, ^w to erase it, tab completion etc.

Somehow the OneDrive clients still refuse to allow leading or trailing spaces in the filenames, along with a few other characters that are not allowed - seems to cause quite a bit of user friction at least with the non-tech guys that I work with who are confused about why OneDrive is one of the few file syncing clients that has these requirements....

  • Gdrive has the same "issue". I think it's on purpose, to avoid files that seem to have exactly the same name.

    This can cause user confusion

  • I have had to deal with that nightmare multiple times this year! It was a real head scratcher at first.

This is a UI/UX problem that I only face when dealing with shells and shell scripts. Never had any issues when spawning processes from within languages/runtimes that support sane argument arrays.

sh, bash and cmd.exe are shit. The shell needs serious rethinking.

  • This is a difference between $@ and "$@" (note the quotes):

      $ cat proba.sh 
      #!/bin/sh
      echo "Using quotes:"
      for i in "$@"; do echo "$i"; done
      echo "No quotes:"
      for i in $@; do echo "$i"; done
      $ ./proba.sh "ho ho ho"
      Using quotes:
      ho ho ho
      No quotes:
      ho
      ho
      ho
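
    For comparison, this is roughly what the "sane argument arrays" mentioned above look like in Python: with the list form of `subprocess.run` no shell is involved, so there is nothing to quote. The child process here is just a one-liner that counts its arguments.

```python
import subprocess
import sys

# Child process that just reports how many arguments it received.
count_args = "import sys; print(len(sys.argv) - 1)"

# List form: no shell involved, each element arrives as exactly one argument.
result = subprocess.run(
    [sys.executable, "-c", count_args, "ho ho ho"],
    capture_output=True, text=True, check=True,
)
print(result.stdout.strip())
```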

  • I see that there are lots of comments about problems of TAB-completions with filenames with spaces in this comment section and I am frankly puzzled: both Bash and cmd.exe actually TAB-complete those perfectly fine, inserting quoting where it's needed.

    • And where it isn't needed. If you have a path that contains a variable and a space, bash will happily escape the $, making the path invalid. See the following:

        $ cd $HOME
        $ mkdir my\ dir
        $ ls my[tab]
        $ cd /
        $ ls $HOME/my[tab]
        ls: cannot access '$HOME/my dir/': No such file or directory
      

      That error is because when you press [tab], bash changed the path to \$HOME/my\ dir/ but that isn't obvious from the output and I couldn't find a proper way to include the tab-expanded result in the transcript.

      (edit: this is on GNU bash, version 4.3.48(1)-release but I've seen this behaviour for years)

      2 replies →

    • I seem to remember bash losing preferred escaping when TAB-completing, but can't reproduce it now with 5.0.17.

      Eg. you'd type `ls -l "Spaced [TAB]` and it would turn it into `ls -l Spaced\ Name`. I remember similar annoyances with other special shell characters (eg. single quotes, dollars, slashes), but that all seems to behave sane now.

      2 replies →

    • > inserting quoting where it's needed

      You have to remind yourself to do this manually in scripts if you don't want to see lines full of "No such file or directory."

      One of the reasons the shell is broken is because the character they use as an argument array member separator is something that regular people use to distinguish between two words, such as in a file name.

      2 replies →

Posix makefiles don't support spaces in dependency names. Not sure about gmake.

Cmake doesn't support semicolons, because everything in cmake is a string, and ; is the list item separator.

PATH is separated by colons, so you can't add directories containing : to it.
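
The PATH case fits in a few lines of Python. This is a sketch of the splitting rule only; `split_path_var` is a hypothetical helper and the directory names are made up.

```python
def split_path_var(value: str, sep: str = ":") -> list[str]:
    """Split a PATH-style value the way POSIX tools do: on every separator."""
    return value.split(sep)

# The second directory is meant to be "/opt/odd:dir/bin".
entries = split_path_var("/usr/bin:/opt/odd:dir/bin")
print(entries)
```

There is no escape mechanism, so the directory containing the colon silently becomes two bogus entries.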

Every week, I encounter a user - just like I did in the 80's - who cannot explain the difference between a file and a folder.

"What do I use a folder for?", they ask, in the same breath that they request "some way to organize things logically".

The no-filesystem movement has worked hard to eradicate this scourge from user experiences, but I fear that this is the devil's work. Computer users should know what a file is, and what it's for - and they should know what a folder is for, and why they would want to create one to put their files into it ..

But yet: they don't.

It hasn't improved since the 80's. Taking away the user's responsibility to understand these things only makes computing worse. The fact that "special chars in paths" breaks things also holds this factor in place, imho.

  • > The no-filesystem movement

    Is that the movement to store all your data as an amorphous pile of crap, and then provide easy-to-use search tools to actually find the content you're looking for?

    On one hand, I really like the search tools that come from this. But I still like to actually organize my data, so I can browse it if I want to. Also, these search tools seem to only work well enough on macOS and fall flat on their face in Windows. (and no idea where Linux falls on this)

    • You had me at "amorphous pile of crap", but lost me at 'actually find the content'... ;)

      Meanwhile, I've got a single directory full of PDF files (over 60,000+) which I routinely "ls -alF | grep <search term>" for, and I've also got some PyPDF scripts for doing deeper content search - but yet I yearn for a way to automatically parse the filenames and organize things categorically into a folder tree resembling a word cloud, symbolic links and all .. one of these days ..

You think spaces are bad (and yes I'm old enough that I don't use them)... We work with a company that has forward slashes "/" in their trading name and insist on shared cloud directories involving them to be prefixed with that trading name.

As soon as you do anything programmatic in/out of these drives it all hits the fan. So I'd add to the original statement - "Avoid 'technical' companies with special characters in their name", it's just not right...

If putting spaces in file names makes you queasy, try punctuation - especially punctuation like semicolon or ampersand or single quote that's meaningful to shells and such. <shudder>

Also, emoji.

  • Or for more fun, use language specific characters, like äöüß...

    And even more fun is, when it mostly works, but then it doesn't and you notice too late.

Honestly, this still causes a lot of problems with some software. I've had friends asking for help with obscure errors that were ultimately caused by the files they were using being on a path that contains a space or special character.

Shells are indeed the main culprits for the continued fear of spaces, but not the only ones. A lot of programs that deal with "metadata" which will then generate database tables and stuff like that, still struggle when working with any sort of special character. And the same for anything that, behind the scenes, just feeds text into regexes.

  • Just this weekend I learned that the Espressif Framework doesn't like it either.

Our local development environment has evolved to a complex enough sequence of steps to set up and troubleshoot that I spent 2 weeks creating tooling that you can simply point at source checkout locations and the tool will take care of setting up that repo.

It broke on the first try on a jr hire's machine, the source checkout location was `C:\source code`.

Slightly off topic but I find myself stuck at being "please for the love of god don't use spaces in git branch names" old. Anno dazumal this might not even have been an issue and I'm just cargo culting.

  • And on that topic, git branches are case sensitive but the Windows filesystem API isn't. Git branches are materialized on the filesystem as files and directories.

    • The Windows filesystem API supports CS file- and directory names just fine.

      It can be enabled on a per-directory basis like so:

      > fsutil.exe file setCaseSensitiveInfo C:\folder enable

      NTFS has had support for this for decades now - it was designed that way to be POSIX-compliant.

      It's shoddy software that lacks support for it, not the OS or the file system.

    • If people actually abuse git branches being CS, odds are good they're also abusing CS in the repository content.

      The Linux kernel is one of the offenders: if you check it out on Windows or macOS (which supports CS but remains CI by default) you'll immediately get garbage in netfilter, because it's a habitual user of having different files with names identical but for the casing, e.g. xt_TCPMSS.h and xt_tcpmss.h.

  • I enjoy choosing fun branch names from time to time. A few of them: Russian when a user reported a typo in a Russian translation; emoji (mostly added emoji rather than pure emoji); and my personal favourite, a ~250 character diatribe about a single-character bug I was fixing (~250 after I discovered that Git’s error messages when you cause it to try to use file names too long for the file system are fairly mediocre).

Spaces breaking tab completion is still an issue, so, yeah.

ETA: not broken in a technical sense, but having to escape them isn't the best experience. So it's just easier for me to avoid spaces.

  • Where? It works fine in bash and I think most shells ….

    • That was a bit of hyperbole on my end, my bad. But you do have to escape the space, which I'm counting as a minor break.

Where I used to work they had a risk system that created directories on the Windows server that matched the book name. They had a trader that named one of his books "COM1"...
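
For anyone generating directory names from user input like that, a pre-flight check is cheap. Below is a minimal sketch in Python, assuming the commonly documented list of reserved Windows device names (CON, PRN, AUX, NUL, COM1-COM9, LPT1-LPT9); the function name `is_reserved_windows_name` is made up for illustration, and the check is not claimed to be exhaustive.

```python
# Device names Windows reserves, regardless of letter case or extension.
# Assumption: this is the commonly documented list, not an exhaustive one.
RESERVED = {
    "CON", "PRN", "AUX", "NUL",
    *(f"COM{i}" for i in range(1, 10)),
    *(f"LPT{i}" for i in range(1, 10)),
}


def is_reserved_windows_name(name):
    """Return True if a single path component collides with a device name."""
    # Strip trailing dots/spaces, then drop everything after the first dot,
    # since names such as "NUL.txt" are best avoided as well.
    stem = name.rstrip(". ").split(".", 1)[0].rstrip(" ").upper()
    return stem in RESERVED
```

With this, a book called "COM1" or "com1.txt" would be flagged before any directory is created, while "COMMODITIES" passes.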

I saw this and felt old, but then the comments in here made me realize that the fear\ is%20real.

I still find them annoying, doing lots of work on the command line. I use this hack:

  #!/usr/local/bin/sbcl --script
  (load "~/.sbclrc2")
  (require 'replace-all)
  (in-package :replace-all)


  (format t "file is ~s" (second sb-ext:*posix-argv*) (probe-file (second sb-ext:*posix-argv*)))
  (let* ((args sb-ext:*posix-argv*)
         (orig (second args))
         (newfn (if orig
                    (replace-all orig "(" "-")
                    orig))
         (newfn1 (replace-all newfn ")" "_"))
         (newfn2 (replace-all newfn1 " " "-"))
         (newfn3 (replace-all newfn2 "&" "-"))
         (newfn4 (replace-all newfn3 ":" "-")))
    (when orig
      (format t "renaming \"~a\" to \"~a\"~%" orig newfn4)
      (multiple-value-bind (new-name old-truename true-newname)
          (rename-file orig newfn4)
        (format nil "new-name ~a old-true ~a new true ~a" new-name old-truename true-newname))))

Spaces in file names break half of the shell scripts I have encountered.

And it is one of the biggest reasons I hate Unix shells as programming languages, it is a minefield. In fact I think that after a dozen lines, Perl is a better option. It has most of what shells are good at (i.e. running commands), but saner and more powerful.

  • my god, I was simply trying to loop over every file in a dir and zip it in a bash one liner. Of course, some of the inputs had spaces in the file names. What an exercise in frustration!!!
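
One way to sidestep the quoting problem for the "zip every file in a directory" task is to skip the shell entirely. The following is a rough sketch in Python, not a drop-in replacement for anyone's one-liner; the helper name `zip_each_file` and the sample file names are hypothetical.

```python
import tempfile
import zipfile
from pathlib import Path


def zip_each_file(directory):
    """Create '<name>.zip' next to every regular file in `directory`."""
    created = []
    # sorted() snapshots the listing first, so new archives are not revisited.
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.suffix != ".zip":
            target = path.with_name(path.name + ".zip")
            with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as zf:
                zf.write(path, arcname=path.name)
            created.append(target)
    return created


# Demo on a throwaway directory; one of the names contains spaces.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "holiday schedule 2021.txt").write_text("hello")
    (Path(tmp) / "plain.txt").write_text("world")
    created_names = [p.name for p in zip_each_file(tmp)]
```

Because file names are passed around as path objects rather than re-parsed as text, spaces never get a chance to split anything.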

I had a guy in my team use forward slashes in filenames. Terrible idea, caused all sorts of weird issues.

It's not a matter of being afraid, spaces in filenames are annoying.

I mostly use the shell and navigating in directories with spaces is annoying, you either have to quote it or put a \ before each space. You also have to remember to quote everything, and in bash that can become complex, you start adding quotes everywhere to solve problems caused by spaces (or other special characters like *) in filenames.

So I prefer to not use them, a simple _ is as readable as a space. Only thing is that spaces get rendered better on graphical file managers, but... that could have been solved (and can still be solved) by simply adding an option to render a _ as a space graphically if there is no ambiguity. I don't care that much since I don't use graphical file managers that much.

Maybe it's just me, but it always seemed like prohibiting spaces and other special characters was a reasonable way to avoid unnecessary complexity (and the bugs that accompany it) when parsing and navigating directory trees and files.

I'm old enough to remember working with 8.3 filenames in DOS, and while the length limitation was maddening, the space part never was. Then Windows 95 came out and all restrictions were thrown out.

Why couldn't we just have a file system that robustly supports long filenames, including variable length extensions, while prohibiting certain special characters - namely spaces, slashes or any directory denoting characters in files, and characters that have special meaning in regex context? (brackets, asterisk, etc.)

  • By coincidence, I found another reason just two days ago. A web app lists uploaded files’ names, and (in a rarely used context) lets the user search for them. One user copied a file name from the web page and pasted it into the search box, but got no results. Turned out that the file name contained two consecutive spaces, which the browser turns into a single space, hence no match. Every layer between the user and file system can do something unexpected.

I nearly gave up on learning newer front-end JavaScript stuff like React & webpack and so on a few years ago because of spaces in paths.

node-gyp doesn't like it when there's a space anywhere in your working path. Stuff I was messing around with was all in ~/Code Projects at the time, and using npm install on some things just broke. Looking back, I definitely could have done a better job parsing the error messages but still...

There's an issue but it was closed in 2018 as "The workaround is to use a path without blanks" https://github.com/nodejs/node-gyp/issues/439

Tangentially, I frequently add dates to filenames to keep things organized. And _always_ in the `YYYYMMDD` format for clarity and technical reasons; `DDMMYYYY` (or God forbid the Americans' `MMDDYYYY`) never made much sense to me.

  • I do this so often that I have an emacs macro or two that helps me out:

      (defun mdy ()
        (interactive)
        (insert (format-time-string "%04Y-%02m-%02d")))
    

    That inserts the "proper" date format (e.g., 2021-11-11) at the current point.

    Then to create a date-stamped file name:

      (defun file-mdy (file-name)
        (interactive "sbasename: ")
        (find-file (format "%s-%s.org" (format-time-string "%04Y-%02m-%02d") file-name))
        (save-buffer))
    

    And a few others.

    Nobody seems to misunderstand this date format. US folks might find it annoying, but understand what it means.

I have had a huge music library on my RAID, and naturally it had a lot of spaces, and non-ASCII, in the file names.

It's cumbersome-ish, but can be made to work.

Then there's shell injection via files containing a newline character in their name...

Spaces in filenames were a mistake to begin with.

Spaces are used to separate parameters in the command line. There's also no real need for filenames to support spaces.

  • Or, one could claim that the poor parsing of a text interface shouldn't dictate the for-human names of files, especially when an exceedingly small percentage of users deal with that text interface.

    But, of course, if you mix the abstractions of metadata (filename) with location, things won't be trivial.

  • The filename belongs to the user. Therefore, it is incumbent on the computer to adapt, not the other way around.

Even if libraries all handled it, I’d still personally avoid spaces because spaces get semantically used to separate tokens and I see file names as tokens.

Spaces in file names are a poor idea. File names are identifiers, not titles.

Let's test something: http://example.com/my silly webpage.html.

Hey look, HackerNews just broke a URL with spaces in it. And it's written in a Lisp dialect and all; it's not some Unix job cobbled together with shell, sed and awk. The language has a string data type, and strings are passed to functions without word-breaking interpolations taking place.

You know what else breaks on spaces? Basic everyday gui text manipulation.

Suppose that in a block of text we have the sentence:

> Please look for the Holiday Schedule 2021 file.

If you double click on any part of the name like Schedule, pretty much every text widget on the planet will just select only that word, and not the entire filename.

However, if you have:

> Please look for the holiday-schedule-2021 file.

There is at least a ghost of a chance that a semi-intelligent GUI can pick that out as a word.

There exist good reasons to keep identifiers as a single clump beyond just command line shells.

It's why we need encoding like %20 in URLs that never pass through a shell script.
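
To make the %20 point concrete, here is a small sketch using Python's standard `urllib.parse` module; the file name is the one from the broken link above, and the variable names are only illustrative.

```python
from urllib.parse import quote, unquote

name = "my silly webpage.html"

# quote() percent-encodes characters that are not safe in a URL path,
# so each space becomes %20 and the URL stays a single token.
encoded = quote(name)
url = "http://example.com/" + encoded

# unquote() recovers the original, human-readable name.
decoded = unquote(encoded)
```

The encoded form survives auto-linking, double-click selection and word splitting precisely because it no longer contains a space.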

I don't use spaces, because I want to be able to run ad-hoc shell one-liners when working with my data without worrying about quotation and similar stuff.

I also don't use :, as I have run into problems with both Bash and its completion and FAT FS. Unfortunately, I routinely have timestamps in filenames, so I need to use +%F-%H-%M-%S instead of simple +%F-%T.
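
The same colon-free timestamp can be produced outside of `date`. A minimal Python sketch follows, with `%F` spelled out as `%Y-%m-%d` on the assumption that the shorthand is not available on every platform's `strftime`; the helper name `stamp` is hypothetical.

```python
from datetime import datetime


def stamp(moment):
    """Format a datetime like `date +%F-%H-%M-%S`, with no colons."""
    return moment.strftime("%Y-%m-%d-%H-%M-%S")


# Fixed example value so the result is reproducible.
example = stamp(datetime(2021, 11, 11, 9, 5, 3))
```

Since every field is zero-padded and ordered from most to least significant, these names also sort chronologically in a plain directory listing.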

One thing has improved, though: I have not run into problems with ěščřžýáíé (which my language is full of) for maybe a decade, except on OpenWRT where space seems to be scarce to support non-ascii.

Edit: I now remember one problem, getting images for a website from an OS X user, which used combining characters instead of direct code points (https://en.wikipedia.org/wiki/Unicode_equivalence#Example), but HTTP requests got normalized in some browsers, leading to strange 404s.
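
That combining-character mismatch is easy to reproduce. The sketch below uses Python's `unicodedata` module and assumes that normalizing both sides to NFC before comparing is acceptable for the application; the sample names and the helper `same_name` are made up.

```python
import unicodedata

composed = "\u00e9.jpg"      # 'é' as one precomposed code point
decomposed = "e\u0301.jpg"   # 'e' followed by a combining acute accent


def same_name(a, b):
    """Compare two file names after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# The two strings render identically but differ as raw code points.
raw_equal = composed == decomposed
normalized_equal = same_name(composed, decomposed)
```

A server that looks up the raw request path would treat these as two different files, which matches the strange 404s described.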

That's funny because the first operating system I used (Apple DOS 3.3) was very liberal about file names. There was a 30-character limit which was a lot, and it didn't mind spaces in file names. Even control characters were fair game, which made things fun when you accidentally inserted a ^A in a SAVE command.

File names shouldn't have anything except a-z,0-9,_ and perhaps a -. No unicode, no spaces, no nulls.

It's not fear that keeps me from using spaces in file names, it's habit.

If we're going to play this dangerous game, from now on I'll figure out how to use nulls (\0) in my file names, and make all the C/C++ programmers cry.

I don't use spaces because it's so much faster to type filenames out (including with TAB-completion) in the terminal.

I do, however, use Cyrillic (UTF-8) in filenames, and I regularly check whether moving a file into an ASCII-only path will let some programs open it (half the time it's that when I am having trouble).

It's just such a pain in the butt to work with files with spaces. In a script it's fine b/c I just surround it in double quotes, but on the command line I hate having to escape the spaces.

This might already exist, but I wonder about a terminal that was really just a multi-line repl to a language. It would be preloaded with libraries that replicated all the features of the gnu core utils, but instead of calling grep like normal, you called a function like grep("args"). The advantage would be that you had access to a full blown programming language at all times. So when you needed to do something more complicated you would still have access to all the standard language features. And when you didn't need that, your canned core utils like functions would work

Coming from web-heavy and perl5 backgrounds, it's insane to me that people don't treat filenames and arguments and environment variables as tainted user input, and just blindly trust properties about them like "does not contain whitespace or control characters".

I had to move my development folders because you can't develop Android apps if your project path contains a space. Not sure where the issue is, if it's gradle or something else.

Edit: thinking about it again, it might not have even been the space but the exclamation mark in my path. Or both.

If any of you reading this have to deal with very large scale data pipelines for data science / ML type processing, and if "don't use spaces and weird chars in file names" hasn't become second nature by now, let me just say: you are very, very brave.

My first job as a SW Eng was in 1989 in the nuclear industry. Our folders and files were limited to 8 letters. So names were effectively acronyms. It was actually pretty awesome. Clean and concise. Years later, I still remembered the whole folder structure.

If you're in tech long enough, you can be traumatized by anything. Like the time a vendor-supplied system decided after an update that nothing could have a hyphen in the title, and a lot of existing content just... broke at once. Fun times.

Spaces in file names are a nightmare in Makefiles.

  • Not if you are careful (a bit like "$@" vs $@ in shell scripts).

    Edit: replace $@ with quoted version which actually changes the behavior (I was wrong that the difference is between $* and $@).

    • I don’t think it’s fair to claim that any Make implementation supports spaces: there are too many fundamental bugs and breakages, so that lots of rather important Make functionality is off-limits if any of your file names will have spaces.

      https://www.cmcrossroads.com/article/gnu-make-meets-file-nam... explains the situation in GNU Make in 2007 (and I don’t think it’s changed since then, though jgrahamc especially could correct me). Not being able to use such features as $^ and $(patsubst) is severely debilitating for all but the simplest of makefiles.

Not exactly spaces, but I have been bitten by something like this at my work quite recently. A Confluence page with special characters in the page title was working fine for a while. At some point there was a Confluence version update which made the page URL broken (and apparently unrecoverable, or at least not easily recoverable).

One way to look at it is that people of a certain generation eschew spaces because the tools of their formative years simply couldn't handle spaces - but another is that the olds have learned that generally erring on the side of KISS ("Keep it simple, stupid!") isn't a bad idea.

Software engineers - particularly of the more embedded variety - absolutely still have this problem.

The main culprit is GNU Make which does not cope with spaces in filenames. As far as it is concerned an array is a string separated by spaces so it gets very confused. Yes there are some partial workarounds, no none of them consistently work. You learn very quickly to check all code out in a file tree with no spaces in it, otherwise builds can randomly break in strange ways. It's not always clear up front whether Make is going to be involved somewhere in the build, so it's just easier to be safe.

My username has been my name which has an accented character and has broken countless Windows apps every year since forever, so I just keep a C:/Programs folder where I run stuff. You should never not fear filenames.

  • I use c:\programs too, but for different reasons. C:\Programs is for portable applications that don't get installed, can be directly overwritten, and consist of at most two files with relevant names. As a bonus, I can run such programs directly from the run menu. C:\Programs\procexp for Process Explorer, for example.

  • I am overly aggressive with spaces and special characters in filenames: I use them everywhere and report a bug when they cause errors, because they shouldn't in this UTF-8 age.

    I still don't use the special character of my name in my username because that has caused me many hard to fix troubles. Think "cannot recover user password because this user doesn't exist".

I recently found out that a Windows folder name can't end with a space. But with Python, for example, you can create a folder named 'example ', and every file you create in that folder will be inaccessible and impossible to delete.

I've never created a filesystem entry name with a space. Mainly because of fear, and even when the fear turns out to be unfounded, "\" looks so ugly. But I think I'm even worse, I dislike capital letters too.

Nothing old about that; lots of stuff is still broken. What are the odds Homebrew works if installed to a directory with a space in the name? Maybe the core brew manager itself, but all the packages?

I tend to follow a Postel-like system when it comes to this. When I write a script I'll usually get paranoid and make at least token efforts to handle spaces. Which I will then never, ever use.

I have come back to this thread, which I spotted and forgot about something like two days ago, to say that just a minute ago one of the new Jenkins jobs that I added failed because I named the item using a space and some custom Gradle/Maven magic tool failed to load one of its own auto-generated files (I could tell that the space was the culprit because the error message printed only the second half of the item name).

How can I not be afraid of spaces if this happens like every other day with every other custom tool ...

Let me tell you how much of a pain in the ass it is that my employer forces spaces in the corporate OneDrive directory.

PS-Microsoft is horrible about stupidly named folders being created and dumped in there.

  • Depending on the specific issue, the `subst` command may help you. If the OneDrive folder itself has a space in the path, or a necessary subfolder does, you can give that folder a drive letter instead.

And honestly, it's a good fear to have; there are contexts where it still just doesn't work.

Last I checked, the standard answer for GNU make is "Spaces are expected to break the tool, that's working as intended, it will never be fixed." And because we build our towering edifices of software on the pillars of the past, I can't guarantee to you that a project of arbitrary complexity won't try to cram a list of filenames through a make script.

I don’t think this is so much an age thing as a programmer thing. Old people will still name files all sorts of things, and a lot of young programmers today avoid spaces.

If you're developing on Windows, I find a good way of dealing with this is to convert paths to short format before using them (e.g. GetShortPathName in kernel32.dll).

Not afraid, but typing a dash in the terminal is easier and shorter than typing a reverse slash and a space. Spaces are kind of a pain in the ass in the terminal, tbh.

  • Quotes around the path is easier and avoids any issues - but tab completion and drag and drop files into terminal handles most cases for me.

This seems like a case for an axiom I hear infrequently, but I think comes up a lot - things that seem like they should be simple and easy, but are in fact difficult.

I must be a nightmare customer, because I've always been exploiting my ability to use filenames in full UTF-8. I'm that guy that sends .pdf to your website.

Why stop here? Why not put spaces in your variable names also? Allowing spaces only in file names and not in variable names is short-sighted, if not inconsistent.

My proposal for a shell on the Mac, in the late 80s, was:

- Spaces in filenames get transformed to non-breaking spaces by the filesystem;

- The filesystem treats nbsp as equal to space (just as case-folding treats A=a, B=b, etc.)

Now, argument parsing, mouse double-clicks, etc. all respect filenames as "words", and the output from things like 'ls' just work.

(Yes, I'm well aware that there are case-sensitive filesystems out there. I'd forgotten that iOS was one of those).

As a software engineer, I require testing of paths and files with spaces, and forbid the use of spaces in any system-generated file where possible, to make the CLI easier.

I do it the other way around. I used to be afraid of spaces. But I have come to realize that it is better to learn sooner than later which pieces of software are in such a bad state that they aren’t handling spaces correctly.

That being said, even after all these years I sometimes need to try a few times in order to get the quoting and the escapes right when communicating names of files with spaces through multiple layers of software.

I like to store data on USB flash drives. After being left to mature for a few years in a humidity and temperature environment, you get some really interesting and complex byte streams where your original file names used to be.

Often they are not even valid UTF8 which, when you uncork the filesystem for the first time in a decade causes the most delightful crashes. The more years the better the aroma.

I'm »still tempted to write umlauts like 'Mot"orhead' old.«

But also a "use a font that has a proper capital ß" hipster.

I'm hoping to one day be "Windows adds user root folder to the quick links in explorer by default" years old.

I always format my filesystems (macOS) as case sensitive and I'm surprised by the software that has a hard time with that.

On Unix/Linux we've grown up with case sensitive by default but everywhere else it still seems to be a problem now and again.

I should qualify this...I'm en-US so I have no idea what the experience is like for anyone else.

You need them for URLs. Running a stand-alone web page maker using Rust. Document structure:

    [Introduction](./Introduction.md)\\
    [Chapter One](./chapter one.md)\\

Crashed on trying to deal with building html when there are spaces in the file name. It is still an issue.

Today, WSL will try to add PATH in Windows to PATH in Linux. So if you install something like NodeJS in Windows, and run node in Linux, it will try to call /mnt/c/Program Files/nodejs/node.exe and say "no such file or directory: /mnt/c/Program".

I had half a feeling that the warning against using spaces in names pre-dates computing, but after a little research into library call numbers and archive accession numbers, which turn out to have both historically included spaces, I have found no evidence to support this feeling.

It seems to me that many of the problems associated with spaces in filenames are due to the OS assuming that a space signals the end of a command or filename.

Maybe we ought to have a different character signify the end of a name? Or signify an option section, or the next option section of a command?

In the shell spaces have to be escaped, which is annoying. This doesn't change with age, I think.

And I'm older than Google. If you want some hilarity, newlines are allowed in filenames as well (\n, \r, \r\n). Try getting bash to handle that! (It's possible, though annoying. Try redirecting to `while read line` in addition to `find -print0` / `xargs -0` hackery.)
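
The reason NUL-delimited lists help is that NUL is one of the few bytes that cannot occur inside a Unix file name. Here is a small Python sketch of the idea, parsing a byte string shaped like the output of `find -print0`; the sample data and the helper name `split_nul_delimited` are invented for illustration.

```python
def split_nul_delimited(data):
    """Split NUL-terminated records such as those emitted by `find -print0`."""
    return [record for record in data.split(b"\0") if record]


# Three file names: a plain one, one with a space, one with a newline.
sample = b"plain.txt\0two words.txt\0line\nbreak.txt\0"

names = split_nul_delimited(sample)

# For contrast: newline-separated output split on whitespace, which is
# roughly what an unquoted shell loop ends up doing.
naive = sample.replace(b"\0", b"\n").decode().split()
```

The NUL-based split recovers exactly three names, while the whitespace-based one produces five fragments, none of which refers to the awkward files.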

I've never had any problems with this. At this point, it's second nature for me to either use underscores for spaces, or camel caps if there aren't any single character words like 'i' or 'a' in my desired file name.

Yes, but working with filenames with spaces in them is a huge PITA in command-line tools, because you have to quote everything. The ergonomics is just really annoying.

Personally I wish console shells had chosen another delimiter than space, but here we are.

Not obeying the "Robustness Principle" in software is just poor engineering.

https://en.wikipedia.org/wiki/Robustness_principle

  • Definitely applicable here. There's no way we're going to eliminate all problems with spaces etc, so why invite trouble.

    I wouldn't say it's always poor engineering though, especially the 'liberal in what you accept' half.

    • Yes, you have a point there, but in this case would being liberal in what you accept be to accept filenames with spaces or (arguably) doing filename handling correctly (ie accept filenames with spaces)?

I'm apparently in the minority of people who know how to write shell scripts that have a chance of working correctly with filenames with spaces in them... and that's not the only reason I avoid spaces in filenames. :)

I have experienced a person using a space in a password for Windows login.

I still don't know how to process this emotionally. Either it is somehow naively really genius, or stupid.

In any case, it scares me, mostly because it is a non-IT person.

Reminds me of the time I watched a coworker's head explode when he tried to extract an archive (from a 'Nix environment) on his Windows machine and was indignant about getting duplicate filename errors.
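
Presumably those duplicates were names that differ only in letter case. A quick way to spot them before extracting is sketched below in Python; it assumes `str.casefold()` is a good enough stand-in for the filesystem's own case-folding rules, which it only approximates, and the function name `case_collisions` is hypothetical.

```python
from collections import defaultdict


def case_collisions(names):
    """Group names that become identical once letter case is ignored."""
    groups = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    # Only groups with more than one member are actual collisions.
    return [sorted(group) for group in groups.values() if len(group) > 1]


listing = ["xt_TCPMSS.h", "xt_tcpmss.h", "Makefile"]
clashes = case_collisions(listing)
```

Running such a check over an archive listing or a repository tree shows in advance which entries would overwrite each other on a case-insensitive volume.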

I work in Azure Data Factory, and there are places where a space in a name will cause you difficult to troubleshoot errors. But I can never remember where. It's not universal. So I just avoid them entirely.

So, born today, eh?—says the guy who still regularly runs into build scripts that cheerily command that they be run from directories without spaces, since that's easier than proper quoting in the script.

The meta point here is that spaces are the type of thing that work fine ... until they don't. This class of bug is best avoided entirely, especially if there is an easy workaround (not using spaces).

But it still breaks in so many situations and becomes a pain in the ass in so many other ones! I HATE people who use spaces in file names. For me it is a sign of a "deeply nontechnical person".

Oh, yeah. Me too!

Except nowadays I worry more about user names that get fed into collaborating applications (with different edit criteria) and password characters (again for systems with differing, strange edit rules.)

I name almost everything with underlines still. I think it’s a programming habit.

Although lately I have started saving my Logic Pro files with spaces, simply because I prefer it to be the name of the song as-is.

I'm “still afraid to use spaces in file names” wise, dammit!

  • I would say I'm "wise enough to not use spaces in filenames".

    It's not about fear, it's about making good decisions, and avoiding unnecessary complication.

No way I would put anything but a-z, 0-9, and underscore in any file name. Too many stupid ways it can go wrong. I guess I have very little trust in my fellow programmers!

Spaces in path are a pain for the shell autocompletion, since you have to escape them by using either "" for the whole string or use the "\ " instead.

Me too. Afraid of dashes too as they might be interpreted as minus. I use a lot of underscores __ _____ _ _ _

Weirdly, my friend hates underscores. But he's a baseball fan

I know I can put spaces in file names, but \ is one of the characters I still can't touch type, so I still hate dealing with them in the terminal.

ascii, no spaces for me

I still get issues with old one-off scripts, that still work, and where I forgot to properly quote stuff... plus the URLs are a pain in the ass with the %20s.

I wonder why "space" wasn't always simply treated as another character. To save a couple bytes back in the 50s (when it mattered) I assume?

Any shell script that uses files should use double quotes for at least the variables: `mv $1 $2` is not safe, should be `mv "$1" "$2"`
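
The difference between the two `mv` forms can be shown without touching any files. The sketch below uses Python's `shlex` module, which approximates POSIX shell word splitting; the file names are hypothetical.

```python
import shlex

src = "Holiday Schedule 2021.txt"
dst = "archive dir/schedule.txt"

# Like `mv $1 $2`: the names are pasted into the command line unquoted.
unquoted_args = shlex.split(f"mv {src} {dst}")

# Like `mv "$1" "$2"`: each name is quoted, so it stays one argument.
quoted_args = shlex.split(f"mv {shlex.quote(src)} {shlex.quote(dst)}")
```

Unquoted, `mv` would receive five operands instead of two. When calling programs from Python, passing the argument list straight to `subprocess.run` without a shell avoids the splitting step altogether.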

I'm 19 now and learned this advice from my dad growing up. Still run into situations in my IT work and programming stuff where it makes a difference.

Our tool has no issues with spaces in fields, but we still advise users not to do it because other systems OFTEN STILL DO, in the year of our lord 2021.

I try to avoid spaces and special characters because issues still happen to this day (just yesterday, I had an issue with a file with an accent in it).

My coworkers still don't quote strings in their bash scripts, even when they're paths... and yet they wonder why everything falls apart.

There was a Discussion yesterday at work about allowing quotation marks and semicolons in some user-set titles. We use Mongo. But I empathize.

I'm not "afraid" of it, I just think it's unnecessarily complicated to work with spaces in filenames on the command line.

You should be still afraid. Many commands such as Unix "xargs" don't work properly with spaces if the right flag is omitted.

Why stop at spaces?

An old prof of mine used to send emails where the subject line was always a valid identifier in C.

Hello_dear_students_where_are_your_reports_

  • That identifier is clearly too long.

    MISRA C:2004, 5.1 - Identifiers (internal and external) shall not rely on the significance of more than 31 characters.

All because we programmatically use interfaces that were intended for humans to write: command line, sql, html, email headers.

  • It's worse than that. Whitespace is a hellish invention in the world of computers: there are multiple characters that may or may not render as whitespace with no way to distinguish them by just looking at the output.

    Yet to the machine (script, shell, program, ...) it matters a lot, since u0020≠u0009≠u00A0≠u2000≠u2001, etc. whereas the aforementioned codepoints render like this: " " (and yes, that's indeed the five codepoints in that order - at least I typed them that way).

    (Ab)Using whitespace like that can lead to all sorts of funny business, not just when dealing with shell scripts and variable expansion.
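
The five code points named in the reply above can be told apart programmatically even though they look alike on screen. A short Python sketch, with made-up file names:

```python
# U+0020 space, U+0009 tab, U+00A0 no-break space, U+2000 en quad, U+2001 em quad
spaces = ["\u0020", "\u0009", "\u00a0", "\u2000", "\u2001"]

# Five file names that look nearly identical when displayed.
names = [f"a{s}b.txt" for s in spaces]

# Printable labels make the differences visible.
labels = [f"U+{ord(s):04X}" for s in spaces]

# All five count as whitespace, yet all five names are distinct strings.
all_whitespace = all(s.isspace() for s in spaces)
distinct_names = len(set(names))
```

Printing code points like this is often the quickest way to find out why two visually identical names refuse to match.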

This is why \Program Files and \Program Files (x86) exist as they do. With spaces, and strange characters, in the name.

Can someone convince me to not use spaces in music, film, and book files where they have a "standard title"?

Some react scripts freaked out on me recently because my login (and thus user folder) in windows contained a space.

I think people who use a terminal interface, regardless of OS, don't like spaces in file names. I avoid them.

I dislike constantly having to backslash escape files on the command line, so I use dashes instead.

Kids these days will say “What’s a file name?” and mean it. Typing? That’s for the olds.

Never use spaces in file names. It shouldn't depend on age, it's common sense.

Sort of related, but here's a joke: Windows 95 does support long filena~1

Anyone else totally fine with spaces in filenames? I used to rip a lot of CDs back in the day, and never had an issue with the spaces in the file names.

01 - Metallica - Metallica - For Whom the Bell Tolls.mp3

Names like that were common, and had many spaces.