
Comment by mananaysiempre

4 years ago

The standard is, indeed, excessively vague because it was written to let many existing implementations be conformant as is, though I’d say it’s still more helpful than many other standards with that deficiency. There’s a method to it, however:

- Things installed in /, if it’s different from /usr, are generally not to be touched;

- Things installed in /usr are under the distro’s purview or otherwise under a package manager, and modifying them by hand risks confusing it;

- Things installed in /usr/local are unmanaged one-offs under the admin’s purview (there are always some, but overuse will lead to anarchy);

- Things installed in /opt are for whatever is so foreign and hopeless in not conforming to the usual factoring that you just give up and put it in its own little padded cell (hello, Mathematica);

- Everything is generally configured using files in /etc, possibly with the exception of some of the special snowflakes in /opt; the package manager will put config files meant to be edited there and expect the admin to merge any changes in manually, and sometimes put default settings meant to be overridden by them in /usr/share (see below)—both approaches can be problematic, but the difficulty is with migrating configuration in general, not the FHS as such.
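To make the rules of thumb above concrete, here is a rough Python sketch that maps an absolute path to whoever is nominally responsible for it. The function name `purview` and the label strings are my own shorthand for the bullets above, not anything the FHS defines, and real systems have more hierarchies (/var, /home and so on) that this ignores.

```python
from pathlib import PurePosixPath

# Longest prefix first, so that /usr/local wins over /usr and / is the fallback.
# The labels are informal summaries, not FHS terminology.
HIERARCHIES = [
    ("/usr/local", "admin: unmanaged one-offs"),
    ("/usr", "distro or package manager"),
    ("/opt", "self-contained foreign software"),
    ("/etc", "configuration, edited by the admin"),
    ("/", "base system, generally not to be touched"),
]

def purview(path: str) -> str:
    """Return an informal label for who 'owns' the given absolute path."""
    p = PurePosixPath(path)
    for prefix, owner in HIERARCHIES:
        root = PurePosixPath(prefix)
        # Match on whole path components, so /usrx does not count as /usr.
        if p == root or root in p.parents:
            return owner
    raise ValueError("expected an absolute path")

if __name__ == "__main__":
    for example in ("/bin/sh", "/usr/bin/env", "/usr/local/bin/frob", "/opt/Mathematica"):
        print(example, "->", purview(example))
```

The only design point worth noting is the ordering: the check has to try the most specific hierarchy first, which mirrors how an admin reads a path.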

There used to be additional hierarchies like /usr/X11R6, and even a /usr/etc on some (non-Linux?) systems, but AFAIU everyone agrees their existence makes no sense (anymore?), so much that even FHS doesn’t lower itself to permitting them.

The distinction between / and /usr might appear to be pointless as well, and nowadays it might be (some distros symlink them together), but previously (especially before initial ramdisks were widespread) stuff in / was whatever was needed to bring up the system enough that it could netmount a shared /usr.

Inside each of /, /usr and /usr/local there is bin for things that are supposed to be directly executable, whether binary or a script and all in a single place; share and lib for other portable and non-portable (usually but not necessarily text and binary) shared files, respectively, segregated by application or purpose; finally, due to the dominance of C ABIs and APIs on Unices, the top level of lib also hosts C and C++ library files and there’s an additional directory called include for the headers required to use them. Some people also felt that putting auxiliary executables (things like cc1, the first pass of the C compiler) inside lib was awkward so they created libexec for that purpose, but I don’t think the distinction turned out to be particularly useful so not all distros maintain it.
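As an illustration of that factoring, a small Python sketch of where the pieces of one package would land under a given hierarchy. The package name `frob`, the function `install_layout` and the exact file names (`libfrob.so`, `frob.h`) are made up for the example; actual packages vary, and as noted not every distro uses libexec.

```python
from pathlib import PurePosixPath

def install_layout(prefix: str, app: str) -> dict[str, PurePosixPath]:
    """Sketch of the per-purpose split for a hypothetical package `app`."""
    root = PurePosixPath(prefix)
    return {
        # Directly executable things, binary or script, all in one place.
        "executable": root / "bin" / app,
        # Portable shared files, segregated by application.
        "portable data": root / "share" / app,
        # Non-portable shared files, segregated by application.
        "non-portable data": root / "lib" / app,
        # C and C++ libraries live at the top level of lib.
        "C library": root / "lib" / f"lib{app}.so",
        # Headers needed to use those libraries.
        "C headers": root / "include" / f"{app}.h",
        # Auxiliary executables, on systems that keep libexec.
        "auxiliary executables": root / "libexec" / app,
    }

if __name__ == "__main__":
    for kind, path in install_layout("/usr/local", "frob").items():
        print(f"{kind:24} {path}")
```

Swapping the prefix between /usr and /usr/local changes who is responsible for the files but not the shape of the tree, which is the whole point.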

That’s it, basically. There are subtler but logical points (files vs subdirectories in /etc) and things people haven’t found an obviously superior solution for (multilib and cross environments), and I made no attempt to be historically accurate (the original separation of / and /usr happened for intensely silly reasons), but those are the fundamental principles of the system, and I feel it does make sense as a coherent implementation of a particular design. Other designs are possible (separation by application or package not purpose, Plan 9-ish overlays, NixOS’s isolated environments), but that’s a discussion on a different level; the point is that this one is at the very least internally consistent.

Re the unfriendly names ... I honestly don’t know. Newbie-friendliness matters, but it’s not the only thing that does; particularly in a system intended for interactive text-mode use, concise names have a quality of their own. There’s a reason I’m more willing to reach for curl and jq rather than for httpx and lxml, for regular expressions rather than for Parsec, and even for cmd.exe, as miserable as it is, rather than for PowerShell.

I feel weird that no HCI people seem to have seriously considered the tension between interactive and programmatic environments and what the text-mode user’s experience in Unix says about it, but even Tcl, which is in many ways a Bourne shell done right, loses something in casual REPL use when it eliminates (as far as idiomatic libraries are concerned) short switches. Coming up with things like rsync -avz or objdump -Ctsr is not very pleasant initially, but I certainly wouldn’t want to type out the longhand form that would be the only possible one in most programming languages (even if I find their syntax beautiful, e.g. Smalltalk/Self).
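For a sense of the difference, here is a tiny Python sketch that spells out what those two switch clusters stand for. The long option names are the ones rsync and GNU objdump document for these letters; the `expand` helper itself is just something I made up for illustration and handles nothing beyond a single cluster of argument-less flags.

```python
# Short-to-long mappings for the flags mentioned above only.
RSYNC_LONG = {"a": "--archive", "v": "--verbose", "z": "--compress"}
OBJDUMP_LONG = {
    "C": "--demangle",
    "t": "--syms",
    "s": "--full-contents",
    "r": "--reloc",
}

def expand(cluster: str, table: dict[str, str]) -> list[str]:
    """Expand a cluster like '-avz' into its long options, one per letter."""
    return [table[letter] for letter in cluster.lstrip("-")]

if __name__ == "__main__":
    print("rsync", " ".join(expand("-avz", RSYNC_LONG)))
    print("objdump", " ".join(expand("-Ctsr", OBJDUMP_LONG)))
```

Five characters against roughly forty for objdump, which is about the ratio I have in mind when I say concision has a quality of its own at the prompt.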

>the original separation of / and /usr happened for intensely silly reasons

As I recall, there were very good reasons for separating / and /usr (as well as /home and /var). The biggest one was that various Unix kernels would panic[0] if / was full. But that issue was almost universally fixed by 1990 or so.

And netmounts of pretty much everything other than / were pretty common for many years, due to the high cost of storage.

So no, the reasons weren't silly, they just don't apply to more modern systems.

[0] https://en.wikipedia.org/wiki/Kernel_panic

  • OK, I didn’t put this completely correctly. The original separation of /usr to hold user home directories (!) and / to hold everything else was because the first RK05 disk ran out, but it makes sense in any case. The additional hierarchy under /usr was created some time later when space on the first RK05 disk ran out again, and while this can be a perfectly sensible decision for a single installation on a single site, taking it seriously decades later is silly. Neither does that mean that there weren’t good reasons the split got preserved in subsequent systems, just that they couldn’t have been the same as the original ones; there are no netmounts in V6, after all.

    (I have an old Unix intro book that describes /usr as user home directories, the rest is a second-hand retelling[1].)

    [1] http://lists.busybox.net/pipermail/busybox/2010-December/074...

Thank you for the thoughtful reply; the point about netmounting a shared /usr makes it much easier to understand.