Comment by wahern
10 hours ago
I find it easier to understand in terms of the Unix syscall API. `2>&1` literally translates as `dup2(1, 2)`, and indeed that's exactly how it works. In the classic unix shells that's all that happens; in more modern shells there may be some additional internal bookkeeping to remember state. Understanding it as dup2 means it's easier to understand how successive redirections work, though you also have to know that redirection operators are executed left-to-right, and traditionally each operator was executed immediately as it was parsed, left-to-right. The pipe operator works similarly, though it's a combination of fork and dup'ing, with the command being forked off from the shell as a child before processing the remainder of the line.
Though understanding it this way makes the direction of the angle bracket a little odd; at least for me it's more natural to read dup2(1, 2) as 2<1, as in make fd 2 a duplicate of fd 1, but in terms of abstract I/O semantics that would be misleading.
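A quick way to see the left-to-right rule in practice. This is only a minimal sketch: `emit` and the file names are made up for illustration, and the dup2 comments are a rough mapping, not a trace of what any particular shell does internally.

```sh
# Scratch directory for the demo files (hypothetical names).
tmp=$(mktemp -d)

# Hypothetical helper: one line on stdout, one on stderr.
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

# 1) fd 1 is pointed at the file first, then fd 2 becomes a copy of fd 1
#    (roughly: open the file onto fd 1, then dup2(1, 2)).
#    Both lines end up in the file.
emit >"$tmp/both.log" 2>&1

# 2) fd 2 becomes a copy of fd 1 *before* fd 1 is changed, so stderr keeps
#    pointing at the original stdout. Only "to stdout" goes to the file;
#    "to stderr" shows up wherever stdout was pointing.
emit 2>&1 >"$tmp/stdout_only.log"
```

Same two operators, different order, different result, which is what you'd expect once you read each one as a dup2 executed in sequence.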
This is probably one of the reasons why many find POSIX shell languages unpleasant. There is too much syntactic sugar that abstracts too much of the underlying mechanism away, to the point that we don't get it unless someone explains it. Compare this with Lisps, for example. There may be only one branching construct or one looping construct. Yet, through macros, they provide more options than regular programming languages. And this fact is not hidden from us. You know that all of them ultimately expand to a limited number of special forms.
The shell syntactical sugars also have some weird gotchas. The &2>&1 question and its answer are a good example of that. You're just trading one complexity (low level knowledge) for another (the long list of syntax rules). Shell languages break the rule of not letting abstractions get in the way of insight and intuitiveness.
I know that people will argue that shell languages are not programming languages, and that terseness is important for the former. And yet, we still have people complaining about it. This is the programmer ego and the sysadmin ego of people clashing with each other. After all, nobody is purely just one of those two.
There must be a law of system design about this, because this happens all the time. Every abstraction creates a class of users who are powerful but fragile.
People who build a system or at least know how it works internally want to simplify their life by building abstractions.
As people come later to use the system with the embedded abstractions, they only know the abstractions but have no idea of the underlying implementations. Those abstractions used to make perfect sense for those with prior knowledge, but they can also carry subtle biases that make them error-prone for uninitiated users.
make 2>&1 | tee m.log is in my muscle memory, like adding a & at the end of a command to launch a job, or ctrl+z bg when I forget it, or tar cfz (without the minus so that the order is not important). Without this terseness, people would build a myriad of personal aliases.
This redirection relies on foundational concepts (file descriptors, stdin 0, stdout 1, stderr 2) that need to be well understood when using unix. IMO, this helps to build insight and intuitiveness. A pipe is not magic, it is just a simple operation on file descriptors. Complexity exists (buffering, zombies), but not there.
Another fun consequence of this is that you can initialize otherwise-unset file descriptors this way:
It's a trick you can use if you've got a super chatty script or set of scripts, you want to silence or slurp up all of their output, but you still want to allow some mechanism for printing directly to the terminal.
The danger is that if you don't open it before running the script, you'll get an error:
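A minimal sketch of both behaviours, assuming a hypothetical `chatty` function standing in for the noisy script. The names and messages are invented for illustration.

```sh
# Hypothetical chatty script body: noise on stdout/stderr, the message we
# actually care about on fd 3.
chatty() {
    echo "noise on stdout"
    echo "noise on stderr" >&2
    echo "important message" >&3
}

# Point fd 3 at the current stdout (the terminal, when run interactively),
# then silence fds 1 and 2. Order matters: 3>&1 has to come first.
chatty 3>&1 >/dev/null 2>&1
# Only "important message" is printed.

# Make sure fd 3 is closed for the second half of the demo.
exec 3>&-

# With fd 3 not open, the write to it fails: the shell reports an error on
# stderr (wording varies by shell) and the function returns non-zero.
chatty >/dev/null || echo "chatty failed: fd 3 was not open"
```

A script that wants to be robust could test whether fd 3 is usable and fall back to stderr, but that is a design choice outside what is described here.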
Interesting. Is this just literally “fun”, or do you see real world use cases?
One of my previous use cases was enforcing ultimate or full trust of a gpg signature.
It's been a while since I implemented this, but iirc the reason was to validate that the key that signed this is actually trusted, and that the signature isn't just cryptographically valid.
You can also redirect specific file descriptors into other commands:
The aws cli has a set of porcelain for s3 access (aws s3) and plumbing commands for lower level access to advanced controls (aws s3api). The plumbing command aws s3api get-object doesn't support stdout natively, so if you need it and want to use it in a pipeline (e.g. pv), you would naively do something like
Unfortunately, aws s3api already prints the API response to stdout, and error messages to stderr, so if you do the above you'll clobber your pipeline with noise, and using /dev/stderr has the same effect on error.
You can, though, do the following:
This will pipe only the object contents to stdout, and the API response to /dev/null.
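A sketch of that redirection pattern, using a hypothetical stand-in function `fake_get_object` instead of the real aws cli so it can be run anywhere. It assumes a system that provides `/dev/fd/N` paths (Linux and macOS do).

```sh
# Hypothetical stand-in for a tool that writes its payload to a file named
# on the command line and prints a status report on stdout.
fake_get_object() {
    outfile=$1
    printf 'object contents\n' >"$outfile"
    printf '{"status": "ok"}\n'
}

# fd 3 is pointed at the pipe (the current stdout) first, then stdout is
# sent to /dev/null. The tool writes the payload to /dev/fd/3, so only the
# payload travels down the pipeline; the status report is discarded.
fake_get_object /dev/fd/3 3>&1 >/dev/null | tr 'a-z' 'A-Z'
```

Here `tr` plays the role of whatever consumes the pipeline (pv in the comment above). Errors still go to stderr, untouched.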
I have used this in the past when building shell scripts and Makefiles to orchestrate an existing build system:
https://github.com/jez/symbol/blob/master/scaffold/symbol#L1...
The existing build system I did not have control over, and would produce output on stdout/stderr. I wanted my build scripts to be able to only show the output from the build system if building failed (and there might have been multiple build system invocations leading to that failure). I also wanted the second level to be able to log progress messages that were shown to the user immediately on stdout.
It was janky and it's not a project I have a need for anymore, but it was technically a real world use case.
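For anyone curious what that pattern looks like, here is a minimal sketch. It is not the code from the linked project; `noisy_build` and `run_quietly` are hypothetical names invented for illustration.

```sh
# Hypothetical stand-in for a build step that is noisy and may fail.
# Its argument is the exit status it should return.
noisy_build() {
    echo "compiling..."
    echo "warning: something" >&2
    return "$1"
}

# Run a command with stdout and stderr captured in a log file, and replay
# the log only if the command fails. Progress messages go to fd 3, which the
# caller points at the real stdout, so they show up immediately.
run_quietly() {
    log=$(mktemp)
    echo "running: $*" >&3
    if "$@" >"$log" 2>&1; then
        status=0
    else
        status=$?
        cat "$log"
    fi
    rm -f "$log"
    return "$status"
}

# Success: only the progress line is shown.
run_quietly noisy_build 0 3>&1

# Failure: progress line, then the captured output.
run_quietly noisy_build 1 3>&1 || echo "build failed"
```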
Red Hat and other RPM-based distributions recommended that kickstart scripts use tty3 via a similar method.
Multiple levels of logging, all of which you want to capture but not all in the same place.
Yep, there's a strong unifying feel between the Unix api, C, the shell, and also say Perl.
Which is lost when using more modern languages or languages foreign to Unix.
Python too, under the hood; a lot of its core still dates from when it started as a quick way to do unixy/C things.
And just like dup2 allows you to duplicate into a brand new file descriptor, shells also allow you to specify bigger numbers so you aren’t restricted to 1 and 2. This can be useful for things like communication between different parts of the same shell script.
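A minimal sketch of that idea, assuming a scratch file as the channel. The descriptor numbers 5 and 6, the `produce` function, and the message are arbitrary choices for illustration.

```sh
# Open fd 5 for writing and fd 6 for reading on the same scratch file.
# POSIX only guarantees descriptors 0-9 in redirections; some shells,
# bash among them, accept larger numbers.
scratch=$(mktemp)
exec 5>"$scratch" 6<"$scratch"

# One part of the script reports a result on fd 5 and leaves stdout alone...
produce() {
    echo "visible on stdout"
    echo "result=42" >&5
}
produce

# ...and another part reads it back from fd 6.
read -r line <&6
echo "received: $line"

# Close both descriptors and clean up.
exec 5>&- 6<&-
rm -f "$scratch"
```

Because fd 5 and fd 6 come from separate opens, they have independent file offsets, so the reader starts at the beginning of the file regardless of how much the writer has written.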
Haha, I'm even more confused now. I have no idea what dup is...
There are a couple of ways to figure out.
open a terminal (OSX/Linux) and type:
open a browser window and search for:
Both will bring up the man page for the function call.
To get recursive, you can try:
(the unix is important, otherwise it gives you manly men)
you may also consider gnu info
> otherwise it gives you manly men
That's only just after midnight [1][2]
[1] - https://www.youtube.com/watch?v=XEjLoHdbVeE
[2] - https://unix.stackexchange.com/questions/405783/why-does-man...
I find it very intuitive as is
Respectfully, what was the purpose of this comment, really?
And I also disagree, your suggestion is not easier. The & operator is quite intuitive as it is, and conveys the intention.
Perhaps it is intuitive for you based on how you learned it. But their explanation is more intuitive for anyone dealing with low level stuff like POSIX-style embedded programming, low level unix-y C programming, etc, since it ties into what they already know. There is also a limit to how much you can learn about the underlying system and its unseen potential by learning from the abstractions alone.
> Respectfully, what was the purpose of this comment, really?
Judging by its replies alone, not everyone considers it purposeless. And even though I know enough to use shell redirections correctly, I still found that comment insightful. This is why I still prefer human explanations over AI. They often contain information you didn't think you needed. HN is one of the sources of the gradually dwindling supply of such information. That comment is still on-topic. Please don't discourage such habits.