Comment by collinfunk

14 days ago

Hi, I am one of the maintainers of GNU Coreutils. Thanks for the article, it covers some interesting topics. In the little Rust that I have used, I have felt that it is far too easy to write TOCTOU races using std::fs. I hope the standard library gets an API similar to openat eventually.
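
For illustration, here is the kind of check-then-use pattern that std::fs makes easy to write and that an openat-style API is meant to avoid (a minimal sketch with a made-up function name, not code from either implementation of coreutils):

  use std::fs::{self, File};
  use std::io;

  // Racy: the path can be swapped for a symlink between the check and the open.
  fn open_if_not_symlink(path: &str) -> io::Result<File> {
      let meta = fs::symlink_metadata(path)?; // check
      if meta.file_type().is_symlink() {
          return Err(io::Error::new(io::ErrorKind::Other, "refusing to follow symlink"));
      }
      File::open(path) // use: the TOCTOU window is here
  }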

I just want to mention that I disagree with the section titled "Rule: Resolve Paths Before Comparing Them". Generally, it is better to make calls to fstat and compare the st_dev and st_ino. However, that was mentioned in the article. A side effect that seems less often considered is the performance impact. Here is an example in practice:

  $ mkdir -p $(yes a/ | head -n $((32 * 1024)) | tr -d '\n')
  $ while cd $(yes a/ | head -n 1024 | tr -d '\n'); do :; done 2>/dev/null
  $ echo a > file
  $ time cp file copy

  real 0m0.010s
  user 0m0.002s
  sys 0m0.003s
  $ time uu_cp file copy

  real 0m12.857s
  user 0m0.064s
  sys 0m12.702s

I know people are very unlikely to do something like that in real life. However, GNU software tends to work very hard to avoid arbitrary limits [1].

Also, the larger point still stands, but the article says "The Rust rewrite has shipped zero of these [memory safety bugs], over a comparable window of activity." However, this is not true [2]. :)

[1] https://www.gnu.org/prep/standards/standards.html#Semantics
[2] https://github.com/advisories/GHSA-w9vv-q986-vj7x

Indeed, std::fs suffers from being a lowest common denominator. Rust had to have something at 1.0, and unfortunately it stayed like that.

Rust uutils would be a good place to design a more foolproof replacement for Rust's std::fs API.
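
For concreteness, the fstat-based identity check from the parent comment is already expressible on top of today's std::fs (a minimal sketch, Unix-only; same_file is just an illustrative name, not an existing API):

  use std::fs::File;
  use std::io;
  use std::os::unix::fs::MetadataExt;

  // Compare (st_dev, st_ino) of the already-open files instead of comparing
  // resolved path strings, so a rename or symlink swap after open can't fool us.
  fn same_file(a: &File, b: &File) -> io::Result<bool> {
      let (ma, mb) = (a.metadata()?, b.metadata()?); // fstat(2) on the open fds
      Ok(ma.dev() == mb.dev() && ma.ino() == mb.ino())
  }

A more foolproof std::fs successor would presumably bake checks like this (and directory-relative opens) into the API rather than leaving them to each caller.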

  • Unix embodies this, as well.

    When K&R created Unix and C, there was still the option of moving functionality that was better off in the "kernel" into the kernel.

    Now we have "standards" that even cause headaches between Linux and the BSDs.

    Linux back-propagates stuff like mmap, io_uring, etc. to where it belongs. In this way it is like the original Unix, and it deservedly runs on most servers out there.

First of all, thank you for presenting a succinct take on this viewpoint from the other side of the fence from where I am at.

So how can I learn from this? (Asking very aggressively, especially for Internet writing, to make the contrast unmistakable. And contrast helps with perceiving differences and mistakes.) (You also don’t owe me any of your time or mental bandwidth, whatsoever.)

So here goes:

Question 1:

How come "speed", "performance", race conditions and st_ino keep getting brought up?

Speed (latency), physically writing things out to storage (sequentially, atomically (ACID), all of HDD NVME SSD ODD FDD tape, "haskell monad", event horizons, finite speed of light and information, whatever) as well as race conditions all seem to boil down to the same thing. For reliable systems like accounting the path seems to be ACID or the highway. And "unreliable" systems forget fast enough that computers don’t seem to really make a difference there.

Question 2:

Does throughput really matter more than latency in everyday application?

Question 3 (explanation first, this time):

The focus on inode numbers is at least understandable with regards to the history of C and unix-like operating systems and GNU coreutils.

What about this basic example? Just make a USB thumb drive "work" for storing files (ignoring NAND flash decay and USB itself), without getting tripped up by libc IO buffering, fflush, kernel buffering (Hurd, if you prefer it over Linux or FreeBSD), or more than one application running on a multi-core and/or time-sliced system (to really weed out single-core CPUs running only a single user-land binary with blocking IO).

  • Coreutils are not only used in interactive contexts. They are the primitives that make up the countless shell scripts which glue systems together. Any edge case will be encountered and the resulting poor performance will impact somebody, somewhere.

    Here's a related example of what happens when you change a shell primitive's behavior - even interactively. Back in the 2000s, Linux distributions started adding color output to the ls command via a default "alias ls=/bin/ls --color=auto". You know: make directories blue, symlinks cyan, executables purple; that kind of thing. Somebody thought it would be a nice user experience upgrade.

    I was working at a NAS (NFS remote box) vendor in tech support. We frequently got calls from folks who had just switched to Linux from Solaris, or had just moved their home directories from local disk to NFS. They would complain that listing a directory with a lot of files would hang. If it came back at all, it would be in minutes or hours! The fix? "unalias ls". Because calling "/bin/ls" would execute a single READDIR (the NFS RPC), which was 1 round-trip to the server and only a few network packets; but calling "/bin/ls --color=auto" would add a STAT call for every single file in the directory to figure out what color it should be - sequentially, one-by-one, confirming the success of each before the next iteration. If you had 30,000 files with a round-trip time of 1ms that's 30 seconds. If you had millions...well, either you waited for hours or you power-cycled the box. (This was eventually fixed with NFSv3's READDIRPLUS.)

    Now I'm sure whoever changed that alias did not intend it, but they caused thousands of people thousands of hours of lost productivity. I was just one guy in one org's tech support group, and I saw at least a dozen such cases, not all of which were lucky enough to land in the queue of somebody who'd already seen the problem.

    So I really appreciate GNU coreutils' commitment to sane behavior even at the edges. If you do systems work long enough, you will ride those edges, and a tool which stays steady in your hand - or script - is invaluable.

  • > Does throughput really matter more than latency in everyday application?

    In my experience latency and throughput are intrinsically linked unless you have the buffer-space to handle the throughput you want. Which you can't guarantee on all the systems where GNU Coreutils run.

    • Higher throughput increases the risk of high latency.

      Low latency increases the risk of "wasted cycles", i.e. lowers (machine) throughput. Helps with human discovery throughput, though.

      The sled.rs people had a very readable take on this in their performance guide.
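
      One standard way to make that coupling concrete (not from the sled guide, just the usual queueing identity, Little's law) is that work in flight equals throughput times average latency, so the buffer space you need grows with both. A toy calculation with made-up numbers:

        fn main() {
            let throughput = 50_000.0; // operations completed per second
            let latency = 0.002;       // 2 ms average latency per operation
            // Little's law: average operations in flight = throughput * latency,
            // i.e. roughly the buffer space the system must hold at any instant.
            let in_flight = throughput * latency;
            println!("~{in_flight} operations in flight");
        }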

  • > Question 2:

    > Does throughput really matter more than latency in everyday application?

    IME as a user, hell yes

    When getting a video I don't mind if it buffers for a moment, but once it starts I need all of that data moving to my player as quickly as possible

    OTOH if there's no wait, but the data is restricted (the amount coming to my player is less than the player needs to fully render the images), the video is "unwatchable"

    • I don't mean to nitpick, but the absolute values of both matter much less than how they compare to "enough". As long as the throughput is enough to prevent the video from stuttering, it doesn't matter if the data is moved to your video player program at 1 GB/s or 1 TB/s. Conversely, you say you don't mind if a video buffers for a moment but I'm willing to bet there's some value of "a moment" where it becomes "too long". Nobody is willing to wait an hour buffering before their video starts.

      The perception of speed in using a computer is almost entirely latency driven these days. Compare using `rg` or `git` vs loading up your banking website.

    • Hell no.

      Linux desktop (and the kernel) felt awful for such a long time because everyone was optimizing for server and workstation workloads. It's the reason CachyOS (and before that Linux Zen and Liquorix) are a thing.

      For good UX, you heavily prioritize latency over throughput. No one cares if copying a file stalls for a moment or takes 2 seconds longer if that ensures no hitches in alt-tabbing, scrolling, or mouse movement.


    • What's every day?

      Exactly, lots of different things.

      When I alt-tab I care about latency.

      When I ssh I care about latency.

      When I download a 25GB game I care about throughput for the download to a certain extent that is probably mainly ISP bound rather than local system bound. I don't care if the download takes 10 or 11 minutes as long as I can still use my system with zero delays meanwhile. And whether it takes 11 minutes or 3 hours depends on my ISP mostly. But being responsive to me while it downloads is local latency bound.

      The Youtube example you have makes sense, sure.

    • This isn't what prioritizing throughput actually looks like in most scenarios.

      In the example you gave, the read speed the user needs to keep up with a video is meager, and greater read speed is meaningless beyond maintaining a small buffer.

      You in fact notice it more if your process is sometimes starved of CPU, IO, or memory, or is waiting on swap, etc. Conversely, you would in most cases not notice nearly so much if the entire thing got slower, even much slower, as long as its meager resources were quickly available to the thing you are doing right now.

  • Just want to point out that race conditions are a correctness problem, not a performance problem.

    • An accurate, a.k.a. "correct", implementation of ACID needs a single (central) source of truth and temporal serializability (or something close to that).

      In practice this always "impacts" performance.

      If I understand it correctly, then in physics this is called an event horizon.


Sorry, complete noob here. Why didn't you just cd into $(yes a/ | head -n $((32 * 1024)) | tr -d '\n')? Why do you need to use the while loop for cd?

EDIT: got it. -bash: cd: a/a/a/....../a/a/: File name too long

  • No need to apologize at all. Doing it in one cd invocation would fail since the file name is longer than PATH_MAX. In that case passing it to a system call would fail with errno set to ENAMETOOLONG.

    You could probably make the loop more efficient, but it works well enough. Also, some shells won't let you descend into directories that deep at all. It doesn't work in mksh, for example.
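
    A minimal sketch of the same idea in Rust, assuming the deep a/a/.../a tree created by the mkdir above already exists: one chdir on the full path is rejected with ENAMETOOLONG before the kernel even looks at the directories, while descending 1024 components at a time keeps each argument under PATH_MAX.

      use std::env;

      fn main() {
          let whole = "a/".repeat(32 * 1024); // ~64 KiB, far over PATH_MAX (4096 on Linux)
          let chunk = "a/".repeat(1024);      // ~2 KiB, comfortably under the limit

          // A single syscall on the full path: chdir(2) fails with ENAMETOOLONG.
          assert!(env::set_current_dir(&whole).is_err());

          // Chunked descent, the same trick as the `while cd ...` shell loop.
          while env::set_current_dir(&chunk).is_ok() {}
      }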

I don't know if you're aware, but there is a demonstration of wget (a fellow "gnu utility", right?) being auto-translated to a memory-safe subset of C++ [1]. Because the translation essentially does a one-for-one substitution of potentially unsafe C elements with safe C++ counterparts that mirror the behavior, it should be much less susceptible to introducing new bugs and behavior changes than a rewrite would be.

With a little cleaning-up of the original code, the code translation ends up being fully automatic and so can be used as a build step to produce (slightly slower) memory-safe executables from the original C source.

[1] https://duneroadrunner.github.io/scpp_articles/PoC_autotrans...

  • Filesystem access is mostly treated by users as serialized ACID transactions on "files in directories."

    "Managing this resource centrally" is where unix syscalls came from. An OS kernel can be used like a specialized library for ACID transactions on hardware singletons.

    People then got fancy with virtual memory, interrupts, signals, time-slicing, re-entrancy, thread-safety, and injectivity.

    It doesn't matter whether you call the "kernel library" from C, C++, Fortran, BASIC, Golang, bash, Rust, etc.

Probably a dumb question, but is GNU Coreutils interested in / planning on doing its own Rust rewrite?

  • At the current moment I would be against it. The language and library are changing too fast. Also, Rust has some other things that make it hard to use for coreutils. For example, Rust programs always call signal(SIGPIPE, SIG_IGN) or equivalent code before main(). There is no stable way to get the longstanding behavior of inheriting the signal action from the parent process [1]. This is quite annoying, but not unique to Rust [2].

    [1] https://doc.rust-lang.org/beta/unstable-book/compiler-flags/...
    [2] https://www.pixelbeat.org/programming/sigpipe_handling.html
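
    For reference, the workaround most Rust command-line programs use today looks roughly like the sketch below (using the libc crate). Note that it forces SIG_DFL rather than recovering whatever disposition the parent process had, which is exactly the gap described above.

      fn main() {
          // Undo the SIG_IGN that the Rust startup code installs for SIGPIPE, so
          // writes to a closed pipe terminate the process like a traditional utility.
          // Done before any threads are spawned or any I/O is performed.
          unsafe {
              libc::signal(libc::SIGPIPE, libc::SIG_DFL);
          }

          // ... rest of the utility ...
      }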

    • I think the concern is that the writing may be on the wall for (the current memory-unsafe version of) Coreutils. Despite the bugs and incompatibilities, Canonical seems to have decided that the memory safety of uutils is worth it. And those two downsides, the bugs and incompatibilities, will likely attenuate quickly, compelling the other distros to follow suit in adopting uutils before long.

      So the continued popularity of Coreutils might, I think, depend on Coreutils' near-term publicly announced and actual memory safety strategy. As I suggested in my other comment, there are (somewhat nascent) options for memory safety that do not require a rewrite of the code base. (For Linux x86_64 platforms, depending on your requirements, that might include the "fanatically compatible" Fil-C.) And given the high profile of Coreutils, there are likely people willing to work with the Coreutils team to help in the deployment of those memory safety options.

  • Thomas Jefferson famously said that "A coreutils rewrite every now and again is a good thing". Or something like that.

    When I was a beta tester for System Vr2 Unix, I collected as many bug reports as possible from Usenet (I used the name "the shell answer man". Looking back I conclude that arrogance is generally inversely proportional to age) and sent a patch for each one I could verify. Something like 100 patches.

    So if this rust rewrite cleans up some issues, it's a good thing.

  • The rewrite in Rust is mostly vanity and marketing, not based on a real technical need...

    So I don't see why they would want to do that.

    • Canonical's usage of uutils is likely for marketing. But the codebase itself was developed for fun, as an excuse for people to have a hands-on way to learn Rust back before Rust was even released, with a minor justification as being cross-platform. From the original README in 2013:

      Why?

      ----

      Many GNU, linux and other utils are pretty awesome, and obviously some effort has been spent in the past to port them to windows. However those projects are either old, abandonned, hosted on CVS, written in platform-specific C, etc.

      Rust provides a good platform-agnostic way of writing systems utils that are easy to compile anywhere, and this is as good a way as any to try and learn it.

      https://github.com/uutils/coreutils/blob/9653ed81a2fbf393f42...


    • I thought it was a learning exercise, and maybe some corporations also like it because it has more permissive licensing.

I see even the coreutils maintainers find themselves needing -n (no newlines) and -c (count) options to "yes".

  • GNU coreutils is known for adding command line options.

    One of the big philosophical differences from the BSDs.

    For a human being, it sucks both ways.

> the article says "The Rust rewrite has shipped zero of these [memory safety bugs], over a comparable window of activity." However, this is not true

That bug got fixed before the Ubuntu release, and is from way before Canonical was even involved with the project.

  • The original article's list of GNU CVEs included a buffer overrun in tail from 2021, so for a fair comparison 2021 is part of the "window of activity" (the year the uu_od CVE was published).

To be fair, the Vec::set_len bug in Rust was in 2021. And even then it had to be annotated as `unsafe`. It was then deprecated and a linter check was added: https://github.com/rust-lang/rust-clippy/issues/7681

  • To be even fairer, it wasn't actually memory unsafety, it was "just" unsoundness: there was a type where, IF you gave it an io reader implementation that was weird, that implementation could see uninit data or expose uninit data elsewhere. But the only readers actually used were well-behaved readers.

  • Vec::set_len is by no means deprecated. The lint you linked only covers a very specific unsound pattern using set_len.

    • Indeed, and it doesn't need to be deprecated, because it's an API explicitly designed to give you low-level control where you need it, and because it is appropriately defined as an `unsafe` function with documented safety invariants that must be manually upheld in order for usage to be memory-safe. The documentation also suggests several other (safe) functions that should be used instead when possible, and provides correct usage examples: https://doc.rust-lang.org/std/vec/struct.Vec.html#method.set... .
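
      To make the distinction concrete, here is a sketch of the unsound pattern that lint targets next to the safe shape the docs steer you toward (illustrative only, not the actual uu_od code):

        use std::io::Read;

        // Unsound pattern: set_len exposes uninitialized bytes to an arbitrary
        // Read impl, which is free to inspect them instead of overwriting them.
        fn read_n_unsound<R: Read>(r: &mut R, n: usize) -> std::io::Result<Vec<u8>> {
            let mut buf = Vec::with_capacity(n);
            unsafe { buf.set_len(n) }; // buf now claims to hold n uninit bytes
            r.read_exact(&mut buf)?;   // a misbehaving reader can observe them
            Ok(buf)
        }

        // Safe shape: initialize the buffer first, then hand it to the reader.
        fn read_n_safe<R: Read>(r: &mut R, n: usize) -> std::io::Result<Vec<u8>> {
            let mut buf = vec![0u8; n];
            r.read_exact(&mut buf)?;
            Ok(buf)
        }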
