← Back to context

Comment by Someone

14 days ago

If you want to support file I/O in the standard library, you have to choose _some_ API, and that either is limited to the features common to all platforms, or it covers all features, but call that cannot be supported return errors, or you pick a preferred platform and require all other platforms to try as hard as they can to mimic that.

Almost all languages/standard libraries pick the latter, and many choose UNIX or Linux as the preferred platform, even though its file system API has flaws we’ve known about for decades (example: using file paths too often) or made decisions back in 1970 we probably wouldn’t make today (examples: making file names sequences of bytes; not having a way to encode file types and, because of that, using heuristics to figure out file types. See https://man7.org/linux/man-pages/man1/file.1.html)

You have to choose something, and I'm glad they didn't go with the idiotic Go approach ("every path is a valid UTF-8 string" or we just garble the path at the standard library level"). You can usually abstract away platform weirdness at the implementation level, but programming on non-Unix environments it's more like programming against cygwin.

A standard library for files and paths that lacks things like ACLs and locks is weirdly Unixy for a supposedly modern language. Most systems support ACLs now, though Windows uses them a lot more. On the other hand, the lack of file descriptors/handles is weird from all points of view.

Had Windows been an uncommon target, I would've understood this design, but Windows is still the most common PC operating system in the world by a great margin. Not even considering things like "multile filesystem roots" (drive letters) "that happen to not exist on Linux", or "case insensitive paths (Windows/macOS/some Linux systems)" is a mistake for a supposedly generic language, in my opinion.

  • As far as I can tell from Microsoft's documentation, WinAPI access for ACLs was added in Windows 10, which Rust 1.0 predates. And std::fs attempts to provide both minimalist and cross-platform APIs, which in practice means (for better or worse) it's the lowest common denominator between Windows and Unix, with the objective being that higher-level libraries can leverage it as a building block. From the documentation for std::fs:

    "This module contains basic methods to manipulate the contents of the local filesystem. All methods in this module represent cross-platform filesystem operations. Extra platform-specific functionality can be found in the extension traits of std::os::$platform."

    Following its recommendation, if we look at std::os::windows::fs we see an extension trait for setting Windows-specific flags for WinAPI-specific flags, like dwDesiredAccess, dwShareMode, dwFlagsAndAttributes. I'm not a Windows dev but AFAICT we want an API to set lpSecurityAttributes. I don't see an option for that in std::os::windows::fs, likely complicated by the fact that it's a pointer, so acquiring a valid value for that parameter is more involved than just constructing a bitfield like for the aforementioned parameters. But if you think this should be simple, then please propose adding it to std::os::windows::fs; the Rust stdlib adds new APIs all the time in response to demand. (In the meantime, comprehensive Windows support is generally provided by the de-facto standard winapi crate, which provides access to the raw syscall).

    • > WinAPI access for ACLs was added in Windows 10

      I'm not sure which docs you mean but that's not true. The NT kernel has used ACLs long before rust was invented. But it's indeed true that rust adds platform-specific methods based on demand. The trouble with ACLs is it means either creating a large API surface in the standard library to handle them or else presenting a simple interface but having to manage raw pointers (likely using a wrapper type but even then it can't be made totally safe).

      > the de-facto standard winapi crate, which provides access to the raw syscall

      Since the official Microsoft `windows-sys` crate was released many years ago, the winapi crate has been effectively unmaintained (it accepts security patches but that's it).

      3 replies →

    • SetFileSecurityA is listed as Windows XP+ (https://learn.microsoft.com/en-us/windows/win32/api/winbase/...) but Microsoft has deprecated all pre-XP documentation.

      According to https://www.geoffchappell.com/studies/windows/win32/advapi32..., the function was available first in advapi32 version 3.10, which was included in Windows NT 3.10 (14th July 1993): https://www.geoffchappell.com/studies/windows/win32/advapi32...

      lpSecurityAttributes just refers to a SecurityAttributes struct (Rust bindings here: https://microsoft.github.io/windows-docs-rs/doc/windows/Win3...) Annoying pointers for sure, but nothing a Rust API can't work around with standard language features.

      And sure, Rust could add the entire windows crate to the standard library, but my point is that this isn't just Windows functionality: getfacl/setfacl has been with us for decades but I don't know any standard library that tries to include any kind of ACLs.

    • You misunderstand the documentation. Microsoft doesn't provide online documentation for versions of Windows that are no longer supported. Functions like SetFileSecurity have existed since Windows NT 3.1 back in 1993.

      2 replies →

  • > I'm glad they didn't go with the idiotic Go approach ("every path is a valid UTF-8 string" or we just garble the path at the standard library level")

    Can you expound a bit on this? I haven't been able to find any articles related to this kind of problem. It's also a bit surprising, given that Go specifically did not make the same choice as Rust to make strings be Unicode / UTF-8 (Go strings are just arrays of bytes, with one minor exception related to iteration using the range syntax).

    • Go's docs put it like this: Path names are UTF-8-encoded, unrooted, slash-separated sequences of path elements, like “x/y/z”. If you operate on a path that's a non-UTF-8 string, then Go will do... something to make the string work with UTF-8 when passed back to standard file methods, but it likely won't end up operating on the same file.

      Rust has OsStr to represent strings like paths, with a lossy/fallible conversion step instead.

      Go's approach is fine for 99% of cases, and you're pretty screwed if your application falls for the 1% issue. Go has a lot of those decisions, often to simplify the standard library for most use cases most people usually run into (like their awful, lossy, incomplete conversion between Unix and Windows when it comes to permissions/read-only flags/etc.).

      2 replies →

    • > Go strings are just arrays of bytes,

      https://go.dev/ref/spec#String_types: “A string value is a (possibly empty) sequence of bytes”

      https://pkg.go.dev/strings@go1.26.2: “Package strings implements simple functions to manipulate UTF-8 encoded strings.”

      So, yes, Go strings are just arrays of bytes in the language, but in the standard library, they’re supposed to be UTF-8 (the documentation isn’t immediately clear on how it handles non-UTF-8 strings).

      I think this may be why the OP thinks the Go approach is “every path is a valid UTF-8 string”