
Comment by coldpie

3 months ago

> People really need to start thinking twice when adding a new dependency. So many supply chain attacks this year.

I was really nervous when "language package managers" started to catch on. I work in the systems programming world, not the web world, so for the past decade I looked from a distance at stuff like pip and npm and whatever with a skeptical side-eye. But when I did a Rust project and saw how trivially easy it was to pull in dozens of completely un-reviewed dependencies from the Internet with Cargo via a single line in a config file, I knew we were in for a bad time. Sure enough. This is a bad direction, and we need to turn back now. (We won't. There is no such thing as computer security.)

The thing is, system-based package managers require discipline, especially from library authors. Even in the web world, it's really distressing to see a minor library already on its 15th iteration in less than 5 years.

I was trying to build just (the task runner) on Debian 12 and it was impossible. It kept complaining about the Rust version, then about some library shenanigans. It is way easier to build Emacs and ffmpeg.

  • Indeed, it seems insane that we're pining for the days of autotools, configure scripts, and their cleanly inspectable dependency structure.

    But... We absolutely are.

Remember, the pre-package-manager days meant ossified, archaic, insecure installations, because self-managing dependencies is hard and people didn't keep them up to date. You need to get your deps from somewhere, so in the pre-package-manager days you still just downloaded them from somewhere - a vendor's web site, or SourceForge, or wherever - and probably didn't audit them, and hoped they were secure. It's still work to keep things up to date and audited, but less work at least.

  • If most of your deps come from the distro, they are already audited. Typically, I've never had to add more than a handful of extra deps in any project I've worked on. That's a no-brainer to manage.

Rust makes me especially nervous due to the possibility of compile-time code execution: a single cargo build invocation could be all it takes to own you. In Go, there is no such possibility by design.

  • The same applies to any Makefile, to the Python scripts invoked by CMake, or to pretty much any other scriptable build system. They are all untrusted scripts you download from the internet and run on your computer. Rust's build.rs is not really special in that regard.

    Maybe go build doesn't allow this, but most other language ecosystems share the same weakness.

    • Yes, but the problem is that cargo can pull in a massive unreviewed dependency tree and then immediately execute code from those dependencies. If you have a repo with a Makefile, you at least have the opportunity to review it first.


  • You're confusing compile-time with build-time. And build-time code execution absolutely exists in Go, because that's what a build tool is. https://pkg.go.dev/cmd/go#hdr-Add_dependencies_to_current_mo...

    • I think you're misunderstanding.

      "go build" of arbitrary attacker controlled go code will not lead to arbitrary code execution.

      If you do "git clone attacker-repo && cargo build", that executes "build.rs" which can exec any command.

      If you do "git clone attacker-repo && go build", that will not execute any attacker controlled commands, and if it does it'll get a CVE.

      You can see this in the following CVEs:

      https://pkg.go.dev/vuln/GO-2023-2095

      https://pkg.go.dev/vuln/GO-2023-1842

      In Cargo, "cargo build" running arbitrary code is working as intended. In Go, either "go get" or "go build" running arbitrary code is considered a CVE.
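
      For contrast, a minimal sketch of the build.rs mechanism (hypothetical script; the point is only that cargo build executes it with your user's privileges):

        // build.rs -- run automatically by `cargo build`.
        use std::process::Command;

        fn main() {
            // Nothing stops a build script from running arbitrary commands.
            // This one only echoes, but it could exec anything at all.
            Command::new("sh")
                .args(["-c", "echo 'arbitrary code at build time'"])
                .status()
                .expect("failed to run command");
        }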


    • I don't really get what you're trying to say; go get does not execute arbitrary code.

  • Does it really matter, though? Presumably, if you're building something, it's so you can run it. Who cares if the build script itself executes code, when you're going to execute the final product anyway?

    • With a scripting language it can matter: if I install some package, I can review it after the install and before running it, or run it in a container or some other somewhat protected environment. Whereas anything that runs during install can hide all traces.

      Of course this assumption breaks with native modules and with the sheer amount of code being pulled in indirectly ...

  • Build scripts aren't a big issue for Rust, because a simple mitigation is possible: do the build in a secure sandbox. Only execution and network access need to be allowed, preferably as separate steps, with network access restricted to downloading dependencies. Everything else, including access to the main filesystem, should be denied.

    Runtime malicious code is a different matter. Rust has a security working group and tooling to address this, but it still worries me.
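
    One possible shape for that two-step split, assuming Docker and the official rust image (a sketch, not a vetted sandbox setup): cargo fetch only downloads crates and runs no build scripts, then the actual build, including build scripts, runs with networking disabled.

      # Step 1: network allowed; `cargo fetch` downloads dependencies
      # but does not execute any build scripts.
      docker run --rm -v "$PWD":/src -w /src rust:1 cargo fetch

      # Step 2: compile (build scripts run here) with networking disabled.
      docker run --rm --network=none -v "$PWD":/src -w /src rust:1 cargo build --offline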

> This is a bad direction, and we need to turn back now.

I don't deny there are some problems with package managers, but I also don't want to go back to a world where it is a huge pain to add any dependency, which leads to projects wasting effort on implementing things themselves, often in a buggy and/or inefficient way, and/or using huge libraries that try to do everything, but do nothing well.

  • It's a tradeoff. When package users had to manually install dependencies, package developers had to reckon with that friction. Now we're living in a world where developers don't care about another 10^X dependencies, because the package manager will just run the scripts and install the files, and the users will accept it.

Fully agree. That is why I vendor all my dependencies. On the Common Lisp side, a new tool emerged a while ago for that[1].

On top of that, I try to keep the dependencies to an absolute minimum. In my current project it's 15 dependencies, including the sub-dependencies.

[1]: https://github.com/fosskers/vend

  • I didn't vendor them, but I did do an eyeball scan of every package in the full tree for my project, primarily to gather their license requirements[1]. (This was surprisingly difficult for something that every project in theory must do to meet licensing requirements!) It amounted to approximately 50 dependencies pulled into the build, to create a single gstreamer plugin. Not a fan.

    [1] https://github.com/ValveSoftware/Proton/commit/f21922d970888...
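
    For the enumeration part (not the auditing), Cargo can at least dump each package's declared license mechanically; a sketch assuming a Rust project and jq on hand:

      cargo metadata --format-version 1 \
        | jq -r '.packages[] | "\(.name) \(.version) \(.license)"'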

  • Vendoring is nice. Using the system version is nicer. If you can’t run on $current_debian, that’s very much a you problem. If postgres and nginx can do it, you can too.

    • The system package manager and the language package/dependency managers do a very different task.

      The distro package manager delivers applications (like Firefox) and a coherent set of libraries needed to run those applications.

      Most distro package managers (except Nix and its kin) don't allow you to install multiple versions of a library, or to have libs with different compile-time options enabled (they need separate packages for that). Once you need a different version of some library than, say, Firefox does, you're out of luck.

      A language package manager by contrast delivers your dependency graph, pinned to certain versions you control, to build your application. It can install many different versions of a lib, possibly even link them in the same application.
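
      For instance, Cargo can pin exact versions and even pull two incompatible versions of the same crate side by side; a hypothetical Cargo.toml sketch (the crate names are just examples):

        [dependencies]
        # "=" pins an exact version; nothing else will satisfy it.
        rand = "=0.8.5"
        # The same crate under a different local name, at an older major version.
        rand_old = { package = "rand", version = "=0.7.3" }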


    • > If you can’t run on $current_debian, that’s very much a you problem.

      This is a reasonable position for most software, but definitely not all, especially when you fix a bug or add a feature in a library you maintain and your Debian users (reasonably!) don't want to wait months or years for Debian to update its packages to get the benefits. This probably happens rarely for stable system software like postgres and nginx, but for less well-established use cases, like running modern video games on Linux, it definitely comes up fairly often.


    • That is an impossible task in practice for most developers.

      Many distros, and Debian in particular, apply extensive patches to upstream packages. Asking a developer to depend on every possible variation of such packages, across many distros, is a tall order. Postgres and Nginx might be able to do it, but those are established projects with large teams behind them and plenty of leverage. They might even be able to bend distro maintainers to their will, since no distro wants to miss out on carrying such popular packages.

      So vendoring is in practice the only sane choice for smaller teams and projects.

      Besides, distro package managers carrying libraries for all programming languages is an insane practice that is impossible to scale and maintain. It exists in this weird unspecified state that can technically be useful for end users, but is completely useless for developers. Are they supposed to develop on a specific distro for some reason? Should it carry sources or only binaries? Is the dependency resolution the same for all languages? Should language tooling support them? It's an entirely ridiculous practice that should be abandoned altogether.

      Yes, it's also silly that every language has to reinvent the wheel for managing dependencies, and that it can introduce novel supply chain attack vectors, but the alternative is a far more ludicrous proposition.


    • But that would lock me in to, say, whatever $debian provides. And some dependencies only exist as source because they are not packaged for $distribution.

      Of course, if possible, just saying "hey, I need these dependencies from the system" is nicer, but it's also not error-free. If a system suddenly uses an older or newer version of a dependency, you might also run into trouble.

      In either case, you run into either a) a trust problem or b) a maintenance problem. And in that scenario I tend to prefer option b); at least I know exactly whom to blame and who is in charge of fixing it: me.

      It also comes down to the language, I guess. Common Lisp has a tendency to use source packages anyway.


This isn't as new as you make it out to be; ant + ivy / maven / gradle had already started this in the '00s. It definitely turned into a mess, but I think the Java / cross-platform nature pushed this style of development along pretty heavily.

Before that, wasn't CPAN already big?

Back as in using fewer dependencies, or as in throwing a bunch of "certifying" services at all of them?

I feel that Rust increases security by avoiding a whole class of bugs (thanks to memory safety), but decreases security by making supply chain attacks easier (due to the large number of transitive dependencies required even for simple projects).

  • Who is requiring you to use large numbers of transitive dependencies? You can always write all the code yourself instead.

I'm actually really frustrated by how hard it's become to manually add, review, and understand the dependencies in my code. Libraries used to come with decent documentation; now it's just a couple of lines of "npm install blah", as if that tells me anything.

Fully agree.

So many people are so drunk on the kool-aid that I often wonder if I'm the weirdo for not wanting dozens of third-party libraries just to build a simple HTTP client for a simple internal REST API. (No, I don't want tokio, Unicode, multipart forms, SSL, web sockets, ...) At least Rust has "features". With pip and such, avoiding the kitchen sink is not an option.

I also find that anything not extensively used has bugs or missing features I need. It's easier to fork/replace a lot of simple dependencies than to hope the maintainer merges my PR on a timeline convenient for my work.
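
On the "features" point, a hypothetical Cargo.toml sketch (the crate name is made up; the default-features mechanism is real):

  [dependencies]
  # Turn off everything optional, then opt back in to only what's needed.
  some_http_client = { version = "1", default-features = false, features = ["blocking"] }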

  • If you don't want Tokio, I have bad news for you: Rust doesn't ship an asynchronous runtime, so you'll need something if you want to run async.

  • For this specific case an LLM may be a good option. You know what you want and could do it yourself, but who wants to type it all out? An LLM could generate an HTTP client from the socket level on up, and it would be straightforward to verify. "Create an HTTP client in $language with basic support for GET and POST requests that outputs the response to STDOUT, without any third-party libraries. After processing command line arguments, the first step should be opening a TCP socket." That should get you pretty far.
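
    A std-only sketch of roughly what such a prompt should produce (GET only, no TLS, hardcoded host for brevity):

      use std::io::{Read, Write};
      use std::net::TcpStream;

      fn main() -> std::io::Result<()> {
          let host = "example.com";
          let mut stream = TcpStream::connect((host, 80))?;
          // "Connection: close" makes the server end the stream after
          // the response, so reading to EOF captures the whole reply.
          write!(stream, "GET / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n")?;
          let mut response = String::new();
          stream.read_to_string(&mut response)?;
          print!("{response}");
          Ok(())
      }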

  • There is only one Rust application (a server) I use enough that I try to keep up and rebuild it from the latest release every now and then. New releases mostly just bump versions of some of the 200 or so dependencies. I have no idea how I, or the server's maintainers, can have any clue what exactly is brought in with each release. How many upgrades times 200 projects before there is a near-100% chance of something bad being included?

    The ideal number of both dependencies and releases is zero. That is the only way to know nothing bad was added. Sadly, much software seems to push for MORE of both, not fewer. Languages and libraries keep changing their APIs, forcing cascades of unnecessary changes to everything. It's like we want supply chain attacks to hurt as much as possible.
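
    To put rough, illustrative numbers on that question (the rate is made up): if any single dependency bump carries even a 0.1% chance of being compromised, then 10 releases × 200 dependencies = 2000 bumps gives a 1 - 0.999^2000 ≈ 86% chance that at least one of them was bad.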