Comment by kouteiheika

1 year ago

> There isn't a single thing there that seems iffy to me.

You mean like four versions of hashbrown (which is useful, but it's rare to have to use it directly instead of `std::collections::HashMap`, never mind pulling four versions of it into your project) or four versions of itertools (which is extremely situational, and even when it is useful it usually only saves you a couple of lines of code, so it's essentially never worth pulling it once, never mind four times)? Or maybe three different crates for random number generation (rand, nanorand, fastrand)?

There's definitely a problem with how the Rust community approaches dependencies (and I say this as someone who loves Rust and has used it as their main language for 10+ years now). People are just way too trigger happy with external dependencies, and burying our heads in the sand is not helping.

Inclusion of every external dependency should always be well motivated. How big is the dependency? How much of it do we use? How big of an effect will it have on compile times? How much effort would it be to write it yourself? Is it security sensitive? Is it a dependency which everyone uses and is maintained by well known community members, or some random guy from who knows where? And so on.

For example, cryptography stuff? No, don't write that yourself if you're not an expert; you'll get it wrong and expose yourself to vulnerabilities. Removing leading whitespace from strings? ("unindent" crate, which is also on your list) Hell no! That's like a minute or two to write this yourself. Did we learn nothing from the left-pad incident?
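To give a sense of scale for the "minute or two" claim, here is a minimal sketch of such a helper. The name `strip_common_indent` is made up for illustration, and the sketch assumes indentation consists only of spaces and tabs; it does not try to reproduce the exact behavior of the `unindent` crate.

```rust
/// Strip the indentation shared by all non-blank lines of `text`.
///
/// Assumptions of this sketch: indentation is made of spaces and tabs,
/// whitespace-only lines are ignored when computing the common indent and
/// come out empty, and a trailing newline in the input is not preserved.
pub fn strip_common_indent(text: &str) -> String {
    // Smallest number of leading spaces/tabs among the non-blank lines.
    let indent = text
        .lines()
        .filter(|line| !line.trim().is_empty())
        .map(|line| {
            let rest = line.trim_start_matches(|c: char| c == ' ' || c == '\t');
            line.len() - rest.len()
        })
        .min()
        .unwrap_or(0);

    text.lines()
        .map(|line| {
            if line.trim().is_empty() {
                ""
            } else {
                // Safe to slice: the first `indent` bytes of every non-blank
                // line are one-byte spaces or tabs.
                &line[indent..]
            }
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```

Calling `strip_common_indent("    a\n      b\n\n    c")` yields `"a\n  b\n\nc"`: the shared four spaces are removed and the relative indentation of `b` is kept.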

> You mean like four versions...

The two options for cargo here are 1) fail to compile when there's more than one crate-version in the dep tree or 2) allow for there to be more than one and let the project continue compiling. The former would be more "principled" but in practice incredibly disruptive. I usually go "dep hunting" to unify the versions of duplicated deps. Most of the time that's just looking at `cargo tree` and modifying the `Cargo.toml` slightly. Other times it's not easy, and I have to either patch or (better) wait until the diverging dep updates their own `Cargo.toml`.
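In practice `cargo tree --duplicates` reports this directly. As a rough sketch of the check itself, assuming the resolved tree has already been flattened into `(name, version)` pairs (the function name and the version numbers in the usage note are made up for illustration):

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Return every crate name that appears with more than one distinct version,
/// together with the set of versions found.
///
/// `resolved` is assumed to be a flat list of (crate name, version) pairs,
/// e.g. extracted by hand from `cargo tree` output.
pub fn duplicated_crates<'a>(
    resolved: &[(&'a str, &'a str)],
) -> BTreeMap<&'a str, BTreeSet<&'a str>> {
    let mut versions: BTreeMap<&'a str, BTreeSet<&'a str>> = BTreeMap::new();
    for &(name, version) in resolved {
        // The set deduplicates a version reached through several paths.
        versions.entry(name).or_default().insert(version);
    }
    // Keep only the crates that resolved to two or more versions.
    versions.retain(|_, found| found.len() > 1);
    versions
}
```

Given `[("hashbrown", "0.12.3"), ("hashbrown", "0.14.5"), ("rand", "0.8.5")]`, only `hashbrown` is returned, with both versions listed. Option 1 above would amount to failing the build whenever this map is non-empty.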

> People are just way too trigger happy with external dependencies, and burying our heads in the sand is not helping.

>> Inclusion of every external dependency should always be well motivated. How big is the dependency? How much of it do we use? How big of an effect will it have on compile times? How much effort would it be to write it yourself? Is it security sensitive? Is it a dependency which everyone uses and is maintained by well known community members, or some random guy from who knows where? And so on.

We can have a nuanced discussion about dependencies. That's not what I was seeing. There are plenty of things that can be done to improve the situation, especially around Supply Chain Security, but this idea that dependency count is the issue is misguided. It pushes projects towards copy-pasting and vendoring. That makes that code opaque to security tools, existing or proposed. Think of the shitshow it is if you have an app and decide "more dependencies is bad, so I'm copying xz into my repo"?

> Removing leading whitespace from strings? ("unindent" crate, which is also on your list) Hell no! That's like a minute or two to write this yourself.

I don't have access to the closed-source repo to run `cargo tree` to see where `unindent` is used from, but why do you feel this is an invalid crate to pull in? It is a proc-macro, that deindents string literals at compile time. Would I include it directly in a project of mine? Likely not, but if I were using `indoc` (written by dtolnay), which uses `unindent` (written by dtolnay) my reaction wouldn't be "oh, no! An additional useless dependency!".

  • > I don't have access to the closed-source repo to run `cargo tree` to see where `unindent` is used from, but why do you feel this is an invalid crate to pull in?

    Each additional dependency imposes an ongoing audit burden on the downstream consumers of your project.

    In an era when supply chain compromises are increasing and the consequences are catastrophic, the security story alters the traditional balance of "roll your own" versus "use the shared library".

    • Which then increases the chance that your homebrew versions have their own security problems (or bugs in general).

  • > but this idea that dependency count is the issue is misguided

    Well, partially you're right. There are roughly two things which are important here:

    1) The number of unique authors/entities controlling the dependencies. (So 10 crates by exactly the same author would still count as one dependency.)

    2) The amount of code pulled in by a crate. (Because this tanks your compile times; I've seen projects pulling in hundreds of thousands of lines of code in external dependencies and using less than 1% of that, and then people make surprised pikachu face that Rust is slow to compile.)
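    As a rough sketch of how these two measures differ from a plain crate count: the `Dep` struct and its `owner` and `lines_of_code` fields are hypothetical, since Cargo does not record a single controlling entity per crate, so this data would have to be gathered by hand or from registry information.

    ```rust
    use std::collections::BTreeSet;

    /// One dependency, annotated by hand (hypothetical data model).
    pub struct Dep<'a> {
        pub name: &'a str,
        /// Entity that controls publishing of the crate (assumed known).
        pub owner: &'a str,
        /// Approximate size of the crate's source (assumed known).
        pub lines_of_code: usize,
    }

    /// Measure 1: distinct owners. Ten crates by one author count once.
    pub fn unique_owners(deps: &[Dep<'_>]) -> usize {
        deps.iter().map(|d| d.owner).collect::<BTreeSet<_>>().len()
    }

    /// Measure 2: total amount of code pulled in across all dependencies.
    pub fn total_lines(deps: &[Dep<'_>]) -> usize {
        deps.iter().map(|d| d.lines_of_code).sum()
    }
    ```

    With `indoc` and `unindent` both owned by dtolnay plus one crate from another owner, the crate count is three but `unique_owners` returns two.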

    > I don't have access to the closed-source repo to run `cargo tree` to see where `unindent` is used from, but why do you feel this is an invalid crate to pull in? It is a proc-macro, that deindents string literals at compile time. Would I include it directly in a project of mine? Likely not, but if I were using `indoc` (written by dtolnay), which uses `unindent` (written by dtolnay) my reaction wouldn't be "oh, no! An additional useless dependency!".

    I would never include either in any of my projects, and would veto any attempt to do so. As I already said, the 'unindent' crate is trivial to write by myself, and the 'indoc' crate seems completely not worth it from a cost/benefit standpoint in the very rare case I'd need something like that (it's easy enough to make do without it, as it's just a minor situational quality of life crate).

    In general my policy on external dependencies is stricter than most people's; I usually only include high value/high impact dependencies, and I try to evaluate whether a given dependency is appropriate in the context of the concrete project I want to use it in. If it's a throwaway script that I need to run once and won't really maintain long-term - I go crazy with gluing whatever external crates there are just to get it done ASAP! But if it's a project that I'll need to maintain over a long period of time I get a lot more strict, and if it's a library that I expect other people to use then the bar for external dependencies gets even higher (because any extra dependency I add will bloat up the compile times and the dependency trees of any downstream users).

    I also find it helpful to ask myself the question - if it wasn't easy to add new dependencies (e.g. if I was still writing in C++, or cargo wasn't a thing) would I still include this dependency in my project? If the answer is "no" then maybe it's better not to.

    There are some notable exceptions, but sadly most of the Rust community doesn't do things this way.

  • there's 2 kinds of bugs related to security: accidental bugs, and maliciously injected bugs. xz was the second kind (which you could have avoided if you vendored starting at a reviewed / trusted point in time...)

    from empirical studies, we know the first kind occurs at roughly the same rate everywhere, so it's just do you have capacity to fix it. also, reusable dependencies typically are more configurable which leads to more code and more bugs, many of which might not have affected you if you didn't need all the flexibility.

    dependency count is an indirect measure of the second kind, except rust pushes crates as the primary metric, so it will always look bad compared to if it pushed something more reasonable like the number of trust domains.