
Comment by cesarb

2 days ago

> Why can't Cargo have a system like PyPI where library author uploads compiled binary

Unless you have perfect reproducible builds, this is a security nightmare. Source code can be reviewed (and there are even projects to share databases of already reviewed Rust crates; IIRC, both Mozilla and Google have public repositories with their lists), but it's much harder to review a binary, unless you can reproducibly recreate it from the corresponding source code.

> Unless you have perfect reproducible builds

Or a trusted build server doing the builds. There is a build-bot building almost every Rust crate already for docs.rs.

  • docs.rs is just barely viable because it only has to build crates once (for one set of features, one target platform etc.).

    What you propose would have to build each crate for at least the 8 Tier 1 targets, if not also the 91 Tier 2 targets. That would be either 8 or 99 binaries already.

    Then consider that it's difficult to anticipate which feature combinations a user will need. For example, the tokio crate has 14 features [1]. Any combination of 14 different features gives 2^14 = 16384 possible configurations that would all need to be built. Now to be fair, these feature choices are not completely independent, e.g. the "full" feature selects a bunch of other features. Taking these options out, I'm guessing that we will end up with (ballpark) 5000 reasonable configurations. Multiply that by the number of build targets, and we will need to build either 40000 (Tier 1 only) or 495000 binaries for just this one crate.

    Now consider on top of that that the interface of dependency crates can change between versions, so the tokio crate would either have to pin exact dependency versions (which would be DLL hell, and is why version locking is not commonly used for Rust libraries), or we would need to build the tokio crate separately for each dependency version change that is ABI-incompatible somewhere. But even without that, storing tens of thousands of compiled variants is very clearly untenable.

    Rust has very clearly chosen the path of "pay only for what you use", which is why all these library features exist in the first place. But because they do, offering prebuilt artifacts is not viable at scale.

    [1] https://github.com/tokio-rs/tokio/blob/master/tokio/Cargo.to...
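
The build-matrix arithmetic in the comment above can be checked with a few lines of Rust. This is only a sketch: the function names are made up for illustration, and the 5000 figure is the commenter's own ballpark guess, not a measured count of valid tokio feature sets.

```rust
// Sketch of the build-matrix arithmetic from the comment above.
// Function names are illustrative; 5000 is the commenter's ballpark guess.

/// Upper bound on configurations if every feature could be toggled independently.
fn max_feature_configs(feature_count: u32) -> u64 {
    2u64.pow(feature_count)
}

/// Binaries to build for one crate version: one per (configuration, target) pair.
fn binaries_needed(configs: u64, targets: u64) -> u64 {
    configs * targets
}

fn main() {
    let tier1_targets: u64 = 8;
    let all_targets: u64 = tier1_targets + 91; // Tier 1 plus Tier 2, as counted in the comment

    let upper_bound = max_feature_configs(14); // tokio's 14 features
    let reasonable: u64 = 5_000; // ballpark guess, not a measured number

    println!("upper bound on configurations: {}", upper_bound);
    println!("Tier 1 only: {}", binaries_needed(reasonable, tier1_targets));
    println!("Tier 1 + Tier 2: {}", binaries_needed(reasonable, all_targets));
}
```

Note that this counts a single crate version on a single compiler release; the dependency-version and compiler-version axes mentioned in the thread would multiply the totals further.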

I don’t think it’s that much of a security nightmare: the basic trust assumption that people make about the packaging ecosystem (that they trust their upstreams) remains the same whether they pull source or binaries.

I think the bigger issues are probably stability and size: no stable ABI combined with Rust’s current release cadence means that every package would essentially need to be rebuilt every six weeks. That’s a lot of churn and a lot of extra index space.

  • If you have reproducible builds, it's no different. Without those, binaries are a nightmare in that you can't easily link a given binary back to a given source snapshot. Deciding to trust my upstream is all well and good, but if it's literally impossible to audit them, that's not a good situation to be in.

    • I think it’s already probably a mistake to think that a source distribution consistently references a unique upstream source repository state; I don't believe the crate distribution layout guarantees this.

      (I agree that source is easier to review and establish trust in; the observation is that once you read the upstream source you’re in the same state regarding distributors, since build and source distributions both modify the source layout.)

  • > remains the same whether they pull source or binaries.

    I don't think that's exactly true. It's definitely _easier_ to sneak something into a binary without people noticing than it is to sneak it into Rust source, but there hasn't been an Underhanded Rust competition for a while, so I guess it's hard to be objective about that.

    • Pretty much nobody does those two things at the same time:

      - pulling dependencies with cargo
      - auditing the source code of the dependencies they're building

      You are either vendoring and vetting everything, or you're using dependencies from crates.io (ideally after you've done your due diligence on the crate). But should crates.io be compromised and inject malware into the crates' payload, I'm ready to bet nobody would notice for a long time.

      I fully agree with GP that binary vs source code wouldn't change anything in practice.


  • No stable ABI doesn't mean the ABI changes at every release though.

    • It might as well. If there is no definition of an ABI, nobody is going to build the tooling and infrastructure to detect ABI compatibility between releases and leverage that for the off-chance that e.g. 2 out of 10 successive Rust releases are ABI compatible.
