Comment by gbuk2013

2 hours ago

I came across this interesting rant the other day: https://github.com/uNetworking/uWebSockets.js/blob/master/mi...

It does make sense that the right way would be to fork every dependency you use and install from your own repo reviewing and merging from upstream as needed. Would be a giant PITA though. :)

The problem would be the dependencies of your dependencies, and keep going many levels.

  • The problem is that node.js doesn't have a good standard library so one must rely on external packages to build even basic apps.

  • But presumably, you only include dependencies that you trust and those dependencies themselves do their trusting more strictly than you. Trust is built on vetting, signatures and reputation.

    That is, at least what we do, in theory. In practice, we cross fingers and let the LLM pick dependencies, are satisfied if it just works and we either update our deps frequently or infrequently.

Nothing that couldn't be automated; in Go land this is (arguably) called vendoring (https://go.dev/ref/mod#vendoring). Good to offload or reduce dependencies on 3rd party dependency hosters, pull a dependency into your own code review tools, and to ensure reproducible builds long term.

  • It's a generic and very old term for committing dependencies to your repo, it's not go-specific.

  • I mean there’s nothing stopping you from committing node_modules to git (after running something like https://github.com/timoxley/cruft on it) and reviewing code changes on dependency updates.

    I even managed to make that part of the workflow on one team I worked with but several other teams since thought it was a crazy idea. :)

That might change the odds, but unless you fork diligently (and monkeypatch each and every future vulnerability) you might ship a compromised fork forever.

  • Except most of the attacks so far has not landed actually source code changes to git IIRC. They have targeting the release files directly.

    • Software vulnerabilities are often not placed maliciously, and are present in the original source. If you don't patch them if discovered later, you'll be vulnerable to them.

The problem is the volume of dependencies. Most modern JavaScript, Python, Rust, Go, etc. projects have many dozens of transitive dependencies.

  • Hum... You should check your JavaScript numbers again.

    I have never seen a project that uses npm and has only dozens of dependencies. Normal numbers are in the 10s of thousands (including different versions of some deps).

  • Let’s not ignore that dependencies are far more common in JS than any of those other languages. My Go or Python projects generally only include a handful of external packages. Node projects on the other hand…

I think the general idea that your supply chain should be rooted in source repositories and associated commit hashes is the right one. Tooling can be made to automate the process of putting together a product from those defined sources. Some languages/systems already have some support for this. E.g. Golang and Rust. The concept of a "binary" artifact is really dead now everyone uses git and builds are quick. It lives on in things like npm and docker hub but we don't actually need it.

For smaller shops (by small I mean <1,000 employees) this isn't even tenable. We (engineering team of about 10 people) mitigate what we can via tooling and cooldown periods/minimum release age. This will work as long as these malicious packages remain reasonably detectable. I think that's the proper balance, because we can adjust the # of days we are willing to risk against the SOTA of detection tooling.