Comment by cxr
2 days ago
At this point, the existence of these attacks should be an expected outcome. (It should have been expected even without the empirical record we now have and the multiple incidents we can now cite.)
NPM and NPM-style package managers that are designed to late-fetch dependencies just before build time are already fundamentally broken. They're an end-run around the underlying version control system, all in favor of an ill-considered, half-baked alternative approach to version control of the package manager maintainers' own devising.
And they provide cover for attacks like this, because they encourage a culture where, since one's dependencies are all "over there", the massive surface area gets swept under the rug and never gets reviewed (because 56K NPM users can't be wrong).
I am slowly waking up to the realization that we (software engineers) are laughably bad at security. I used to think it was only NPM (I have worked a lot in this ecosystem over the years), but I have found this essentially everywhere. NPM is the poster child because of executable scripts on install, but every package manager essentially boils down to "install this thing by name, no security checks".

Every ecosystem I touch now has this (apart from gamedev, but only because I roll everything myself there by choice). E.g. Cargo has a lot of "tools" that you install globally to get some capability (flamegraphs, asm output, test runners, etc.) - the same vulnerability, manifesting slightly differently. Like others have pointed out, it is common to just pull random Docker images via Helm charts. It is also common to fetch random "utility" tools during builds in CI/CD pipelines, just by curl-ing random URLs of various "release archives". You don't even have to look too hard - this is surface level at pretty much every company, in almost every industry (I have my doubts about the security theatre in some, but I have no first-hand experience, so cannot say).
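To make the CI/CD point concrete - the tool name, URL, and hash below are invented, but the shape will be familiar:

    # The usual pattern: extract and run whatever the URL serves today
    curl -sSL https://example.com/releases/sometool-v1.2.3.tar.gz | tar xz

    # The rarely-bothered-with version: pin the artifact's checksum so a
    # swapped archive fails the build instead of landing on your PATH
    curl -sSLo sometool.tar.gz https://example.com/releases/sometool-v1.2.3.tar.gz
    echo "<expected-sha256>  sometool.tar.gz" | sha256sum -c - || exit 1
    tar xzf sometool.tar.gz

Even the pinned version only proves you got the same bytes as last time, of course - it says nothing about whether those bytes were ever reviewed.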
The issue I have is that I don't really have a good idea for a solution to this problem - on one hand, I don't expect everyone to roll entire modern stacks by hand every time. Killing collaborative software development seems like literally throwing the baby out with the bathwater. On the other hand, I feel like nothing I touch is "secure" in any real sense - the tick boxes are there, and they are all checked, but I don't think a single one of them really protects me against anything - most of the time, the monster is already inside the house.
>The issue I have is that I don't really have a good idea for a solution to this problem - on one hand, I don't expect everyone to roll entire modern stacks by hand every time. Killing collaborative software development seems like literally throwing the baby out with the bathwater.
Is NPM really collaborative? People just throw stuff out there and you can pick it up. It's the lowest common denominator of collaboration.
The thing that NPM is missing is trust and trust doesn't scale to 1000x dependencies.
IMO the solution is auditing. We should be auditing every single version of every single dependency before we use it. Not necessarily personally, but we could have a review system like eBay/Uber/Airbnb and require N trusted reviews.
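FWIW, Mozilla's cargo-vet is an early stab at exactly this model in the Rust world: checks fail unless every dependency version carries an audit, and you can import audits from organizations you trust. A rough sketch of the flow (crate name and version made up):

    cargo install cargo-vet            # Mozilla's supply-chain audit tool
    cargo vet init                     # set up audit metadata for this project
    cargo vet                          # flags any dependency version without an audit
    cargo vet certify somecrate 1.2.3  # record that a human actually reviewed it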
This is the way. But people read it, nod their heads, and then go back to yolo'ing dependencies into their project without reading them. Culture change is needed.
Something that I keep thinking about is spec-driven design.

Imagine that, for code, there is a parallel "state" document recording the intent behind each line of code and each function. That state document is in turn connected to a "higher layer of abstraction" document (recursively, as needed) to tie in higher layers of intent.

Such a thing would make it easier to surface weird behavior, IMO, alongside the general "spec-driven design" perks: more human-readable means more eyes, and potential for automated LLM analysis too.

I'm not sure it'd be _perfect_, but I think it'd be loads better than what we've got now.
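A rough sketch of what a single node in that parallel document might look like - the format is entirely hypothetical:

    // Hypothetical format for one node of the "state" document. Each node
    // records intent and links upward to a broader intent, recursively, so a
    // reviewer (or an LLM) can walk from a line of code to the goal it serves.
    interface IntentNode {
      id: string;        // e.g. "payments.retry.backoff"
      parentId?: string; // the higher layer of abstraction this serves
      intent: string;    // human-readable statement of why this code exists
      covers: string[];  // file/function references this node explains
    }

    const example: IntentNode = {
      id: "payments.retry.backoff",
      parentId: "payments.reliability",
      intent: "Retry failed charges with exponential backoff so we don't hammer the processor",
      covers: ["src/payments/retry.ts#retryCharge"],
    };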
I think the solution is a build system that requires version pinning - options include Nix, Bazel, and Buck.
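For example, in Nix a fetched source is pinned to both an exact revision and a content hash, so a tampered download fails the build outright (owner, repo, and hashes below are placeholders):

    # Placeholder names and hashes; the mechanism is the point - the build
    # aborts unless the fetched source matches both rev and sha256.
    someLib = pkgs.fetchFromGitHub {
      owner = "someorg";
      repo = "somelib";
      rev = "9f2c1d4e8b7a6f5e4d3c2b1a0918273645546372";
      sha256 = "sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=";
    };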
I am a big fan of Bazel and have explored Nix (although, regrettably, I have not used it in anger quite yet) - both seem like good steps in the right direction and something I would love to see more usage/evolution of. However, it is important to recognize that these tools have a steep learning curve and require deep knowledge in more than one area in order to be used effectively, or at all.
Speed of development and developer experience are not things to be discarded lightly. If you were to start a company/product/project tomorrow, a lot of what you want to be doing in the beginning is not related to these tools - you probably, most of the time, want to be exploring your solution space. Creating a development and CI/CD environment that can fully take advantage of these tools' capabilities (like hermeticity and reproducibility) is not straightforward - in most cases setting up, scaling, and maintaining them requires a whole team with knowledge that most developers won't have. You don't want to gatekeep the writing of new software behind such requirements. But I do agree that the default should be closer to this than to what we have today. How we get there - now that is the million dollar question.
Back in the days of Makefiles and autoconf, we tended to require specific versions and would document that in the readme.
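E.g. a single line in configure.ac (library name is a stand-in) would refuse to configure against anything but the documented version:

    # libfoo is a stand-in; configure aborts if the installed version is too old
    PKG_CHECK_MODULES([LIBFOO], [libfoo >= 1.2.3])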
Unless you audit the version you're pinning, what's the difference?
I agree with much of what you said here, but is it really just about the package manager? If I had specified this repo's git url with a specific version number or sha directly in my package.json, the outcome would be just about the same. And so that's not really an end-run around version control at that point. Even with npm out of the picture the problem is still there.
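Something like this, with a made-up package name - npm happily accepts an exact commit:

    {
      "dependencies": {
        "some-lib": "github:someuser/some-lib#5a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b"
      }
    }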
> If I had specified this repo's git url with a specific version number or sha directly in my package.json[…] that's not really an end-run around version control at that point
Yes it is. Git doesn't operate based on package.json.
You're still trying to devise a scheme where, instead of Git tracking the source code of what you're building and deploying and/or turning into a release, you're excluding parts of that content from Git's purview. That's doing an end-run around the VCS.
It's hardly an end-run around VCS to specify an external dependency's VCS sha, and resolve that at build time.
But okay, let's go further and use git submodules so that package.json is out of the picture. Even in that case we have the same problem.
Or, let's go even further and vendor the dependency so it is now copied into our source code. Even in that case too, we still have the same problem.
The dependency has been malicious all along, so if we use it in any way the game is already over.
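Concretely (repo name made up) - the pin is exact and every byte is under Git's purview:

    # Hypothetical repo, vendored as a submodule at an exact commit
    git submodule add https://github.com/someuser/some-lib vendor/some-lib
    git -C vendor/some-lib checkout 5a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b
    git add .gitmodules vendor/some-lib
    git commit -m "Pin some-lib at an exact commit"
    # ...and requiring vendor/some-lib still runs the same malicious code.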
The root problem is that the OS allows npm packages to grab your WhatsApp messages without the user knowing.
The OS isn't allowing anything as far as I can see. It's a fork of a library that allows you to use the WhatsApp API; it actually works, it just happens to also harvest your credentials and messages.
Should the OS prevent you from making API calls to WhatsApp's servers? What about the actual library this is based on - should that be blocked as well?
The root of the problem is that users and developers may have legitimate reasons to want API access to a service like WhatsApp. That just comes with a level of risk, especially in a world where we're not used to auditing our dependencies. The only sort-of-maybe solution I can see is the operating system prompting you when an application wants to make an outgoing request - but in this case the messages might just go to AWS and an S3 bucket, or be sent via WhatsApp to the attacker. How would you spot that at the operating system level, without built-in knowledge of WhatsApp specifically?
This is an npm package that allows you to interact with WhatsApp using their API. The OS wouldn’t prevent this as it’s not interacting with your WhatsApp on your machine, but rather logging you in via a skillfully made 3rd party interface, that unfortunately happens to also be evil.
There are so many package managers out there for different platforms. I feel like there should be some more general, standardized package manager that is language agnostic. Something that:

- has some guarantees about dependencies

- has some guarantees about provenance (only allow if signed by x, y, z kind of thing - see the sketch below)

- has a standardized API so corporate or third-party curation of packages is possible (I want my own company package manager that I curate)

- does ????
I don't know, it just seems like every tech area has these problems and I honestly don't understand why there aren't more 'standardized' solutions here
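As a strawman for the "signed by x, y, z" part, the core check is small - a sketch using Node's built-in crypto, with the key list, file paths, and threshold all made up:

    import { verify } from "node:crypto";
    import { readFileSync } from "node:fs";

    // Hypothetical policy: an artifact may be installed only if at least
    // REQUIRED of the company's curated publisher keys signed it.
    const TRUSTED_KEYS_PEM: string[] = [
      readFileSync("keys/publisher-a.pub", "utf8"), // curated in-house
      readFileSync("keys/publisher-b.pub", "utf8"),
    ];
    const REQUIRED = 2;

    function isInstallAllowed(artifact: Buffer, signatures: Buffer[]): boolean {
      let valid = 0;
      for (const pem of TRUSTED_KEYS_PEM) {
        // Ed25519 verification; the algorithm argument is null for Ed25519 keys
        if (signatures.some((sig) => verify(null, artifact, pem, sig))) valid++;
      }
      return valid >= REQUIRED;
    }

The hard parts are everything around this - key distribution, revocation, who curates the list - but the primitive itself exists on every platform.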
They exist and are called "linux distributions". Developers hate them.
> They're an end-run around the underlying version control system
I assume by "underlying version control system" you mean apt, rpm, homebrew and friends? They don't solve this problem either. Nobody in the open-source world is auditing code for you. Compromised xz still made it into apt. Who knows how many other packages are compromised in a similar way?
Also, apt and friends don't solve the problem that npm, cargo, pip and so on solve. I'm writing some software. I want to depend on some package X at version Y (e.g. numpy, serde, react, whatever). I want to use that package, at that version, on all supported platforms. Debian. Ubuntu. Redhat. MacOS. And so on. Try and do that using the system package manager and you're in a world of hurt. "Oh, your system only has official packages for SDL2, not SDL3. Maybe move your entire computer to an unstable branch of Ubuntu to fix it?" / "Yeah, we don't have that python package in homebrew. Maybe you could add it and maintain it yourself?" / "New ticket: I'm trying to run your software in gentoo, but it only has an earlier version of dependency Y."
Hell. Utter hell.
No, other trusted repositories are legitimately better because the maintainers built the software themselves. They don't purely rely on binaries from the original developer.
It's not perfect and bad things still make it through, but just look at your example - XZ. This never made it into Debian stable repositories and it was caught remarkably quickly. Meanwhile, we have NPM vulnerability after vulnerability.
npm is all source-based. Nobody is compiling binaries of JavaScript libraries. Cargo is the same.
I’m not really sure what you think a maintainer adds here. They don’t audit the code. A well written npm or cargo or pip module works automatically on all operating systems. Why would we need or want human intervention? To what? Manually add each package to N other operating systems? Sounds like a huge waste of time. Especially given the selection of packages (and versions of those packages) in every operating system will end up totally different. It’s a massive headache if you want your software to work on multiple Linux distros. And everyone wants that.
npm also isn’t perfect. But npm also has 20x as many packages as apt does on Ubuntu (3.1M vs 150k). I wouldn’t be surprised if there is more malicious code on npm. Until we get better security tools, it’s buyer beware.
But do they audit the code? I say mostly no. They grab the source, try to compile it. Develop patches to fix problems on the specific platform. Once it works, passes the tests, it's done. Package created, added to the repo.
Even OpenBSD, famous for auditing their code, doesn't audit packages. Only the base system.
> I assume by "underlying version control system" you mean apt, rpm, homebrew and friends
No. Git.
...unless your system package manager is nix.
What is so special about nix that it avoids all these issues?
I think you missed the mark a bit here. This wasn’t a dependency that was compromised, it was a dep that was malicious from the start. Package manager doesn’t really play into this. Even if this package was vendored the outcome would have been the same.
No, the package manager actually DOES play into this. Or rather, the best practices it enforces do. I would be seriously surprised if Debian shipped malware, because the package manager is configured with Debian repos by default and you know you can trust those to have very strict oversight.

If apt's DNA were to download package binaries straight from GitHub, then I would blame it on the package manager for making it so inherently easy to download malware, wouldn't I?
> I think you missed the mark a bit here. This wasn’t a dependency that was compromised, it was a dep that was malicious from the start.
You're assuming that I was making assumptions, but I wasn't. I understand the attack.
> Package manager doesn’t really play into this.
It does, for the reasons I described.