Comment by f311a
3 months ago
People really need to start thinking twice when adding a new dependency. So many supply chain attacks this year.
This week, I needed to add a progress bar with 8 stats counters to my Go project. I looked at the libraries, and they all had 3000+ lines of code. I asked an LLM to write me a simple progress report tracking UI, and it was less than 150 lines. It works as expected, no dependencies needed. It's extremely simple, and everyone can understand the code. It just clears the terminal output and redraws it every second. It is also thread-safe. Took me 25 minutes to integrate it and review the code.
If you don't need a complex stats counter, a simple progress bar is like 30 lines of code as well.
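Roughly the shape of it, for the curious (a minimal sketch; the Tracker name and API here are made up for illustration, not the actual code): mutex-guarded counters plus a once-a-second redraw using ANSI escape codes.

    package main

    import (
        "fmt"
        "sort"
        "sync"
        "time"
    )

    // Tracker is an illustrative name: named counters behind a mutex,
    // redrawn in place once per second by a background goroutine.
    type Tracker struct {
        mu    sync.Mutex
        stats map[string]int
    }

    func NewTracker() *Tracker { return &Tracker{stats: map[string]int{}} }

    func (t *Tracker) Add(name string, n int) {
        t.mu.Lock()
        t.stats[name] += n
        t.mu.Unlock()
    }

    // redraw moves the cursor up over the previous frame, clears each line,
    // and rewrites it; it returns how many lines were drawn.
    func (t *Tracker) redraw(prev int) int {
        t.mu.Lock()
        defer t.mu.Unlock()
        if prev > 0 {
            fmt.Printf("\x1b[%dA", prev) // cursor up over the last frame
        }
        keys := make([]string, 0, len(t.stats))
        for k := range t.stats {
            keys = append(keys, k)
        }
        sort.Strings(keys) // stable order so lines don't jump around
        for _, k := range keys {
            fmt.Printf("\x1b[2K%s: %d\n", k, t.stats[k]) // clear line, rewrite
        }
        return len(keys)
    }

    func main() {
        t := NewTracker()
        done := make(chan struct{})
        go func() {
            tick := time.NewTicker(time.Second)
            defer tick.Stop()
            prev := 0
            for {
                select {
                case <-done:
                    return
                case <-tick.C:
                    prev = t.redraw(prev)
                }
            }
        }()
        for i := 0; i < 50; i++ { // simulate work
            t.Add("processed", 1)
            time.Sleep(100 * time.Millisecond)
        }
        close(done)
    }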
This is the way to go for me now when considering another dependency. We don't have the resources to audit every package update.
> People really need to start thinking twice when adding a new dependency. So many supply chain attacks this year.
I was really nervous when "language package managers" started to catch on. I work in the systems programming world, not the web world, so for the past decade, I looked from a distance at stuff like pip and npm and whatever with kind of a questionable side-eye. But when I did a Rust project and saw how trivially easy it was to pull in dozens of completely un-reviewed dependencies from the Internet with Cargo via a single line in a config file, I knew we were in for a bad time. Sure enough. This is a bad direction, and we need to turn back now. (We won't. There is no such thing as computer security.)
The thing is, system-based package managers require discipline, especially from library authors. Even in the web world, it's really distressing when you see a minor library already on its 15th iteration in less than 5 years.
I was trying to build just (the task runner) on Debian 12 and it was impossible. It kept complaining about the Rust version, then some library shenanigans. It is way easier to build Emacs and ffmpeg.
Indeed, it seems insane that we're pining for the days of autotools, configure scripts and the cleanly inspectable dependency structure.
But... We absolutely are.
Remember that the pre-package-manager days meant ossified, archaic, insecure installations, because self-managing dependencies is hard and people didn't keep them up to date. You still need to get your deps from somewhere, so back then you just downloaded them from somewhere - a vendor's web site, or SourceForge, or whatever - probably didn't audit them, and hoped they were secure. It's still work to keep things up to date and audited, but less work at least.
If most of your deps are coming from the distro, they are audited already. In practice, I never had to add more than a handful of extra deps in any project I ever worked on. That's a no-brainer to manage.
Rust makes me especially nervous due to the possibility of compile-time code execution. So a cargo build invocation is all it could take to own you. In Go there is no such possibility by design.
The same applies to any Makefile, to the Python scripts invoked by CMake, or to pretty much any other scriptable build system. They are all untrusted scripts you download from the internet and run on your computer. Rust's build.rs is not really special in that regard.
Maybe go build doesn't allow this but most other language ecosystems share the same weakness.
You're confusing compile-time with build-time. And build-time code execution absolutely exists in Go, because that's what a build tool is. https://pkg.go.dev/cmd/go#hdr-Add_dependencies_to_current_mo...
Does it really matter, though? Presumably, if you're building something, it's so you can run it. Who cares if the build script itself executes code, when you're going to execute the final product anyway?
Build scripts aren't a big issue for Rust, because a simple mitigation is possible: do the build in a secure sandbox where only execution and network access are allowed - preferably as separate steps. Network access can be restricted to downloading dependencies only. Everything else, including access to the main filesystem, should be denied.
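A rough sketch of that two-step split with stock Docker (the image tag and volume name are just examples):

    # Step 1: network allowed, nothing compiled - cargo fetch only downloads.
    docker run --rm -v "$PWD":/src -v cargo-reg:/usr/local/cargo/registry \
        -w /src rust:1 cargo fetch

    # Step 2: code execution allowed, but no network, and only the
    # project directory and the cached registry are visible.
    docker run --rm --network none -v "$PWD":/src -v cargo-reg:/usr/local/cargo/registry \
        -w /src rust:1 cargo build --offline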
Runtime malicious code is a different matter. Rust has a security working group and tools to address this. But it still worries me.
> This is a bad direction, and we need to turn back now.
I don't deny there are some problems with package managers, but I also don't want to go back to a world where it is a huge pain to add any dependency, which leads to projects wasting effort on implementing things themselves, often in a buggy and/or inefficient way, and/or using huge libraries that try to do everything, but do nothing well.
It's a tradeoff. When package users had to manually install dependencies, package developers had to reckon with that friction. Now we're living in a world where developers don't care about another 10^X dependencies, because the package manager will just run the scripts and install the files, and the users will accept it.
Fully agree. That is why I vendor all my dependencies. On the Common Lisp side, a new tool emerged a while ago for that [1].
On top of that, I try to keep the dependencies to an absolute minimum. In my current project it's 15 dependencies, including the sub-dependencies.
[1]: https://github.com/fosskers/vend
I didn't vendor them, but I did do an eyeball scan of every package in the full tree for my project, primarily to gather their license requirements[1]. (This was surprisingly difficult for something that every project in theory must do to meet licensing requirements!) It amounted to approximately 50 dependencies pulled into the build, to create a single gstreamer plugin. Not a fan.
[1] https://github.com/ValveSoftware/Proton/commit/f21922d970888...
Vendoring is nice. Using the system version is nicer. If you can’t run on $current_debian, that’s very much a you problem. If postgres and nginx can do it, you can too.
This isn't as new as you make it out; ant + ivy, maven, and gradle had already started this in the '00s. It definitely turned into a mess, but I think the Java/cross-platform nature pushed this style of development along pretty heavily.
Before this, wasn't CPAN already big?
Back as in using fewer dependencies, or throwing a bunch of "certifying" services at all of them?
I feel that Rust increases security by avoiding a whole class of bugs (thanks to memory safety), but decreases security by making supply chain attacks easier (due to the large number of transitive dependencies required even for simple projects).
Who is requiring you to use large numbers of transitive dependencies? You can always write all the code yourself instead.
I'm actually really frustrated by how hard it's become to manually add, review, and understand the dependencies in my code. Libraries used to come with decent documentation; now it's just a couple lines of "npm install blah", as if that tells me anything.
Fully agree.
So many people are so drunk on the Kool-Aid that I often wonder if I'm the weirdo for not wanting dozens of third-party libraries just to build a simple HTTP client for a simple internal REST API. (No, I don't want tokio, Unicode, multipart forms, SSL, web sockets, …) At least Rust has "features". With pip and such, avoiding the kitchen sink is not an option.
I also find anything not extensively used has bugs or missing features I need. It’s easier to fork/replace a lot of simple dependencies than hope the maintainer merges my PR on a timeline convenient for my work.
If you don’t want Tokio I have bad news for you. Rust doesn’t ship an asynchronous runtime. So you’ll need something if you want to run async.
For this specific case an LLM may be a good option. You know what you want and could do it yourself, but who wants to type it all out? An LLM could generate an HTTP client from the socket level on up, and it would be straightforward to verify. "Create an HTTP client in $language with basic support for GET and POST requests that outputs the response to STDOUT without any third party libraries. After processing command line arguments, the first step should be opening a TCP socket." That should get you pretty far.
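The result really can be small. A sketch of that shape in Go (GET only, no TLS, no redirects, no chunked decoding - just enough to show the idea):

    package main

    import (
        "fmt"
        "io"
        "net"
        "os"
    )

    func main() {
        if len(os.Args) != 3 {
            fmt.Fprintln(os.Stderr, "usage: httpget <host> <path>")
            os.Exit(1)
        }
        host, path := os.Args[1], os.Args[2]

        // First step, per the prompt: open a plain TCP socket.
        conn, err := net.Dial("tcp", host+":80")
        if err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(1)
        }
        defer conn.Close()

        // Write the request line and headers by hand.
        fmt.Fprintf(conn, "GET %s HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n", path, host)

        // Dump the raw response (status line, headers, body) to stdout.
        if _, err := io.Copy(os.Stdout, conn); err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(1)
        }
    }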
Just use your fork until they merge your MR?
There is only one Rust application (a server) I use enough that I try to keep up and rebuild it from the latest release every now and then. New releases mostly just bump the versions of some of the 200 or so dependencies. I have no idea how I, or the server's maintainers, can have any clue what exactly is brought in with each release. How many upgrades times 200 projects before there is a near-100% chance of something bad being included?
The ideal number of both dependencies and releases is zero. That is the only way to know nothing bad was added. Sadly, much software seems to push for MORE, not fewer, of both. Languages and libraries keep changing their APIs, forcing cascades of unnecessary changes to everything. It's like we want supply chain attacks to hurt as much as possible.
I think something like cargo vet is the way forward: https://mozilla.github.io/cargo-vet/
Yes, it's a ton of overhead, and an equivalent will be needed for every language ecosystem.
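If I remember the tool right, the day-to-day shape is roughly:

    cargo install cargo-vet   # one-time install
    cargo vet init            # set up audits; existing deps start as exemptions
    cargo vet                 # fails when a new or updated dep has no audit
    cargo vet certify         # record that you actually reviewed a crate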
The internet was great too, before it became too monetizable. So was email -- I have fond memories of cold-emailing random professors about their papers or whatever, and getting detailed responses back. Spam killed that one. Dependency chains are the latest victim of human nature. This is why we can't have nice things.
Part of the value proposition for bringing in outside libraries was: when they improve it, you get that automatically.
Now the threat is: when they “improve” it, you get that automatically.
left-pad should have been a major wake up call. Instead, the lesson people took away from it seems to have mostly been, “haha, look at those idiots pulling in an entire dependency for ten lines of code. I, on the other hand, am intelligent and thoughtful because I pull in dependencies for a hundred lines of code.”
The problem is less the size of a single dependency but the transitivity of adding dependencies. It used to be, library developers sought to not depend on other libraries if they could avoid it, because it meant their users had to make their build systems more complicated. It was unusual for a complete project to have a dependency graph more than two levels deep. Package managers let you easily build these gigantic dependency graphs with ease. Great for productivity, not so much for security.
The size itself isn’t a problem, it’s just a rough indicator of the benefit you get. If it’s only replacing a hundred lines of code, is it really worth bringing in a dependency, and as you point out potentially many transitive dependencies, instead of writing your own? People understood this with left-pad but largely seemed unwilling to extrapolate it to somewhat larger libraries.
So, what's the acceptable LOC count threshold for using a library?
Maybe scolding and mocking people isn't a very effective security posture after all.
Time for everybody's favorite engineering answer: it depends! You have to weigh the cost/benefit tradeoff. But you have to do it in full awareness of the costs, including potential costs from packages being taken down, broken, or subverted. In any case, for an external dependency, 100 lines is way too low of a benefit.
I'm not trying to be effective, I'm just lamenting. Maybe being sarcastic isn't a very effective way to get people to be effective?
Scolding and mocking is all we're left with, since two decades worth of rational arguments against these types of hazards have been dismissed as fear-mongering.
Well that's just the difference between a library and building custom.
A library is by definition supposed to be somewhat generic, adaptable and configurable. That takes a lot of code.
I actually loathe those progress trackers. They break emacs shell (looking at you, expo and eas).
Why not print a simple counter like: ..10%..20%..30%
Or just: Uploading…
Terminal codes should be for TUI or interactive-only usage.
Carriage returns are good enough for progress bars, and seem to work fine in my emacs shell at least:
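    # something like this (sketch):
    i=0
    while [ "$i" -le 100 ]; do
        printf '\r%3d%%' "$i"
        i=$((i + 10))
        sleep 1
    done
    echo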
works fine for me, and that's with TERM set to "dumb". (I'm actually not sure why it cleared the line automatically though. I'm used to doing "\rmessage " to clear out the previous line.)
Admittedly, that'll spew a bunch of stuff if you're sending it to a pager, so I guess that ought to be
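    # only redraw in place when stdout is actually a terminal (sketch)
    if [ -t 1 ]; then
        printf '\r%3d%%' "$i"
    else
        printf '%3d%%\n' "$i"
    fi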
but I still haven't made it to 15 dependencies or 200 lines of code! I don't get a full-screen progress bar out of it either, but that's where I agree with you. I don't want one.
The problem is that the two pagers don't do everything that they should in this regard.
They are supposed to do things like the ul utility does, but neither BSD more nor less handles a CR emitted to overstrike the line from the beginning. They only handle overstriking characters with BS.
most handles overstriking with CR, though. Your output appears as intended when you page it with most.
* https://jedsoft.org/most/
I feel like not properly supporting widely used escape codes is an issue with the shell, not with the program that uses them
Try mistty
We are using NX heavily (and are not affected) in my teams at a larger insurance company. We have >10 standalone line-of-business apps and 25+ individual libraries in the same monorepo, managed by NX. I've toyed with other monorepo tools for this kind of complex setup in my career (lerna, rushjs, yarn workspaces), but none came close; lerna has basically been handed over to NX, and rushjs is unmaintained.
If you have a proposal for how to properly manage the complexity of a FE monorepo with dozens of daily developers involved and heavy CI/CD/DevOps integration, please post alternatives - given that security incident, many people are looking.
Shameless self-plug and probably not what you're looking for, but anyway: I've created https://github.com/abuob/yanice for that sort of monorepo size: too many applications/libraries to always run full builds, but still not Google-scale or similar.
It ultimately started as a small project because I got fed up with NX's antics a few years back (I think they've improved quite a lot since then though). I don't need caching, I don't need their cloud, I don't need their highly opinionated approach to how to structure a monorepository; all I needed was decent change detection to figure out which projects changed between the working tree and a given commit. I've since added support for enforcing module boundaries, as that's definitely a must in a monorepo.
In case anyone wants to try it out - would certainly appreciate feedback!
https://moonrepo.dev/ worked great for our team's setup. It also supports Bazel remote cache, agnostic to the vendor.
npm workspaces and npm scripts will get you further than you might think. Plenty of people got along fine with Lerna, which didn't do much more than that, for years.
I will say, I was always turned off by NX's core proposition when it launched, and more turned off by whatever they're selling as a CI/CD solution these days, but if it works for you, it works for you.
I'd recommend pnpm over npm for monorepos. Forcing you to be explicit about each package's dependencies is good.
I found npm's workspace features lacking in comparison and sparsely documented. It was also hard to find advice on the internet. I got the sense nobody was using npm workspaces for anything other than beginner articles.
The killer feature of NX is its build cache and the ability to operate on the git staged files. It takes a couple of minutes to build our entire repo on an M4 Pro. NX caches the builds of all libs and will only rebuild those that are affected. The same holds true for linting, prettier, tests, etc. Any solution that just executes full builds would be a non-starter for all our use cases.
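That's the affected-project filtering in the CLI, e.g. (recent NX syntax):

    npx nx affected -t build             # rebuild only projects affected by the change
    npx nx affected -t test --base=main  # compare against a branch instead of the default base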
I buried npm years ago; we are happily using yarn (v4 currently) in that project. Which also means that even if we were affected by the malware, nobody uses the .npmrc (we have a .yarnrc.yml instead) :)
moonrepo is pretty nice
Easier solution: you don’t need a progress bar.
Depends on the purpose… but I guess if you replace it with estimated time left, that may be good enough. Sometimes a progress bar is just there to tell you whether to stop the job because it's taking too much time.
It runs indefinitely to process small jobs. I could log stats somewhere, but it complicates things. Right now, it's just a single binary that automatically gets restarted in case of a problem.
Why not print on stdout, then redirect it to a file?
One of the wisest comments I've ever seen on HN.
Every feature is also a potential vulnerability.
And if you really do? Print the percentage to stdout.
> People really need to start thinking twice when adding a new dependency
I've been preaching this since ~2014 and had little luck getting people on board unless I have full control over a particular team (which is rare). The need to avoid "reinventing the wheel" seems so strong to so many.
I find if I read the source code of a dependency I might add, it's common that the part that I actually need is like 100 LOC rather than 1500 LOC.
Please keep preaching.
nx is not a random dependency. It's a multi-project management tool, package manager, build tool, and much more. It's backed by a commercial offering. A lot of serious projects use it for managing a lot of different concerns. This is not something silly like leftpad or is-even.
Using languages and frameworks that take a batteries-included approach to design helps a lot here too, since you don’t need to pull in third party code or write your own for every little thing.
It’s too bad that more robust languages and frameworks lost out to the import-world culture that we’re in now.
I’d like a package manager that essentially does a git clone, and a culture that says: “use very few dependencies, commit their source code in your repo, and review any changes when you do an update.” That would be a big improvement to the modern package management fiasco.
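For what it's worth, Go's module tooling already ships the mechanical half of this; the missing part is the culture:

    go mod vendor       # copy every dependency's source into ./vendor, commit it
    git diff vendor/    # on the next update, review exactly what changed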
Is that realistic though? What you're proposing is letting go of abstractions completely.
Say you need compression, you're going to review changes in the compression code? What about encryption, a networking library, what about the language you're using itself?
That means you need to be an expert on everything you run. Which means no one will be building anything non-trivial.
Small, trivial, things, each solving a very specific problem, and that can be fully understood, sounds pretty amazing though. Much better than what we have now.
Yes. I would review any changes to any 3rd party libraries. Why is that unrealistic?
Regarding the language itself, I may or may not. Generally, I pick languages that I trust. E.g. I don't trust Google, but I don't think the Go team would intentionally place malware in the core tools. Libraries, however, often are written by random strangers on the internet with a different level of trust.
That's what I used git submodules for. I had a /lib folder in my project where the dependencies were pulled/checked out. This was before I was doing CI/CD and before folks said git submodules were bad.
Personally, I loved it. I only looked at updating them when I was going to release a new version of my program. I could easily do a diff to see what changed. I might not have understood everything, but it wasn't too difficult to read 10-100 line code changes and get a general idea.
I thought it was better than the big black box we currently deal with. Oh, this package uses this package, and this package... what's different? No idea now, really.
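For anyone who hasn't used that workflow, it was roughly this (URL and paths illustrative):

    git submodule add https://example.com/foo.git lib/foo  # pin a dep at an exact commit
    git -C lib/foo log --oneline v1.2..v1.3                # what would an update bring?
    git diff --submodule=log                               # review pinned-commit changes before release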
That’s called the original Go package manager and it was pretty terrible
I think it was only terrible because the tooling wasn't great. I think it wouldn't be too terribly hard to build a good tool around this approach, though I admittedly have only thought about it for a few minutes.
I may try to put together a proof of concept, actually.
sounds like the best way to miss critical security upgrades
Why? If you had a package manager tell you "this is out of date and has vulnerability XYZ", you'd do a "gitpkg update" or whatever, and get the new code, review it, and if it passes review, deploy it.
That's why most mature (as in disciplined) projects have an RSS feed or a mailing list, so you know when there's a security bug and what to do about it.
But here's the catch. If you do that in a lot of places, you'll have a lot of extra code to manage.
So your suggested approach does not seem to scale well.
There's obviously a tradeoff there.
At some level of complexity it probably makes sense to import (and pin to a specific version by hash) a dependency, but at least in the JavaScript ecosystem, that level seems to be "one expression of three tokens" (https://www.npmjs.com/package/is-even).
In pure functional languages like Elm and Haskell, it is extremely easy to audit dependencies, because any side effect must be listed explicitly in the types, so you just search for those. That makes the risk way lower for dependencies, which is an underrated strength.
I've been saying this for a while: LLMs will get rid of a lot of libraries, and rightly so.
I honestly find that in Go, a lot of the time, it's easier and less code to just write whatever feature you're trying to implement than to use a package.
Compared to TypeScript, where it's a package plus the code to use said package, which always ended up being more LOC than anything comparable I've done in Go.
Without these dependencies there would be no training data for the AI to write your code from.
I could write it myself. It's trivial, just takes a bit more time, and googling escape sequences for the terminal to move the cursor and clear lines.
And still you looked for a library first.
And do you know what type of code the LLM was trained on? How do you know its sources were not compromised?
Why do I need to know that if I'm an experienced developer and I know exactly what the code is doing? The code is trivial: it just prints stuff to stdout along with escape sequences to update the output.
In this case, yes, but where do you draw the line?