Comment by chatmasta
4 months ago
My headcanon is that Docker exists because Python packaging and dependency management was so bad that dotCloud had no choice but to invent some porcelain on top of Linux containers, just to provide a pleasant experience for deploying Python apps.
That's basically correct. But the more general problem is that engineers simply lost the ability to succinctly package applications and their dependencies into simple-to-distribute-and-run packages. Somehow, around the same time Java made .jar files mainstream (just zip all the crap with a manifest), the rest of the world completely forgot how to do the equivalent of statically linking in libraries, and forgot that we're all running highly scheduled, multithreaded operating systems now.
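For anyone who never saw it up close, the .jar trick really is just that: zip the compiled classes together with a manifest naming the entry point, and ship one file. Something like this (class and file names are made up for illustration):

    javac Main.java
    # 'c' create, 'f' output file, 'e' entry point recorded in the manifest
    jar cfe app.jar Main Main.class
    # runs anywhere with a compatible JRE
    java -jar app.jar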
The "solution" for a long time was to spin up single-application Virtual Machines, which was a heavy way to solve it and reduced the overall system resources available to the application, making them stupidly inefficient solutions. The modern cloud was invented during this phase, which is why one of the base primitives of all current cloud systems is the VM.
Containers "solved" both the dependency distribution problem and the resource allocation problem, sort of at once.
> engineers simply lost the ability to succinctly package applications and their dependencies into simple-to-distribute-and-run packages.
But this is what Docker is.
If anything, Java kinda showed it doesn't have to suck, but since not everything is Java, you need something more general.
With the difference that with Docker you are shipping the runtime along with your source code as well.
And even a Java program may need a system-wide install of ffmpeg or OpenCV or libgtk, or VC runtime 2019 but not 2025, or some other dependency.
And sometimes you want to ship multiple services together.
In any case, 'docker run x' is easier and seemingly less error-prone than a single 'sudo apt-get install'.
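To make that concrete, compare something like this (Postgres is just an arbitrary example, and the flags are the usual ones, nothing special):

    # a specific version, isolated from the host, config passed via env vars
    docker run -d -e POSTGRES_PASSWORD=secret -p 5432:5432 postgres:16
    # vs. whichever version and filesystem layout your distro happens to ship
    sudo apt-get install postgresql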
I would argue that the traditional way to install applications (particularly servers) on UNIX wasn’t very compatible with the needs that arose in the 2000s.
The traditional way tends to assume that there will be only one version of something installed on a system. It also assumes that when installing a package you distribute binaries, config files, data files, libraries and whatnot across lots and lots of system directories. I grew up on traditional UNIX. I’ve spent 35+ years using perhaps 15-20 different flavors of UNIX, including some really, really obscure variants. For what I did up until around 2000, this was good enough. I liked learning about new variants. And more importantly: it was familiar to me.
It was around that time I started writing software for huge collections of servers sitting in data centers on a different continent. Out of necessity I had to make my software more robust and easier to manage. It had to coexist with lots of other stuff I had no control over.
It would have to be statically linked, everything I needed had to be in one place so you could easily install and uninstall. (Eventually in all-in-one JAR files when I started writing software in Java). And I couldn’t make too many assumptions about the environment my software was running in.
UNIX could have done with a re-thinking of how you deal with software, but that never happened. I think an important reason for this is that when you ask people to re-imagine something, it becomes more complex. We just can’t help ourselves.
Look at how we reimagined managing services with systemd. Yes, now that it has matured a bit and people are getting used to it, it isn't terrible. But it also isn't good. No part of it is simple. No part of it is elegant. Even the command line tools are awkward. Even the naming of the command line tools fails the most basic litmus test (long prefixes that require too many keystrokes to tab-complete say a lot about how people think about usability - or don't).
Again, systemd isn’t bad. But it certainly isn’t great.
As for blaming Python, well, blame the people who write software for _distribution_ in Python. Python isn’t a language that lends itself to writing software for distribution and the Python community isn’t the kind of community that will fix it.
Point out that it is problematic and you will be pointed to whatever mitigation is popular at the time (to quote Queen: "I've fallen in love for the first time. And this time I know it's for real"), and people will get upset with you, downvote you and call you names.
I’m too old to spend time on this so for me it is much easier to just ban Python from my projects. I’ve tried many times, I’ve been patient, and it always ends up biting me in the ass. Something more substantial has to happen before I’ll waste another minute on it.
> UNIX could have done with a re-thinking of how you deal with software, but that never happened.
I think it did, but the Unix world has an inherent bad case of "not invented here" syndrome, and a deep cultural reluctance to admit that other systems (OSes, languages, and more) do some things better.
NeXTstep fixed a big swath of issues (in the mid-to-late 1980s). It threw out X and replaced it with Display Postscript. It threw out some of the traditional filesystem layout and replaced it with `.app` bundles: every app in its own directory hierarchy, along with all its dependencies. Isolation and dependency packaging in one.
(NeXT realised this is important, but also that it has to be readable and user-friendly: it replaces the traditional filesystem layout with something more readable. 15 years later, Nix realised the first lesson but forgot the second, so it throws out the traditional FHS and replaces it with something less readable, which needs software to manage it. The NeXT way means you can install an app with a single `cp` command or one drag-and-drop operation.)
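For anyone who never poked inside one, the bundle layout as it survives in today's macOS looks roughly like this ('Foo' is a placeholder app name):

    Foo.app/
      Contents/
        Info.plist      <- metadata: bundle id, version, entry point
        MacOS/Foo       <- the executable itself
        Frameworks/     <- bundled libraries the app depends on
        Resources/      <- icons, localisations, data files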
Some of this filtered back upstream to Ritchie, Thompson and Pike, resulting in Plan 9: bin X, replace it with something simpler and filesystem-based. Virtualise the filesystem, so everything is in a container with a virtual filesystem.
But it wasn't Unixy enough so you couldn't move existing code to it. And it wasn't FOSS, and arrived at the same time as a just-barely-good-enough FOSS Unix for COTS hardware was coming: Linux on x86.
(The BSDs treated x86 as a 2nd class citizen, with grudging limited support and the traditional infighting.)
I agree with you that the issue is packaging. And having developers trying to package software is the issue IMO. They will come up with the most complicated build system to handle all scenarios, and the end result will be brittle and unwieldy.
There's also the overly restrictive dependency list, because each dep in turn is happy to break its API every 6 months.
Exactly this, but not just Python. The traditional way most Linux apps work is that they are splayed over your filesystem with hard-coded references to absolute paths, and they expect you to provide all of their dependencies for them.
Basically the Linux world was actively designed to make apps difficult to distribute.
It wasn't about making apps difficult to distribute at all; that's a later side effect. Originally distros were built around making a coherent, unified system of package management that made it easier to manage a system, due to everything being built on the same base. Back then Linux users were sysadmins and/or C programmers managing (very few) code dependencies via tarballs. With some CPAN around too.
For a sysadmin, distros like Debian were an innovative godsend for installing and patching stuff. Especially compared to the hell that was Windows server sysadmin back in the 90s.
The developer oriented language ecosystem dependency explosion was a more recent thing. When the core distros started, apps were distributed as tarballs of source code. The distros were the next step in distribution - hence the name.
Right but those things are not unrelated. Back in the day if you suggested to the average FOSS developer that maybe it should just be possible to download a zip of binaries, unzip it anywhere and run it with no extra effort (like on Windows), they would say that that is actively bad.
You should be installing it from a distro package!!
What about security updates of dependencies??
And so on. Docker basically overrules these impractical ideas.
> Basically the Linux world was actively designed to make apps difficult to distribute.
It has "too many experts", meaning that everyone has too much decision making power to force their own tiny variations into existing tools. So you end up needing 5+ different Python versions spread all over the file system just to run basic programs.
It was more like: library writers forgot how to provide stable APIs for their software, and applications decided they just wanted to bundle all the dependencies they needed together and damn the consequences for the rest of the system. Hence we got statically linked binaries and then containers.
even if you have a stable interface... the user might not want to install it and then forget to remove it down the line
Pretty much this; systems with coherent isolated dependency management, like Java, never required OS-level container solutions.
They did have what you could call userspace container management via application servers, though.
NodeJS, Ruby, etc. also have this problem, as does Go with CGO. So the problem is the binary dependencies with C/C++ code and make, configure, autotools, etc... The whole C/C++ compilation story is such a mess that, almost five decades in, inventing containers was pretty much the only sane way of tackling it.
Java at least uses binary dependencies very rarely, and they usually have the decency of bundling the compiled dependencies... But it seems Java and Go just saw the writing on the wall and mostly reimplement everything. I did have problems with the Snappy compression in the Kafka libraries, though, for instance.
The issue is cross-platform package management without proper hooks for the platforms themselves. That may be OK if the library is pure, but as soon as you have bindings to another ecosystem (C/C++ in most cases), then it should be user-configurable instead of the provider doing the configuration with post-install scripts and other hacky stuff.
If you look at most projects in the C world, they only provide the list of dependencies and some build config (Makefile/Meson/CMake/...). But the latter is more of a sample, and if your platform is not common or differs from the developer's, you have the option to modify it (which is what most distros and port systems do).
But good luck doing that with the sprawling tree of modern package managers, where there are multiple copies of the same libraries inside the same project just because.
Funnily enough, one of the first containers I made at my current job was to package a legacy Java app.
It was pretty old and required a very specific version of Java, not available on modern systems. Plus some config files in global locations.
Packaging it in the docker container made it so much easier to use.
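Roughly the shape of it, for anyone curious; the image tag, paths and file names below are illustrative rather than the real app:

    # pin the exact old JRE the app was built against
    FROM openjdk:8-jre
    COPY legacy-app.jar /opt/app/legacy-app.jar
    # the config that used to live in a global location, baked into the image
    COPY app.conf /etc/legacy-app/app.conf
    CMD ["java", "-jar", "/opt/app/legacy-app.jar"]

Once it's built, 'docker run' is all anyone needs to know, and the ancient JRE never touches the host.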
That makes sense, but still: there is nothing OS-specific here, like a system lib or even a database; it's just the JRE version that you needed to package into the container. Or am I missing something?
I don't agree with this. Java systems were one of the earliest beneficiaries of container-based systems, which essentially obsoleted those ridiculously over-complicated, language-specific application servers that you mentioned.
Java users largely didn't bother with containers IME, for much the same reasons that most Java users didn't bother with application servers. Those who did want that functionality already had it available, making the move from an existing Java application server to Docker-style containers a minor upgrade at best.
Tomcat and Jetty are application servers, and they are in almost every Spring application. There are application servers of the kind you mentioned, like WildFly, but application servers as a whole are not obsolete.
Not really, and apparently there is enough value that there are now startups replicating application servers by running WebAssembly Docker containers as Kubernetes pods.
PyInstaller predates Docker. It's not about any individual language not being able to do packaging; it's about having a uniform interface for running applications in any language/architecture. That's why platforms like K8s don't have to know a thing about Python or anything else, and they automatically support any future languages too.
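For reference, the PyInstaller flow is roughly this, with 'app.py' standing in for whatever the entry point is:

    pip install pyinstaller
    # --onefile bundles the interpreter, your code and deps into one executable under dist/
    pyinstaller --onefile app.py
    ./dist/app

Which is exactly the single-file story, but only for Python; Docker's trick was making that interface language-agnostic.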
Sure, they definitely were using Docker for their own applications, but dotCloud was also itself a PaaS, so they were trying to compete with Heroku and similar offerings, which had buildpacks.
The problem is/was that buildpacks aren't as flexible and only work if the buildpack exists for your language/runtime/stack.