Comment by dajonker
1 day ago
> Making Kubernetes good is inherently impossible, a project in putting (admittedly high quality) lipstick on a pig.
So well put, my good sir, this describes exactly my feelings with k8s. It always starts off all good with just managing a couple of containers to run your web app. Then before you know it, the devops folks have decided that they need to put a gazillion other services and an entire software-defined networking layer on top of it.
After spending a lot of time "optimizing" or "hardening" the cluster, cloud spend has doubled or tripled. Incidents have also doubled or tripled, as has downtime. Debugging effort has doubled or tripled as well.
I ended up saying goodbye to those devops folks, nuked the cluster, booted up a single VM with Debian, enabled the firewall and used Kamal to deploy the app with Docker. Despite having only a single VM rather than a cluster, things have never been more stable and reliable from an infrastructure point of view. Costs have plummeted as well, it's so much cheaper to run. It's also so much easier and more fun to debug.
And yes, a single VM really is fine, you can get REALLY big VMs which is fine for most business applications like we run. Most business applications only have hundreds to thousands of users. The cloud provider (Google in our case) manages hardware failures. In case we need to upgrade with downtime, we spin up a second VM next to it, provision it, and update the IP address in Cloudflare. Not even any need for a load balancer.
If you spin up Kubernetes for "a couple of containers to run your web app", I think you're doing something wrong in the first place, also coupled with your comment about adding SDN to Kubernetes.
People use Kubernetes for way too small things, and it sounds like you don't have the scale for actually running Kubernetes.
It depends on what you're doing.
My app is fairly simple node process with some side car worker processes. k8s enables me to deploy it 30 times for 30 PRs, trivially, in a standard way, with standard cleanup.
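As a rough illustration of the per-PR pattern described above, here is a minimal sketch that generates one namespace and one Deployment per pull request. The function name `preview_list`, the `pr-<number>` naming scheme, the container names and the image reference are all assumptions made for the example, not details from the comment.

```python
import json


def preview_list(pr_number: int, image: str) -> dict:
    """Build a Kubernetes List holding a Namespace and a Deployment for one PR.

    All names here (pr-<n>, "web", "worker") are hypothetical.
    """
    namespace = f"pr-{pr_number}"
    labels = {"app": "web", "preview": namespace}
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "web", "namespace": namespace, "labels": labels},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        # Main node process plus one sidecar worker (assumed layout).
                        {"name": "web", "image": image},
                        {"name": "worker", "image": image, "args": ["worker"]},
                    ]
                },
            },
        },
    }
    return {
        "apiVersion": "v1",
        "kind": "List",
        "items": [
            {"apiVersion": "v1", "kind": "Namespace", "metadata": {"name": namespace}},
            deployment,
        ],
    }


if __name__ == "__main__":
    # Print the manifest for PR 7; the registry URL is a placeholder.
    print(json.dumps(preview_list(7, "registry.example/app:pr-7"), indent=2))
```

The output can be piped into `kubectl apply -f -`, and deleting the namespace (`kubectl delete namespace pr-7`) removes everything created for that PR, which is presumably the "standard cleanup" the comment refers to.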
Can I do that without k8s? Yes. To the same standard with the same amount of effort? Probably not. Here, I'd argue the k8s APIs and interfaces are better than trying to do this on AWS ( or your preferred cloud provider ).
Where things get complicated is k8s itself is borderline cloud provider software. So teams who were previously good using a managed service are now owning more of the stack, and these random devops heroes aren't necessarily making good decisions everywhere.
So you really have three obvious use cases:
a) You're doing something interesting with the k8s APIs, that aren't easy to do on a cloud provider. Essentially, you're a power user.
b) You want a cloud abstraction layer because you're multi-cloud or you want a lock-in bargaining chip.
c) You want cloud semantics without being on a cloud provider.
However, if you're a single developer with a single machine, or a very small team and you're happy working through contended static environments, you can pretty much just put a process on a box and call it done. k8s is overkill here, though not as much as people claim until the devops heroes start their work.
Call me old-fashioned but I prefer tools like Dokploy that make deployment across different VPS extremely easy. Dokploy allows me to utilize my home media server, using local instances of forgejo to deploy code, to great effect.
k8s appears to be a corporate welfare jobs program where trillion dollar multinational monopolistic companies are the only ones who can collectively spend 100s of millions sustaining. Since most companies aren't trillion dollar monopolies, adopting such measures seems extremely poor.
All it signals to me is that we have to stop letting SV + VC dictate the direction of tech in our industry, because their solutions are unsustainable and borderline useless for the vast majority of use cases.
I'll never forget the insurance companies I worked at that orchestrated every single repo with a k8s deployment whose cloud spend was easily in the high six figures a month to handle a workload of 100k/MAU where the concurrent peak never went more than 5,000 users, something the company did know with 40 years of records. Literally had a 20 person team whose entire existence was managing the company's k8s setup. Only reason the company could sustain this was that it's an insurance company (insurance companies are highly profitable, don't let them convince you otherwise; so profitable that the government has to regulate how much profit they're legally allowed to make).
Absolute insanity, unsustainable, and a tremendous waste of limited human resources.
Glad you like it for your node app tho, happy for you.
> I'd argue the k8s APIs and interfaces are better than trying to do this on AWS
I think Amazon ECS is within striking distance, at least. It does less than K8S, but if it fits your needs, I find it an easier deployment target than K8S. There's just a lot less going on.
Totally, it's all about the primitives. I'm curious where exe.dev is gonna build on the base, or just leave it up to folks to add all their own bespoke stuff to do containers, logs, etc.
The last 20 years has given us a lot of great primitives for folks to plug in, I think that lots of people don't want to wrangle those primitives, they just want to use them.
> a) You're doing something interesting with the k8s APIs, that aren't easy to do on a cloud provider. Essentially, you're a power user. b) You want a cloud abstraction layer because you're multi-cloud or you want a lock-in bargaining chip. c) You want cloud semantics without being on a cloud provider.
This is well put and it's very similar to the arguments made when comparing programming languages. At the end of the day you can accomplish the same tasks no matter which interface you choose.
Personally I've never found kubernetes that difficult to use[1]. It has some weird, unpredictable bits, but so does sysvinit or docker, that just ends up being whatever you're used to.
[1] except for having to install your own network mesh plugin. That part sucked.
Depends. For personal projects, yeah definitely. But at work? Typically the “Platform” team can only afford to support 1 (maybe 2) ways of deployment, and k8s is quite versatile, so even if you need 1 small service, you’ll go with the self-service-k8s approach your Platform team offers. Because the alternative is for you (or your team) to own the whole infrastructure stack for your new deployment model (ecs? lambda? Whatever): so you need to set up service accounts, secret paths, firewalls, security, pipelines, registries, and a large etc. And most likely, no one will give you access rights for all of that, and your PM won’t accept the overhead either.
So having everyone use the same deployment model (and that’s typically k8s) saves effort. I don’t like it for sure
This is where I'm at. Using Podman daily to run Python scripts and apps and it's been going great! However trying to build things like monitoring, secure secret injection, centralized inventory, remote logging, etc. has fallen on us. Has led to some shadow IT (running our own container image registry, hashicorp vault instance, etc.) which makes me hesitant to share with others in the company how we're operating.
I like to think if we had a K8s environment a lot of this would be built out within it. Having that functionality abstracted away from the developer would be a huge win in my opinion.
I totally agree, but that's not what happens in reality: the average devops knows k8s and will slap it onto anything they see (if only so they can put it on their resume). The average manager hears about k8s, gets convinced they need it and hires the aforementioned devops to build it.
> the average devops knows k8s and will slap it onto anything they see
This is certainly the case from all the third person accounts I hear. Online. I never actually met a single one that is like that, if anything, those same people are the ones that are first to tell me about their Hetzner setups.
And the average developer doesn't even know where to start to deploy things in prod. When the feature product asks for passes QA... on to the next sprint! We are done!
> the average devops knows k8s
If you'd know Kubernetes, you know not to use it. I say that as someone who used to do consulting for it.
The reality is that yet again "making money" completely collides with efficient, quality, sane productive work.
For me one of the main reasons to leave that space is that I couldn't really deal with the fact that my work collides with a client's success. That said I have helped to get off that stuff and other things that they thought they needed, that just wasted time and money. It just feels odd going into a company that hired you to consult on a topic only to end up telling them "The best approach for you is not doing that at all". Often never. Like some people thought "Well, if we have hundreds of thousands or even millions of users" and the reality was that even in these scenarios if you went away from that abstract thought and discussed a hypothetical based on their product they realized that they'd still be better off without it. Besides the fact that this hypothetical often was in a future that made it likely that they said they'd likely have completely different setup so preparing for that didn't even make sense.
I think a big thing related to that was/is the microservice craze where people end up moving to a complex architecture for not many good reasons and then they increase complexity way faster than what they actually deliver in terms of the product, because it somehow feels good. I know it does, I've been there. When in reality the outcome often is just a complex mess with what could have been a relatively simple monolith. And these monoliths do work. And in the vast majority of cases they are easy to scale, because your problem switches from "how do we best allocate that huge amount of very different services across our infrastructure" to (for the most part) "how do we spin up our monolith on one more server" which tends to be a way easier problem to tackle.
And nothing stops you from still using everything else if you want. Just because it's a monolith doesn't mean you need to skip on any of the cloud offerings, etc. For some reason there seems to be that idea that if you write a monolith you are somehow barred from using modern tooling, infrastructure, services, etc. Not sure where that comes from.
In some sense, Kubernetes is just a portable platform for running Linux services, even on a single node using something like K3s. I almost see it as being an extension of the Linux OS layer.
This is what I do for small stuff, debian vm, k3s on it for a nicer http based deployment api.
Then why can't we put a wrapper onto systemd and make that into a light weight k8s?
Yep, this is the way. Linux is just a platform for running services on one or more computers without needing to know about those computers individually, and even if your scale is 1, it's often easier to install k3s and manage your services with it rather than memorizing a bunch of disparate tools with their own configuration languages, filepath conventions, etc. It's just a lot easier to use k3s than it is to cobble together stuff with traditional linux tools. It's a standard, scalable pane of glass and as much as I may dislike kubectl, it's worlds better than systemctl and journalctl and the like.
I know that "resume-driven development" exists, where the tradeoffs between approaches aren't about the technical fit of the solution but the career trajectory. I've seen people making plain workstation preparation scripts using Rust, only to have something to flex about in interviews.
I'm not surprised even in the slightest that DevOps workers will slap k8s on everything, to show "real industry experience" in a job market where the resume matches the tools.
Your first example sound very sensible to me?
Using new technology in something small and unimportant like a setup script is a perfect way to experiment and learn. It would be irresponsible to build something important as the first thing you do in a new language.
We are building a religion, we are building it bigger
We are widening the corridors and adding more lanes
We are building a religion, a limited edition
We are now accepting coders linking new AI brains
(Apologies to Cake. And coders.)
There are also people with a devops title who do not know anything other than the hammer, and then everything is a hammer problem.
I mean, I worked with people who were surprised that you can run more than 1 app inside an EC2 VM.
> People use Kubernetes for way too small things, and it sounds like you don't have the scale for actually running Kubernetes.
This is a problem I've run into in enterprise deployments. K8s is often the lowest common denominator that semi-small platform engineering teams arrive at. At my current employer, a platform-managed K8s namespace is the only thing we get in terms of a PaaS offering, so it is what we use. Is it overpowered? Yes. Is it overly complex for our use case? Definitely. Could we basically get by hosting our services on a few cheap mini computers with no performance penalty? Also yes.
Doing Kubernetes, like doing Agile, is mandatory nowadays. I've been asked to package a 20-line bash script as a Docker image so it can be delivered through a CI/CD pipeline to Kubernetes pods in the cloud.
The value is not that I got the job done at a day's notice. It is a black mark that I couldn't package it as per industry best practices.
Not doing it would mean being out of a job. Whether it is happening correctly is not something decision makers care about, as long as it is getting done somehow.
In my 20+ years in the industry, I've been at one company which really did Agile, and that was the one I started with.
Everyone else is communicating they are doing Agile while being very far away from it ;)
It depends on your situation of course, but there are a lot of good reasons to package up that bash script and run it through the pipeline. If everyone does some backdoor deployment of their snowflake shell script that's not great. It doesn't matter if it's 20 lines or 2 lines.
There are many organizations which still ship software without Kubernetes. Perhaps even the vast majority.
I don't think there are any other industry best practices you could have followed.
That's basically why k8s is so compelling. Its tech is fine but it's a social technology that is known and can be rallied behind, that has consistent patterns that apply to anything you might dream of making "cloud native". What you did to get this script available for use will closely mirror how anyone else would also get any piece of software available.
Meanwhile conventional sys-op stuff was cobbling together "right sized" solutions that work well for the company, maybe. These threads are overrun with "you might not need k8s" and "use the solution that fits your needs", but man, I pity the companies doing their own frontiers-ing to explore their own bespoke "simple" paths.
I do think you are on to something with there not being good taste-making, and not always good oversight.
We have a hobby web based app that consists of multiple containers. It runs in docker compose. Serves 1000 users right now (runs 24/7). Single VM.
No Kubernetes whatsoever.
I agree with you.
Docker compose is brilliant while your stack remains on a single box, and will scale quite nicely for some time this way for most applications with minimum maintenance overhead.
My personal strategy has always been to start off in docker compose, and break out to a k8s configuration later if I have to start scaling beyond single box.
> it sounds like you don't have the scale for actually running Kubernetes.
You don't set up k8s because your current load can't be handled, you do for future growth. Sometimes that growth doesn't pan out and now you're left with a complex infrastructure that is expensive to maintain and not getting any of the benefit.
k8s is useful when you have services that must spin up and down together, and you want to swap out services and deploy all/some/one.
and then also package this so that you and other developers can get the infrastructure running locally or on other machines.
They use it for inflating their resume for career progression rather than actually evaluating if they need it in the first place.
This is why you get many folks over-thinking the solution and picking the most hyped technologies and using them to solve the wrong problems without thinking about what they are selling.
You don't need K8s + AWS EC2 + S3 just to host a web app. That tells me they like lighting money on fire and bankrupting the company and moving to the next one.
Often the alternatives presented as cheaper to me in discussions are actually burning money.
But given how I always see "you don't need k8s because you're not going to scale so fast" I feel like even professional k8s operators have missed the fundamental design goals of it :/ (maximizing utilization of finite compute)
Even if using just one VM, I'll probably slap k3s on it and manage my application using manifests. It's just so much easier than dealing with puppet or chef or vanilla cloud-init. Docker compose works too, but at that point it's just easier to stick with k3s and then I can have nice things like background jobs, a straightforward path to HA, access to an ecosystem of existing software, and a nicer CLI.
That's what I don't get when people bring up this idea that k8s is complicated.
All of those other tools are complicated and fragile
yeah it's like wanting to drive to the mall in the Space Shuttle and then complaining how it's too complicated
The problem with Kubernetes is that it doesn't scale down to small deployments very well, but it sure as shit doesn't scale up to large ones either. Large shared multi-tenant clusters have massive problems even when running parts of the same application with the same incentives, it falls apart completely when the tenants are diverse.
Nomad has neither of these problems.
I have no doubt that there are legit use cases for something like k8s at Google or other multi-billion companies.
But if its use was confined to this use case, pretty much nobody would be using it (unless as a customer of the organization's infra) and barely would be talking about it (like how there isn't too much talk about Borg).
The reason k8s is a thing in the first place is because it's being used by way too many people for their own good. (Most people who have worked in startups have met too many architecture astronauts in their lives.)
If I had to bet, I'd wager that 99% of k8s users are in the “spin a few containers to run your web app” category (for the simple reason that for one billion-dollar tech business using it for legit reasons, there are many thousands of early startups who do not).
The legit use case for companies like Google/Amazon etc is only to sell it to customers. None of these companies use K8s internally for real critical workloads.
And those devops folks just let your single debian VM be? It sounds like you have, like many of us, an organizational/people problem, not a k8s problem.
Maybe those devops folks only pay attention to k8s clusters and you're flying under their radar with your single debian VM + Kamal. But the same thinking that results in an overtly complex, impossible to debug, expensive to run k8s cluster can absolutely result in the same using regular VMs unless, again, you are just left to your own devices because their policies don't apply to VMs, yet.
The problem usually is you're one mistake away from someone shoving their nose in it. "What are you doing again? What about HA and redundancy? slow rollout and rollback? You must have at least 3 VMs (ideally 5) and can't expose all VMs to the internet of course. You must define a virtual network with policies that we can control and no wireguard isn't approved. You must split the internet facing load balancer from the backend resources and assign different identities with proper scoping to them. Install these 4 different security scanners, these 2 log processors, this watchdog and this network monitor. Are you doing mtls between the VMs on the private network? what if there is an attacker that gains access to your network? What if your proxy is compromised? do you have visibility into all traffic on the network? everything must flow through this appliance"
I mean, it's pretty clear the only reason they even got to swap to a single VM and take the glory is because they fired the devops in question. As in, they're the actual boss of a small operation. That's what saying goodbye and nuking the cluster implies here.
A single VM is indeed the most pragmatic setup that most apps really need. However I still prefer to have at least two for a little redundancy and peace of mind. It’s just less stressful to do any upgrades or changes knowing there is another replica in case of a failure.
And I’m building and happily using Uncloud (https://github.com/psviderski/uncloud) for this (inspired by Kamal). It makes multi-machine setups as simple as a single VM. Creates a zero-config WireGuard overlay network and uses the standard Docker Compose spec to deploy to multiple VMs. There is no orchestrator or control plane complexity. Start with one VM, then add another when needed, can even mix cloud VMs and on-prem.
People have it backwards.
If you have an app and you want to run a single app yeah silly to look for K8s.
If you have a beefy server or two you want to utilize fully and put as many apps on it without clashing dependencies you want to use K8s or docker or other containers. Where K8s enables you to go further.
Why would you want to use K8s for one or two beefy servers? It's designed for solving a different problem at a large scale.
That looks pretty interesting. Is it being used in production yet (I mean serious installs) ?
Yes but at small scale. Myself and a handful of others from our Discord run it in production. The core build/push/deploy workflows are stable and most of the heavy lifting at runtime is done by battle-tested projects: Docker, Caddy, WireGuard, Corrosion from Fly.io.
Radboud University recently announced they're rolling it out for managing containers across the faculty which is the most "serious install" I know about, but there could be others: https://cncz.science.ru.nl/en/news/2026-04-15_uncloud/
this is dope work.
I don't get it, I think that k8s is the best software written since win95. It redefines computing in the same way IMHO. I have some experience in working with k8s on prod and I loved every moment of it. I'm definitely missing something.
Took a while to find this. K8s is great, IMO most of the people with alternative setups are just rebuilding (usually worse) or compressing (specific to their use case) k8s features that have been GA for a long time.
Spend some time learning it, using it to deploy simple apps, and you won't go back to deploying in a VM again imo.
This only gets better with AI-assisted development: any model is going to produce much better results for k8s, given the huge training set, than for someone's bespoke Rube Goldberg machine.
I deploy prod by running a shell script I wrote that rsyncs the latest version of the codebase to my server, then sshs into the server and restarts the relevant services
how could k8s improve my deployment process?
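For readers who want to picture the rsync-then-restart flow described in this comment, here is a minimal sketch. The commenter's actual script is not shown in the thread, so the function names, the use of `systemctl` for restarts, and the host and path values are assumptions for illustration only.

```python
import shlex
import subprocess


def deploy_commands(src_dir: str, host: str, dest_dir: str, services: list[str]) -> list[list[str]]:
    """Return the two commands of an rsync-then-restart deploy.

    Restarting through systemctl is an assumption; the comment only says
    "restarts the relevant services".
    """
    if not services:
        raise ValueError("at least one service is required")
    # Trailing slash makes rsync copy the directory contents, not the directory itself.
    rsync = ["rsync", "-az", "--delete", src_dir.rstrip("/") + "/", f"{host}:{dest_dir}"]
    restart = " && ".join(f"sudo systemctl restart {shlex.quote(s)}" for s in services)
    return [rsync, ["ssh", host, restart]]


def deploy(commands: list[list[str]], dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, execute it and stop on failure."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical host, path and service names; dry run only prints the commands.
    deploy(deploy_commands("./app", "deploy@example.com", "/srv/app", ["web", "worker"]))
```

The sketch makes the trade-off in the question visible: the whole process is two commands, but ordering, rollback and health checks are left to whoever runs it.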
I think it's just that k8s allows you to shoot yourself in the foot, thus it gets all the blame.
when in reality, you can go very bare-bones with k8s, but people pretend like only the most extreme complexity is what's possible because it's not easy to admit that k8s is actually quite practical in a lot of ways, especially for avoiding drift and automation
that's my take on it
Can you expand how it redefined computing for you personally?
it's always a skill issue when it comes to people complaining about k8s
knowing when and when not to use k8s, is also a skill
Missing some hn snobbery
I noticed in his article he said something like 'and then devops team puts a ton of complexity...' which doesn't seem like a k8s problem.
You're not missing anything. There's legions of amateurs that dislike k8s because they don't understand the value.
> the best software written since win95
This feels like what us Brits would call "damning with faint praise".
Windows 95 was terrible. Really bad. If you really mean to say that Kubernetes is revolutionary and well-engineered, Windows 2000 would be a much better example.
it sold like 7 mil copies in a month. yes 98 was much more polished overall but 95 revolutionized personal computing as it was much more accessible than NeXT stuff
I thought we collectively learned this from Stack Overflow's engineering blog years ago.
Scale vertically until you can't because you're unlikely to hit a limit and if you do you'll have enough money to pay someone else to solve it.
Docker is amazing development tooling but it makes for horrible production infrastructure.
Docker is great development tooling (still some rough edges, of course).
Docker Compose is good for running things on a single server as well.
Docker Swarm and Hashicorp Nomad are good for multi-server setups.
Kubernetes is... enterprise and I guess there's a scale where it makes sense. K3s and similar sort of fill the gap, but I guess it's a matter of what you know and prefer at that point.
Throw on Portainer on a server and the DX is pretty casual (when it works and doesn't have weird networking issues).
Of course, there's also other options for OCI containers, like Podman.
> Docker Swarm
Is that a thing still?
> Kubernetes is... enterprise
I would contest that. It's complex, but not enterprise.
Nomad is a great tool for running processes on things. The problem is attaching loadbalancers/reverse proxies to those processes requires engineering. It comes for "free" with k8s with ingress controllers.
> Docker is great development tooling (still some rough edges, of course).
Show me a Docker in use where build caching was solved optimally for development builds (like eg. make did for C 40 or 50 years ago)?
Perhaps you consider Docker layers one of the "rough edges", but I believe instant, iterative development builds are a minimum required for "great development tooling".
I did have great fun optimizing Docker build times, but more in the "it's a great engineering challenge to make this shitty thing build fast" sense.
This is why there's an endless cycle of shitty SaaS with slow APIs and high downtime. People keep thinking that scale is something you can just add later.
What's a more reasonable general approach then?
Let's say you're a team of 1-3 technical people building something as an MVP, but don't necessarily want to throw everything away and rewrite or re-architect if it gets traction.
What are your day 1 decisions that let you scale later without over-engineering early?
I'm not disagreeing with you btw. I genuinely don't know a "right" answer here.
I'd argue on the contrary that it's the last decades' over-engineering bender that's coming home to roost. Now too many things have too many moving parts to keep stable.
Clearly, Kubernetes wasn’t the right solution for your case, and I also agree that using it for smaller architectures is overkill. That said, it’s the standard for large-scale production platforms that need reproducibility and high availability. As of today I don’t see many *truly* viable alternatives and honestly I haven't even seen them.
I dunno the more people dig into this approach they will probably end up just reinventing Kubernetes.
I use k3s/Rancher with Ansible and use dedicated VMs on various providers. Using Flannel with wireguard connects them all together.
This I think is reasonable solution as the main problem with cloud providers is they are just price gouging.
I always feel like I am taking crazy pills when I read these threads. The k8s API and manifests config feels like a great standardized way to deploy containers. I wouldn't want to run a k8s cluster from scratch but EKS has been pretty straightforward to work with. Being able to use kind locally for testing is amazing and k9s is my new favourite infra monitoring tool.
Even if you just run on 2 nodes with k3s it seems worth it to me for the standardized tooling. Yes, it is not a $5 a month setup but frankly if what you host can be served by a single $5 a month VM I don't particularly care about your insights, they are irrelevant in a work context.
Yes, I mean, I’m an engineer on a cloud Kubernetes service, and I don’t run Kubernetes for my home services. I just run podman quadlets (systemd units). But that is entirely different from an enterprise scale setup with monitoring, alerting, and scale in mind…
Similar deal here. My $dayjob title is "Cloud Engineer" and I spend a lot of my time working with AKS and Istio. But for some recent personal projects at home, I've just been running Docker Swarm on a single server. It's just lighter and less complicated, and for what I'm doing it more than satisfies my needs. Now if this was going to production at mass scale, I might consider switching to K8S, but for experimentation and initial development, it would be way overkill.
> But that is entirely different from an enterprise scale setup with monitoring, alerting, and scale in mind
Do you have experience with Kubernetes solving these issues? Would love to hear more if so.
Currently running podman containers at work and trying to figure out better solutions for monitoring, alerting, etc. Not so worried about scale (my simple python scripts don't need it) but abstracting away the monitoring, alerting, secure secret injection, etc. seems like it'd be a huge win.
Kubernetes offers powerful low-level primitives that can support virtually any deployment architecture. However, working with these primitives directly requires significant YAML wrangling. It makes sense to build specialized solutions on top of Kubernetes that simplify common deployment patterns. Knative is one such solution. Any solution that tries to expose all underlying primitives will inevitably become as complex as Kubernetes itself.
I have been building https://github.com/openrundev/openrun, which provides a declarative solution to deploy internal web apps for teams (with SAML/OAuth and RBAC). OpenRun runs on a single-machine with Docker or it can deploy apps to Kubernetes.
As the strongest engineer I ever worked with commented: "Across multiple FAANG-adjacent companies, I've never seen a k8s migration go well and not require a complete reimplementation of k8s behind the APIs."
Is that because kubernetes was the right fit from the beginning, or because the initial implementation was designed around kubernetes, which caused the migration to eventually end up taking that same shape?
Cloud providers have put a lot of time and effort into making you believe every web app needs 99.9999% availability. Making you pay for auto scaled compute, load balancers, shared storage, HA databases, etc, etc.
All of this just adds so much extra complexity. If I'm running Amazon.com then sure, but your average app is just fine on a single VM.
And funnily recently many of the Big Serious Cloud Websites are shitting the bed of availability aggressively.
Marketing has such a gigantic influence in our field. It is absolutely insane. It feels unavoidable, since IT is (was?) constantly filled with new blood that picks up where people left off.
That thought crossed my mind recently as well. Not to mention the huge software stacks and the potential supply chain vulnerabilities that entails.
I have designed a backend with exactly the same underlying philosophy you ended up with: a load balancer? Oh, that's one more problem. Better to do client-side hashing and get rid of the discovery service via a couple of DNS tricks that are already handled robustly elsewhere.
I took it to its maximum: every service is a piece that can break ---> fewer pieces, fewer potential breakages.
When I can (which is 95% of the time), I put certain other services inside the processes themselves, inside my own server executables, and make them activatable at startup (though I want all my infra not to drift, so I use the same set of subservices in each).
But the idea is -- the fewer services, the fewer problems. I just think, even with the trade-offs, it is operationally much more manageable and robust in the end.
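A minimal sketch of what client-side hashing can look like, in case it helps. This is not necessarily the scheme described above: it uses rendezvous (highest-random-weight) hashing as one common choice, and the function name and host names are hypothetical. Every client scores each backend for a given key and picks the highest score, so all clients agree without a load balancer in the middle.

```python
import hashlib


def pick_backend(key: str, backends: list[str]) -> str:
    """Rendezvous hashing: score every backend for this key, pick the highest."""
    if not backends:
        raise ValueError("no backends available")

    def score(backend: str) -> int:
        # Hash backend and key together so each pair gets an independent score.
        digest = hashlib.sha256(f"{backend}|{key}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return max(backends, key=score)


if __name__ == "__main__":
    # Hypothetical host names, e.g. resolved from a DNS record.
    hosts = ["app-1.internal", "app-2.internal", "app-3.internal"]
    print(pick_backend("user-42", hosts))
```

A nice property of this approach is that removing a backend only moves the keys that were mapped to it; every other key keeps its backend.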
The problem is not Kubernetes but how it's treated. From its inception I've been seeing two anti-patterns: treating it as a platform (and being frustrated by Kubernetes not meeting expectations) and treating it as a product or part of a product (investing heavily in its customization and making it a dependency). Neither is practical unless you are building a platform and it is your product. Otherwise it should be viewed as an OS and treated as a commodity. You create a single big VM with MicroK8s per project (zero-ops vanilla Kubernetes) and take no dependency on how exactly Kubernetes is set up. This way you can run the same setup locally and in a data center. If ever needed, your app could be moved to any cloud as long as that cloud meets basic prerequisites (like the presence of persistent storage or a load balancer). The best part is that Kubernetes (unlike a traditional OS) is API driven, and your apps could be nicely packaged and managed using Terraform/OpenTofu or similar tooling.
At a previous job, our build pipeline
* Built the app (into a self contained .jar, it was a JVM shop)
* Put the app into an Ubuntu Docker image. This step was arguably unnecessary, but the same way Maven is used to isolate JVM dependencies ("it works on my machine"), the purpose of the Docker image was to isolate dependencies on the OS environment.
* Put the Docker image onto an AWS AMI that only had Docker on it, and the sole purpose of which was to run the Docker image.
* Combined the AMI with an appropriately sized EC2 instance.
* Spun up the EC2s and flipped the AWS ELBs to point to the new ones, blue green style.
The beauty of this was the stupidly simple process and complete isolation of all the apps. No cluster that ran multiple diverse CPU and memory requirement apps simultaneously. No K8s complexity. Still had all the horizontal scaling benefits etc.
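The final blue/green flip in a pipeline like the one above can be sketched as a tiny in-memory model. This is not the AWS API: `LoadBalancer`, `blue_green_deploy` and the instance names are hypothetical stand-ins, and the health check is whatever callable you pass in.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class LoadBalancer:
    """Stand-in for an ELB: it only tracks which fleet receives traffic."""
    live_fleet: list[str] = field(default_factory=list)


def blue_green_deploy(
    lb: LoadBalancer,
    new_fleet: list[str],
    is_healthy: Callable[[str], bool],
) -> list[str]:
    """Switch traffic to new_fleet only if every new instance is healthy.

    Returns the previous fleet so the caller can keep it around for rollback.
    """
    unhealthy = [instance for instance in new_fleet if not is_healthy(instance)]
    if not new_fleet or unhealthy:
        raise RuntimeError(f"refusing to switch, unhealthy instances: {unhealthy}")
    old_fleet = lb.live_fleet
    lb.live_fleet = list(new_fleet)  # the single "flip" step
    return old_fleet


if __name__ == "__main__":
    lb = LoadBalancer(["blue-1", "blue-2"])
    previous = blue_green_deploy(lb, ["green-1", "green-2"], lambda instance: True)
    print("live:", lb.live_fleet, "kept for rollback:", previous)
```

The old fleet is untouched until the new one passes its checks, which is what makes rollback a matter of flipping back.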
Well, you used a tank to plow a field then complained about maintenance and fuel usage.
If you have an actual need to deploy a few dozen services all talking with each other, k8s isn't a bad way to do it. It has its problems, but it allows your devs to mostly self-service their infrastructure needs vs. having to process a ticket for each VM and firewall rule they need. I'm saying that from the perspective of migrating from the "old way" to a 14-node k8s cluster on actual hardware.
It does make debugging harder, as you pretty much need a central logging solution, but at that scale you want a central logging solution anyway, so it isn't a big jump, and developers like it.
The main problem with k8s is frankly nothing technical, just the "ooh shiny" problem developers have, where they see tech and want to use it regardless of anything else.
I started using GKE at a seed stage company in 2017. It's still going fine today. I had zero ops experience and I found it rather intuitive. We brought in istio for mtls and outbound traffic policies and that worked pretty well too. I can only remember one fairly stressful outage caused by the control plane but it ended up remedying itself. I would certainly only use a managed k8s.
So I guess I'm a fan. I use a monolith for most of my stuff if I have the choice, but if I'm working somewhere or on something where I have to manage a bunch of services I'm most certainly going to reach for k8s.
Not advocating for complexity or k8s, but if your workflow can be served by a single VM, then you are orders of magnitude away from the volume and complexity that would push you to set up k8s, and there is not even a debate to be had.
There are situations where a single VM, no matter how powerful it is, can't do the job.
>> Then before you know it, the devops folks have decided that they need to put a gazillion other services and an entire software-defined networking layer on top of it.
I don't work that closely with k8s, but have toyed with a cluster in my homelab, etc. Way back before it really got going, I observed some OpenStack folks make the jump to k8s.
Knowing what I knew about OpenStack, that gave me an inkling that what you describe would happen and we'd end up in this place where a reasonable thing exists but it has all of this crud layered on top. There are places where k8s makes sense and works well, but the people surrounding any project are the most important factor in the end result.
Today we have an industry around k8s. It keeps a lot of people busy and employed. These same folks will repeat k8s the next time, so the best thing people who feel they have superior taste can do is to press forward with their own ideas, as the behavior won't change.
> I ended up saying goodbye to those devops folks,
The irony is that "DevOps" was supposed to be a culture and a set of practices, not a job title. The tools that came with it (=Kubernetes) turned out to be so complex that most developers didn't want to deal with them, and DevOps became the very siloed role that the movement was trying to eliminate.
That's why I have an ick when someone uses devops as a job title. Just say "System Admin" or "Infrastructure Engineer". Admit that you failed to eliminate the silos.
Yep, "Cloud Infrastructure Engineer" is what I prefer.
I am primarily a backend developer but I do a lot of ops / infra work because nobody else wants to do it. I stay as far away from k8s as possible.
> Then before you know it, the devops folks have decided that they need to put a gazillion other services and an entire software-defined networking layer on top of it.
I'm not familiar with kubernetes, but doesn't it already do SDN out of the box?
> doesn't it already do SDN out of the box
Yes and no. Kubernetes defines a specification for network behavior (in the form of CNI), but it contains no actual implementation. You basically have to install a network plugin as the first setup step.
I always looked into k8s and then realized it solves YouTube-scale problems, which I don’t have.
> nuking the cluster, booted up a single VM with debian, enabled the firewall and used Kamal to deploy the app with docker.
Absolutely brilliant. Love it.
And if you need a cluster, Hashicorp Nomad seems like a more reasonable option than full blown kubernetes. I've never actually used it in prod, only a lab, but I enjoyed it.
We run nomad at work. I’m very happy with it from an administrative standpoint.
That is good but at bigger orgs with massive workloads and the teams to build it out k8s makes sense. It is a standard and brilliant tech.
We've reduced our costs on Hetzner to about 10% of what we paid on Heroku, for 10x performance. Kamal really kicks ass, and you can have a pretty complicated infrastructure up in no time. We're using terraform, ansible + kamal for deploys, no issues whatsoever.
Can you elaborate a bit on what terraform and ansible are doing for you in your setup?
We've configured our Hetzner servers with terraform, so we can easily spin up a new one in case we notice that we need another slave to handle extra work (1-2 mins). Ansible is responsible for configuring the server and installing all the required packages and software (not all our infrastructure is deployed with Kamal; for instance we have clickhouse instances, DBs, redis etc. and normal app slaves). TL;DR: it helps us have a new instance up and running in minutes, or recreate our infrastructure for a new client environment.
So... if you're at the point where you're using a single VM, I have to ask why bother with docker at all? You're paying a context switch overhead, memory overhead, and disk overhead that you do not need to. Just make an image of the VM in case you need to drop it behind an LB.
There's one extra process that takes up a tiny bit of CPU and memory. For that, you get an immutable host, simple configuration, a minimal SBOM, a distributable set of your dependencies, x-platform for dev, etc.
Yes but NixOS does all of these things already, without the process overhead
How is docker a context switch overhead? It's the same processes running on the same kernel.
You're adding all of the other supporting processes within the container that needn't be replicated.
If you've ever had the displeasure of seeing the sorry state of VM tooling you would have known that building custom VM images is a very complicated endeavour compared to podman build or docker build.
I once tried to build a simple setup using VM images and the complexity exploded to the point where I'm not sure why anyone should bother.
When building a container you can just throw everything into it and keep the mess isolated from other containers. If you use a VM, you can't use the OCI format, you need to build custom packages for the OS in question. The easiest way to build a custom package is to use docker. After that you need to build the VM images which requires a convoluted QEMU and libvirt setup and a distro specific script and a way to integrate your custom packages. Then after all of this is done you still need to test it, which means you need to have a VM and you need to make it set itself up upon booting, meaning you need to learn how to use cloud-init.
Just because something is "mature" doesn't mean it is usable.
The overhead of docker is basically insignificant and imperceptible (especially if you use host networking) compared to the day-to-day annoyances you've invited into your life by using VM images. Starting a VM for testing purposes is much slower than starting a container.
This comment chain is probably talking about something like AWS images (AMIs), where it's just an API call and it snapshots the VM for you. Or use Packer.
I'm very happy with my k8s setup for my small startup. I believe it would have been much harder for me to get it off the ground, manage it etc. without it.
First time I’ve heard of Kamal. Looks ideal!
Do you pair it with some orchestration (to spin up the necessary VM)?
Yes, I've had similar experiences. My life has been much easier since I migrated to ECS Fargate - the service just works great. No more 2AM calls (at least not because of infra incidents), no more cost concerns from my boss.
DevOps lost the plot with the Operator model. When it was being widely introduced as THE pattern I was dismayed. These operators abstract entire complex services like databases behind YAML and custom Go services. At KubeCon, one guy told me he collects operators like candy. Questions about lifecycle management and the inevitable large architectural changes in an ever-changing operator landscape were handwaved away with a series of staging and development clusters. This adds so much cost. Fundamentally the issue is that the abstractions are too much and sit entirely on the DevOps side of the "shared responsibility model". Taking an RDBMS from AWS or Azure is so vastly superior to taking on all that responsibility yourself in the cluster. Meanwhile (being a bit of an infrastructure snob) I run NixOS with systemd OCI containers at home. With AI this is the easiest setup to maintain ever.
Those managed databases from the big cloud providers have even more machinery and operator patterns behind them to keep them up and running. The fact that it's hidden away is what you like. So the comparison makes no sense.
I think this comment and replies capture the problem with Kubernetes. Nobody gets fired for choosing Kubernetes now.
It's obvious to you, me and the other 2 presumably techie people who've responded within 15 mins that you shouldn't have been using Kubernetes. But you probably work in a company full of techie people, who ended up using Kubernetes.
We have HN, an environment full of techie people here who immediately recognise not to use k8s in 99% of cases, yet in actually paid professional environments, in 99% of cases, the same techie people will tolerate, support and converge on the idea they should use k8s.
I feel like there's an element of the emperor's new clothes here.
What scale is this story operating at? My experience managing a fleet of services is that my job would take 10x as long without k8s. It's hard, not bad.
Kubernetes is not bad, it's just low level. Most applications share the exact same needs (proof: you could run any web app on a simple platform like Heroku). That's why some years ago I built an open source tool (with 0 dependencies) that simplifies Kubernetes deployments with a compact syntax which works well for 99% of web apps (instead of allowing any configuration, it makes many "opinionated" choices): https://github.com/cuber-cloud/cuber-gem I have been using it for all the company web apps and web services for years and everything works nicely. It can also auto scale easily and that allows us to manage huge spikes of traffic for web push (Pushpad) at a reasonable price (good luck if you used a VM - no scaling - or if you used a PaaS - very high costs).
It's not just low level, in most cases, it's also overkill.
Most companies aren't "web scale" ™ and don't need an orchestrator built for google level elasticity, they need a vm autoscaling group if anything.
Most apps don't need such granular control over fs access, network policies, root access, etc, they need `ufw allow 80 && ufw enable`
Most apps don't need a 15 stage, docker layer caching optimized, archive promotion build pipeline that takes 30 minutes to get a copy change shipped to prod, they need a `git clone me@github.com:me/mine.git release_01 && ln -s release_01 /var/www/me/mine/current`
This is coming from someone who has had roles both as a backend product engineer and as a devops/platform engineer, who has been around long enough to remember "deploy" to prod was eclipse ftping php files straight to the prod server on file save. I manage clusters for a living for companies that went full k8s and never should have gone full k8s. ECS would have worked for 99% of these apps, if they even needed that.
Just like the js ecosystem went bat shit insane until things started to swing back towards sanity and people started to trim the needless bloat, the same is coming or due for the overcomplexity of devops/backend deployments
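For what it's worth, the release-directory-plus-symlink deploy mentioned above can be made atomic with a temporary link and a rename. A minimal sketch, assuming a POSIX filesystem; `activate_release` and the paths are hypothetical names, and the `git clone` step is left out.

```python
import os
from pathlib import Path


def activate_release(release_dir: Path, current_link: Path) -> None:
    """Atomically repoint current_link at release_dir (POSIX only)."""
    tmp_link = current_link.with_name(current_link.name + ".tmp")
    # Clear a leftover temp link from a previously interrupted deploy.
    if tmp_link.is_symlink() or tmp_link.exists():
        tmp_link.unlink()
    os.symlink(release_dir, tmp_link, target_is_directory=True)
    # rename(2) swaps the link in one step, so readers see either the old
    # release or the new one, never a missing "current".
    os.replace(tmp_link, current_link)


if __name__ == "__main__":
    # Hypothetical layout: releases live next to the "current" symlink.
    base = Path("/var/www/me/mine")
    activate_release(base / "release_01", base / "current")
```

Unlike a plain `ln -s`, this also works when `current` already points at an older release, and rolling back is just calling it again with the previous directory.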
If this works `git clone me@github.com:me/mine.git release_01 && ln -s release_01 /var/www/me/mine/current` then your Docker builds should also be extremely quick. Where I have seen extremely slow docker builds is with Python services using ML libraries. But those I really don't want to be building on the production servers.
"ECS would have worked for 99% of these apps, if they even needed that."
I used to agree with that but is EKS really that much more complicated? Yes you pay for the k8s control plane but you gain tooling that is imho much easier to work with than IaC.
If you replaced k8s with a single app on a single VM then you’ve taken a hype fuelled circuitous route to where you should have been anyway.
Not so surprised that the architecture approaches pushed by cloud vendors are... increasing cloud spend!
My first and really only experience with Kubernetes was a project I did about six years ago. I was tasked with building a thing that did some lightly distributed compute using Python + Dask. I was able to cobble together a functioning (internal) product, and we went to production.
Not long after, I found that the pods were CONSTANTLY getting into some weird state where K8s couldn't rebuild, so I had to forcibly delete the pods and rebuild. I blamed myself, not knowing much about K8s, but it also was extremely frustrating because, as I understood/understand it, the entire purpose of Kubernetes is to ensure a reliable deployment of some combination of pods. If it couldn't do that and instead I had to manually rebuild my cluster, then what was the point?
In the end, I ended up nuking the entire project -- K8s, Docker containers, Python, and Dask -- and instead went with a single Rust binary deployed to an Azure Function. The result was faster (by probably an order of magnitude), less memory, cheaper (maybe -80% cost), and much more reliable (I think around four nines).
Your use case is very small and simple. Of course a single VM works. The fact that you’re changing a literal A record at CF to deploy confirms this.
That is not what kube is designed for.
MicroVMs are going to make all of these redundant.
> and an entire software-defined networking layer on top of it.
This is one of the main fuckups of k8s, the networking is batshit.
The other problem is that secrets management is still an afterthought.
The thing that really winds me up is that it doesn't even scale up that much. 2k nodes and it starts to really fall apart.
This feels like the microservices versus monolith problem. You can use cloud services or not, and that's orthogonal to running your app in Kubernetes or in a VM.
Similarly, I suspect (based on your "hardening" grievance) that a lot of your tedium is just that cloud APIs generally push you toward least-privileges with IAM, which is tedious but more secure. And if you implement a comparably secure system on your single VM (isolating different processes and ensuring they each have minimal permissions, firewall rules, etc) then you will probably have strictly more incidents and debugging effort. But you could go the other way and make a god role for all of your services to share and you will spend much less time debugging or dealing with incidents.
Even with a single VM, you could throw k3s on it and get many of the benefits of Kubernetes (a single, unified, standardized, extensible control plane that lots of software already supports) rather than having to memorize dozens of different CLI utilities, their configuration file formats, their path preferences, their logging locations, etc. And as a nice bonus, you have a pretty easy path toward high availability if you decide you ever want your software to run when Google decides to upgrade the underlying hardware.
There exists a sweet spot between docker swarm and docker, not quite portainer, but a bit more.
The tools in this space can really help make a few containers in dev/staging/production much more manageable.
> It always starts off all good with just managing a couple of containers to run your web app. Then before you know it, the devops folks have decided that they need to put a gazillion other services and an entire software-defined networking layer on top of it.
As a devops/cloud engineer coming from a pure sysadmin background (you've got a cluster of n machines running RHEL and that's it) i feel this.
The issues i see however are of different nature:
1. résumé-driven development (you get a higher-paying job if you have the buzzwords in your cv)
2. a general lack of core-linux skills. people don't actually understand how linux and kubernetes work, so they can't build the things they need, so they install off-the-shelf products that do 1000 things including the single one they need.
3. marketing, trendy stuff and FOMO... that tell you that you absolutely can't live without product X or that you must absolutely be doing Y
to give you an example of 3: fluxcd/argocd. they're large and clunky, and we're getting pushed to adopt that for managing the services that we run inside the cluster (not developer workloads, but mostly-static stuff like the LGTM stack and a few more things - core services, basically). they're messy, they add another layer of complexity, other software to run and troubleshoot, more cognitive load.
i'm pushing back on that, and frankly for our needs i'm fairly sure we're better off using terraform to manage kubernetes stuff via the kubernetes and helm provider. i've done some tests and frankly it works beautifully.
it's also the same tool we use to manage infrastructure, so we get to reuse a lot of skills we already have.
also it's fairly easy to inspect... I'm doing some tests using https://pkg.go.dev/github.com/hashicorp/hcl/v2/hclparse and i'm building some internal tooling to do static analysis of our terraform code and automated refactoring.
i still think kubernetes is worth the hassle, though (i mostly run EKS, which by the way has been working very well for me)
And nowadays with Claude you can spin up clusters of vps machines in a few hours. All bare Debian without anything except nginx and the apps. Mass configuring without any tools using only Claude. Works perfectly. The costs saved without all the overhead is massive.