Comment by weitendorf
4 hours ago
Having worked on Cloud Run/Cloud Functions, I think almost every company that isn't itself a cloud provider could be in category 1, with moderately more featureful implementations that actually competed with K8s.
Kubernetes is a huge problem. IMO it's a shitty prototype that industry ran away with (because Google tried to throw a wrench at Docker/AWS when containers and cloud were the hot new things, pretending Kubernetes was basically the same as Borg). Then the community calcified around that prototype state, bought all this SaaS, and structured their production environments around it, and now all these SaaS providers and Platform Engineers/DevOps people who make a living milking money out of Kubernetes users are guarding their gold mines.
Part of the K8s marketing push was rebranding Infrastructure Engineering = building atop Kubernetes (vs operating at the layers at and beneath it), and K8s leaks abstractions/exposes an enormous configuration surface area, so you just get K8s But More Configuration/Leaks. Also, You Need A Platform, so do Platform Engineering too, for your totally unique use case of connecting git to CI to slackbot/email/2FA to your release scripts.
At my new company we're working on fixing this but it'll probably be 1-2 more years until we can open source it (mostly because it's not generalized enough yet and I don't want to make the same mistake as Kubernetes. But we will open source it). The problem is mostly multitenancy, better primitives, modeling the whole user story in the platform itself, and getting rid of false dichotomies/bad abstractions regarding scaling and state (including the entire control plane). Also, more official tooling, and you have to put on a dunce cap if YAML gets within 2 network hops of any zone.
In your example, I think
1. you shouldn't have to think about scaling and provisioning at this level of granularity; it should always be at the multitenant zonal level. This is one of the cardinal sins of Kubernetes that Borg handled much better
2. YAML is indeed garbage, but availability reporting and alerting need better official support; it doesn't make sense for every ecommerce shop and bank to be building this stuff
3. a huge amount of alerts and configs could actually be expressed in business logic if cloud platforms exposed synchronous/real-time billing with the scaling speed of Cloud Run.
If you think about it, so so so many problems devops teams deal with are literally just
1. We need to be able to handle scaling events
2. We need to control costs
3. Sometimes these conflict and we struggle to translate between the two.
4. Nobody lets me set hard billing limits/enforcement at the platform level.
(I implemented enforcement for something close to this for Run/Appengine/Functions, it truly is a very difficult problem, but I do think it's possible. Real time usage->billing->balance debits was one of the first things we implemented on our platform).
5. For some reason scaling and provisioning are different things (partly because the cloud provider is slow, partly because Kubernetes is single-tenant)
6. Our ops team's job is to translate between business logic and resource logic, and half our alerts are basically asking a human to manually make some cost/scaling analysis or tradeoff, because we can't automate that, because the underlying resource model/platform makes it impossible (rough sketch of what I mean below).
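To make that concrete, here's a purely hypothetical sketch (none of these fields or numbers come from a real cloud API; it just illustrates what "alerts expressed as business logic" could look like if real-time usage->billing->balance data and scaling shared one control surface):

```python
# Hypothetical sketch only: no real cloud API exposes this today.
from dataclasses import dataclass


@dataclass
class PlatformSnapshot:
    requests_per_second: float   # current load on the service
    spend_rate_per_hour: float   # real-time billed spend, not day-old exports
    budget_left_today: float     # a hard limit the platform actually enforces


def desired_replicas(s: PlatformSnapshot,
                     rps_per_replica: float = 50.0,
                     max_hourly_spend: float = 40.0) -> int:
    """The cost/scaling tradeoff written as code instead of an alert paging a human."""
    wanted = max(1, round(s.requests_per_second / rps_per_replica))
    if s.budget_left_today <= 0:
        return 1  # hard floor: serve degraded rather than blow the budget
    if s.spend_rate_per_hour > max_hourly_spend:
        # Scale back proportionally: trade latency for spend automatically.
        wanted = max(1, int(wanted * max_hourly_spend / s.spend_rate_per_hour))
    return wanted
```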
You gotta go under the hood to fix this stuff.
Since you are developing in this domain: our challenge with both Lambda and Cloud Run type managed solutions is that they seem incompatible with our service mesh. Cloud Run and Lambdas can only be incorporated into GCP's service mesh if the mesh is managed through GCP as well; anything custom is out of the question. Since we require end-to-end mTLS in our setup, we cannot use Cloud Run.
To me this shows that Cloud Run is more of an end product than a building block, and that hinders adoption: we'd basically need to replicate most of Cloud Run ourselves just to add that tiny bit of also running our sidecar.
How do you see this going in your new solution?
Lots to unpack here.
I will just say, based on recent experience, that the fix is not "Kubernetes bad"; it's that Kubernetes is not a product platform. It's a substrate, and most orgs actually want a platform.
We recently ripped out a barebones Kubernetes product (like Rancher but not Rancher). It was hosting a lot of our software development apps like GitLab, Nexus, Keycloak, etc.
But in order to run those things, you have to build an entire platform and wire it all together. This is on premises running on vxRail.
We ended up discovering that our company had an internal software development platform based on EKS-A; it comes with auto-installers for all the apps and includes ArgoCD to maintain state and orchestrate new deployments.
The previous team did a shitty job DIY-ing the prior platform. So we switched to something more maintainable.
If someone made a product like that then I am sure a lot of people would buy it.
There are plenty of PaaS components that run on k8s if you want to use them. I'm not a fan, because I think giving developers direct access to k8s is the better pattern.
Managed k8s services like EKS have been super reliable the last few years.
YAML is fine; it's just a configuration language.
> you shouldn't have to think about scaling and provisioning at this level of granularity, it should always be at the multitenant zonal level, this is one of the cardinal sins Kubernetes made that Borg handled much better
I'm not sure what you mean here. Managed k8s services, and even k8s clusters you deploy yourself, can autoscale across AZs. This has been a feature for many years now. You just set a topology key on your pod template spec and your pods will spread across the AZs, easy.
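For anyone who hasn't done it, here's a minimal sketch using the official Python kubernetes client (the "web"/nginx names are placeholders; the equivalent YAML is just a topologySpreadConstraints block on the pod template spec):

```python
# Minimal sketch: spread a Deployment's pods across zones by setting
# topology_spread_constraints on the pod template spec, keyed on
# topology.kubernetes.io/zone. "web" and nginx are placeholder names.
from kubernetes import client, config

config.load_kube_config()

labels = {"app": "web"}

spread = client.V1TopologySpreadConstraint(
    max_skew=1,
    topology_key="topology.kubernetes.io/zone",
    when_unsatisfiable="ScheduleAnyway",
    label_selector=client.V1LabelSelector(match_labels=labels),
)

deployment = client.V1Deployment(
    api_version="apps/v1",
    kind="Deployment",
    metadata=client.V1ObjectMeta(name="web"),
    spec=client.V1DeploymentSpec(
        replicas=6,
        selector=client.V1LabelSelector(match_labels=labels),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels=labels),
            spec=client.V1PodSpec(
                topology_spread_constraints=[spread],
                containers=[client.V1Container(name="web", image="nginx")],
            ),
        ),
    ),
)

client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)
```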
For most tasks you'd want to do to deploy an application, there's an out-of-the-box solution for k8s that already exists. There have been millions of labor-hours poured into k8s as a platform; unless you have some extremely niche use case, you are wasting your time building an alternative.
Every time I’ve pushed for Cloud Run at jobs that were on or leaning towards k8s, I was looked at as a very unserious person. Like you can’t be a “real” engineer if you’re not battling YAML configs and ArgoCD all day (and all night).
It does have real tradeoffs/flaws/limitations, chief among them that Run isn't allowed to "become" Kubernetes; you're expected to "graduate". There's been an immense marketing push for Kubernetes and Platform Engineering and all the associated SaaS sending the same message (also, notice how much less praise you hear about it now that the marketing has died down?).
The incentives are just really messed up all around. Think about all the actual people working in devops who have their careers/jobs tied to Kubernetes, how many developers get drawn in by the allure and marketing because it lets them work on more fun problems than their actual job, all the provisioned instances and vendor software and certs and conferences, and all the money that represents.