
Comment by kevmo314

8 hours ago

The wildest part is they’ll take those massive machines, shard them into tiny Kubernetes pods, and then engineer something that “scales horizontally” with the number of pods.

Yeah man, you're running on a multitasking OS. Just let the scheduler do the thing.

  • Yeah this. As I’ve explained to people many times, processes are the only virtualisation you need if you aren’t running a fucked-up pile of shit.

    The problem we have is fucked-up piles of shit, not that we lack Kubernetes and containers.

    • Containers are just processes plus some namespacing; nothing really stops you from running very large tasks on Kubernetes nodes. I think the argument for containers and Kubernetes is pretty good owing to their operational advantages (OCI images for distributing software, distributed cron jobs in Kubernetes, observability tools like Falco, and so forth).

      So I totally understand why people preemptively choose Kubernetes before they’re scaling to the point where a distributed scheduler is strictly necessary. With Hadoop, on the other hand, you’re definitely paying a large upfront cost for scalability you may well not need.
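      The distributed-cron point in particular is cheap to demonstrate. A minimal sketch of a Kubernetes CronJob (the job name, image, and schedule below are made up for illustration):

      ```yaml
      apiVersion: batch/v1
      kind: CronJob
      metadata:
        name: nightly-report          # hypothetical job name
      spec:
        schedule: "0 2 * * *"         # standard cron syntax: every day at 02:00
        jobTemplate:
          spec:
            template:
              spec:
                containers:
                  - name: report
                    image: registry.example.com/report:latest  # any OCI image
                restartPolicy: OnFailure
      ```

      You get retries, history, and placement across nodes without writing any of that yourself.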


    • Hahhah, yuuuup.

      I can maybe make a case for running in containers if you need some specific security properties, but mostly I think the proliferation of 'fucked up piles of shit' is the problem.

    • Disagree.

      Different processes can need different environments.

      I advocate for something lightweight like FreeBSD jails.
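      For reference, a jail with its own filesystem, address, and init is only a few lines of jail.conf (paths and addresses below are illustrative, not from any real setup):

      ```
      # /etc/jail.conf -- hypothetical example
      www {
          path = "/usr/local/jails/www";     # jail's root filesystem
          ip4.addr = "192.0.2.10";           # jail's own IPv4 address
          host.hostname = "www.example.org";
          exec.start = "/bin/sh /etc/rc";    # boot the jail like a small system
          exec.stop = "/bin/sh /etc/rc.shutdown";
          mount.devfs;                       # give the jail its own /dev
      }
      ```

      Each jail gets a different environment without the weight of a full container stack.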

  • It's all fun and games until the control plane gets killed by the OOM killer.

    Naturally, that detaches all your containers, and there's no seamless reattach when the control plane restarts.
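    One partial mitigation, sketched here assuming a systemd-managed runtime such as containerd, is to make the OOM killer strongly prefer other victims:

    ```
    # /etc/systemd/system/containerd.service.d/oom.conf  (hypothetical drop-in)
    [Service]
    OOMScoreAdjust=-999   # near the minimum; OOM killer picks this process last
    ```

    It doesn't fix the reattach problem, but it makes the runtime one of the last things to die under memory pressure.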

    • Or your CNI implementation is made of rolled up turds and you lose a node or two from the cluster control plane every day.

      (Large EKS cluster)

  • Until you need to schedule GPUs or other heterogeneous compute...

    • Are you saying that running your application in a pile of containers somehow helps with that problem? It's the same problem as CPU scheduling; we just don't have good schedulers yet. Lots of people are working on it, though.
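      For what it's worth, Kubernetes does give GPUs a scheduling handle via device plugins. A pod just declares the resource (this sketch assumes the NVIDIA device plugin is installed on the cluster; the pod name is made up):

      ```yaml
      apiVersion: v1
      kind: Pod
      metadata:
        name: cuda-job                 # hypothetical name
      spec:
        containers:
          - name: trainer
            image: nvcr.io/nvidia/cuda:12.4.0-base-ubuntu22.04
            resources:
              limits:
                nvidia.com/gpu: 1      # scheduler places the pod on a node with a free GPU
      ```

      Whether the placement decisions are any *good* is the open scheduling problem, but the plumbing exists.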

To be fair, each of those pods can have dedicated, separate external storage volumes, which may actually help, and it's definitely easier than maintaining 200 or more iSCSI (or whatever) targets yourself.
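Concretely, each pod gets its own volume by making a claim; the storage class turns the iSCSI/EBS plumbing into someone else's problem. A minimal sketch (the claim name, class, and size are made up):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pod-0                # hypothetical per-pod claim
spec:
  accessModes: ["ReadWriteOnce"]  # one node mounts it at a time
  storageClassName: gp3           # e.g. an EBS-backed class on EKS; assumption
  resources:
    requests:
      storage: 100Gi
```

The provisioner creates and attaches the backing volume when the pod is scheduled; you never touch a target by hand.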

I mean, a large part of the point is that you can run on separate physical machines, too.