Comment by jameshart

2 months ago

Genuine Kubernetes scaling strategy: add a do-nothing container that runs at a lower priority than your real workloads and requests half a machine’s worth of CPU (in millicores).
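For anyone who wants to try it, here's a rough sketch of what that placeholder could look like, written as a small Python script that emits JSON you could pipe into `kubectl apply -f -`. The names (`overprovisioning`, `capacity-placeholder`), the priority value of -1, the pause image tag, and the 4000m node size are all my assumptions for illustration, not anything prescribed above; pick values that match your own cluster.

```python
import json


def placeholder_manifests(node_millicores, priority=-1, replicas=1):
    """Build a PriorityClass plus a do-nothing Deployment requesting half a node's CPU.

    All names and defaults here are illustrative assumptions.
    """
    cpu_request = f"{node_millicores // 2}m"  # half a machine's worth of millicores

    priority_class = {
        "apiVersion": "scheduling.k8s.io/v1",
        "kind": "PriorityClass",
        "metadata": {"name": "overprovisioning"},  # hypothetical name
        # Pods without a priority class default to 0, so anything negative
        # ranks below ordinary workloads.
        "value": priority,
        "globalDefault": False,
        "description": "Placeholder pods that real workloads may preempt.",
    }

    labels = {"app": "capacity-placeholder"}  # hypothetical label
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "capacity-placeholder"},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "priorityClassName": "overprovisioning",
                    "containers": [
                        {
                            "name": "pause",
                            # The pause image just sleeps; the tag is an assumption.
                            "image": "registry.k8s.io/pause:3.9",
                            "resources": {"requests": {"cpu": cpu_request}},
                        }
                    ],
                },
            },
        },
    }

    # Wrap both objects in a v1 List so they can be applied in one go.
    return {"apiVersion": "v1", "kind": "List", "items": [priority_class, deployment]}


if __name__ == "__main__":
    # Assumes 4-vCPU nodes (4000 millicores).
    print(json.dumps(placeholder_manifests(4000), indent=2))
```

One thing to check if you use the cluster autoscaler: it has a priority cutoff below which pending pods are treated as expendable and won't trigger a scale-up, so the placeholder's priority needs to sit above that cutoff while staying below your real workloads.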

When you deploy a new container and all your nodes are fully allocated, that low-priority container gets evicted and your container is immediately scheduled in its place. Then k8s tries to find somewhere to put the half-machine container. If it fits somewhere, it gets scheduled. If not, it triggers your cluster autoscaler to add a new node where it can run, so the next container you deploy has some readily available capacity to drop onto.
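To make the sequence concrete, here's a toy model of it in Python. This is not how the real scheduler or autoscaler is implemented; `ToyCluster`, `Pod`, and `Node` are made-up names, CPU is the only resource, every pod is assumed to fit on an empty node, and "autoscaling" just means appending a node whenever something is left pending.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    name: str
    cpu: int           # requested millicores
    priority: int = 0  # higher wins; the placeholder uses a negative value


@dataclass
class Node:
    capacity: int
    pods: list = field(default_factory=list)

    def free(self):
        return self.capacity - sum(p.cpu for p in self.pods)


class ToyCluster:
    def __init__(self, node_capacity, nodes=1):
        self.node_capacity = node_capacity
        self.nodes = [Node(node_capacity) for _ in range(nodes)]

    def _fit(self, pod):
        """Place the pod on the first node with enough free CPU."""
        for node in self.nodes:
            if node.free() >= pod.cpu:
                node.pods.append(pod)
                return True
        return False

    def _preempt(self, pod):
        """Evict lower-priority pods from one node to make room; return the evicted pods."""
        for node in self.nodes:
            victims = sorted(
                (p for p in node.pods if p.priority < pod.priority),
                key=lambda p: p.priority,
            )
            freed, chosen = node.free(), []
            for victim in victims:
                if freed >= pod.cpu:
                    break
                chosen.append(victim)
                freed += victim.cpu
            if freed >= pod.cpu:
                for victim in chosen:
                    node.pods.remove(victim)
                node.pods.append(pod)
                return chosen
        return None

    def schedule(self, pod):
        if self._fit(pod):
            return
        evicted = self._preempt(pod)
        # If nothing could be preempted, the new pod itself is the pending one.
        pending = [pod] if evicted is None else evicted
        for p in pending:
            if not self._fit(p):
                # Stand-in for the autoscaler: add a node for the pending pod.
                self.nodes.append(Node(self.node_capacity))
                self.nodes[-1].pods.append(p)


if __name__ == "__main__":
    cluster = ToyCluster(node_capacity=4000)
    cluster.schedule(Pod("placeholder", 2000, priority=-1))
    cluster.schedule(Pod("app-a", 2000))
    cluster.schedule(Pod("app-b", 2000))  # node is full: placeholder gets evicted
    for i, node in enumerate(cluster.nodes):
        print(i, [p.name for p in node.pods])
```

In that run, `app-b` lands on the original node right away, and the evicted placeholder is what forces the second node to appear, which then has spare room for whatever you deploy next.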