Comment by otterley

6 hours ago

> Finally, regarding the $300k/year cost (which many here seem to be horrified by) - it seems I wasn't clear enough in the blog. 200 pods was not the entire fleet, and it was not statically set. It was a single cluster at peak time. We have multiple clusters, each with their own traffic patterns and auto-scaling configurations. The total cost was $25k/month when summed as a whole.

So, then, what do you estimate the actual savings of the transition to be, taking into account only the component in question and its actual resource needs (i.e., not simply projecting from a linear multiple of peak utilization)?

I'm going to be a little harsh here, and please forgive me: intellectual dishonesty, especially when the hard numbers are easily determinable, is something I've denied engineers' promotions for. It's genuinely impressive that you've saved the company money, but $500k/year based on peak projections is a very different number from, say, $100k/year in actual resources saved over the full course of a year.

nirb89 6 hours ago

200 pods was the peak allocation on a specific cluster, not the total sustained cost for all of prod. The savings figure comes from quite literally comparing last month's cloud bill with the new one, after all optimizations were applied and resources were aligned.
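
For anyone following the arithmetic, here is a minimal sketch of the two estimation methods being argued about: projecting a year from peak pod count versus annualizing actual bills. Only the 200-pod peak and the $25k/month total come from the thread. The function names and the per-pod hourly rate are hypothetical, chosen purely for illustration.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def annualized_from_peak(peak_pods, cost_per_pod_hour):
    """Projection that assumes the peak pod count runs all year.

    This overstates cost whenever autoscaling drops below peak.
    """
    return peak_pods * cost_per_pod_hour * HOURS_PER_YEAR


def annualized_from_bills(monthly_bills):
    """Scale the average of the observed monthly bills to 12 months."""
    return sum(monthly_bills) / len(monthly_bills) * 12


def annual_savings(bills_before, bills_after):
    """Savings estimated from real bills, before vs. after the change."""
    return annualized_from_bills(bills_before) - annualized_from_bills(bills_after)


# From the thread: $25k/month summed across clusters is $300k/year.
print(annualized_from_bills([25_000]))

# Hypothetical rate of $0.10 per pod-hour, NOT a figure from the thread.
print(annualized_from_peak(200, 0.10))
```

The bill-based estimate is only as good as the months fed into it, so a single month before and after can still be skewed by seasonal traffic.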