← Back to context

Comment by hinkley

5 hours ago

> You run observability at your company. But really, you're the cost police. You wake up to a log line in a hot path, a metric tag that exploded cardinality. You chase down the engineer. They didn't do anything wrong, they're just disconnected from what any of this costs.

Somebody didn’t math right when calculating if moving off hostedgraphite and StatsD was going to save us money or boil us alive. We moved from an inordinate number of individual stats with interpolated names to much simpler names but with cardinality and then the cardinality police showed up and kept harping on me to fix it. We were the user and customer facing portion of a SaaS company and I told them to fuck off when we were 1/7 of the overall stats traffic. I’d already reduced the cardinality by 400x and we were months past the transition date and I just wanted to work on anything that wasn’t stats for a while. Like features for the other devs or for our customers.

Very frustrating process. I suspect there’s a Missing Paper out there on how to compress stat cardinality out there somewhere. I’ve done a bit of work in that area but my efforts are in the 20% range and we need an order of magnitude. My changes were more about reducing the storage for the tags and reduced string arithmetic a bit in the process.