Comment by binarylogic
4 hours ago
Agree to an extent. There are absolutely unknown unknowns. But I think you'd be surprised how much data is obviously waste. Not the grey area, just pure garbage: health checks, debug logs left in production, redundant attributes.
That's why we break waste down into categories: https://docs.usetero.com/data-quality/categories/overview
But we don't stop there. You can go deeper with reasoning to root out the more nuanced waste. It's hard, but it's possible. That's where things get interesting.