Comment by baq
7 hours ago
I disagree with just about everything you said being a problem except the process of cleaning up is absolutely required.
Notably feature flags triggering incidents is expected and desired vs the alternative of shipping the code and having to roll a release back because there is no other way to remove the feature from prod.
In a company the size of Shopify people flipping their feature flags would very often impact *other teams*, and like I said feature flags got abused with even seemingly innocuous changes being put behind them or being left long periods of time before being fully used.
When someone else flips a flag that impacts your team and they have no idea they even caused a problem, it becomes very difficult to resolve the issue. Usually you can check for recent deploys, instead you have to go and guess at which feature flag which was recently flipped could possibly be affecting your code. I experienced this several times.
Also, it was actually more desirable for most of these things to go straight to production. Test it properly before shipping, then when you ship it soaks on a 5% traffic canary at which point you can monitor and cancel the deploy if you see errors. That is generally safer than a feature flag rollout unless you are doing something very high impact/risk, in large part because it gives any other team affected by your rollout the ability to respond and be able to easily find the source of errors.
In my org it was a fairly common failure mode to ship something and accidentally cause an issue for another team. Usually it was other teams/orgs shipping things that impacted us.