← Back to context

Comment by trane_project

8 months ago

I interviewed there once and they asked me what I would do if a service broke after a deployment. I said the first step was to revert to the last known good version and then investigate. Color me surprised when that was not the answer they expected.

Cloudflare's internal release tool suggests revert when monitoring detects failures during deployment, so this question doesn't describe Cloudflare's practices. There must have been something more to it, or it was a misunderstanding.

That's strange. What was the "correct" answer?

  • If I ever interview at Cloudflare and get this question I might answer with "call the sales team and have them fix it by selling someone an enterprise subscription paid upfront by the decade" just to see if the interviewers read Hacker News :P

  • They wanted me to roll out a fix first. Apparently rolling back first was not moving fast enough for the interviewer’s liking.

    • Depending on the service’s criticality, the cost of rolling back versus pushing a fix, service dependencies in their environment… having to push a fix might have been the better approach.

      Without more details about the environment, it is a 50/50 call.