Comment by jkaplowitz

4 months ago

Or at least you have to automatically destroy and recreate all nodes / VMs / similar every N days, so that nobody can pretend that any truly unavoidable hand edits made during emergencies will persist. Possibly also gate the ability to make hand edits behind a break-glass feature that notifies executives or schedules a postmortem meeting about why it was necessary.
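
The "recreate every N days" rule can be sketched roughly as below. This is only an illustration under stated assumptions: `Node`, `nodes_due_for_recycle`, and `max_age_days` are hypothetical names, and the actual destroy/recreate step would be whatever your provisioning tooling does.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Node:
    # Hypothetical inventory record; a real one would come from your provider's API.
    name: str
    created_at: datetime


def nodes_due_for_recycle(nodes, max_age_days, now=None):
    """Return the nodes that are at least max_age_days old."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [node for node in nodes if node.created_at <= cutoff]


if __name__ == "__main__":
    now = datetime(2024, 1, 31, tzinfo=timezone.utc)
    fleet = [
        Node("web-1", datetime(2024, 1, 1, tzinfo=timezone.utc)),
        Node("web-2", datetime(2024, 1, 29, tzinfo=timezone.utc)),
    ]
    for node in nodes_due_for_recycle(fleet, max_age_days=7, now=now):
        # Placeholder for the real destroy-and-recreate call.
        print(f"recycle {node.name}")
```

Run on a schedule, something like this means any hand edit has a known maximum lifetime of N days.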

4 comments

vidarh 4 months ago

I know of at least one organisation that'd automatically wipe every instance on (ssh-)user logout, so you could log in to debug, but nothing you did would persist at all. I quite like that idea, though sometimes being able to e.g. delay the wipe for up to X hours might be slightly easier to deal with for genuinely critical emergency fixes.

But, yes, gating it behind notifications would also be great.
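
For what it's worth, the "wipe on logout, but allow a bounded delay" idea might look something like this. It is a sketch, not how that organisation did it: `wipe_deadline` and the 4-hour `MAX_DELAY` are made-up stand-ins for the "X hours" above.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical cap on how long a wipe may be postponed (the "X hours").
MAX_DELAY = timedelta(hours=4)


def wipe_deadline(logout_at, requested_delay=timedelta(0), max_delay=MAX_DELAY):
    """Return when the instance should be wiped after an SSH logout.

    With no requested delay the wipe is due immediately at logout;
    any requested delay is clamped to max_delay.
    """
    if requested_delay < timedelta(0):
        raise ValueError("requested_delay cannot be negative")
    return logout_at + min(requested_delay, max_delay)


if __name__ == "__main__":
    logout = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    print(wipe_deadline(logout))
    print(wipe_deadline(logout, requested_delay=timedelta(hours=10)))
```

The point of the clamp is that an emergency fix can buy some time, but never an open-ended amount of it.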

9dev 4 months ago

That sounds like the kind of thing that’s amazing, until it isn’t and you know exactly why your day just got a lot worse.

sgarland 4 months ago

Was this Mozilla?

rustystump 4 months ago

Oh no, it ran out of disk space because of a bug! I will run a command on that instance to free it rather than fix the bug. Oh no, the error now happens half of the time; better debug for hours only to find out someone only fixed a single instance…

I will never understand the argument for cloud other than bragging rights about burning money and then saving money that never should have been burning to begin with.