Accidentally destroyed production database on first day of a job

8 years ago (np.reddit.com)

Sorry, but if a junior dev can blow away your prod database by running a script on his _local_ dev environment while following your documentation, you have no one to blame but yourself. Why is your prod database even reachable from his local env? What does the rest of your security look like? Swiss cheese I bet.

The CTO further demonstrates his ineptitude by firing the junior dev. Apparently he never heard the famous IBM story, and will surely live to repeat his mistakes:

After an employee made a mistake that cost the company $10 million, he walked into the office of Tom Watson, the C.E.O., expecting to get fired. “Fire you?” Mr. Watson asked. “I just spent $10 million educating you.”

  • Seriously. The CTO in question is the incompetent one. S/he failed:

    - Access control 101. Seriously, this is pure incompetence. It is the equivalent of having the power cord to the Big Important Money Making Machine snaking across the office and under desks. If you can't be arsed to ensure that even basic measures are taken to avoid accidents, acting surprised when they happen is even more stupid.

    - Sensible onboarding documentation. Why would prod access information be stuck in the "read this first" doc?

    - Management 101. You hired a green dev fresh out of college who has no idea how things are supposed to work, then fired him in an incredibly nasty way for making an entirely predictable mistake that came about because of your own lack of diligence (see above).

    Also, I have no idea what your culture looks like, but you just told all your reports that honest mistakes can be fatal and their manager's judgement resembles that of a petulant 14 year-old.

    - Corporate Communications 101. Hindsight and all that, but it seems inevitable that this would lead to a social media trash fire. Congrats on embarrassing yourself and your company in an impressive way. On the bright side, this will last for about 15 minutes and then maybe three people will remember. Hopefully the folks at your next gig won't be among them.

    My takeaway is that anyone involved in this might want to start polishing their resumes. The poor kid and the CTO for obvious reasons, and the rest of the devs, because good lord, that company sounds doomed.

    • Yeah, when I read that, my first thought was that the CTO reacted that way because he was afraid of being fired himself. I wouldn't be at all surprised if he wrote that document or approved it himself.

      13 replies →

  • Here are some simple, practical tips you can use to prevent this and other Oh Shit Moments(tm):

    - Unless you have full time DBAs, do use a managed db like RDS, so you don't have to worry about whether you've set up the backups correctly. Saving a few bucks here is incredibly shortsighted; your database is probably the most valuable asset you have. RDS allows point-in-time restore of your DB instance to any second during your retention period, up to the last five minutes. That will make you sleep better at night.

    - Separate your prod and dev AWS accounts entirely. It doesn't cost you anything (in fact, you get 2x the AWS free tier benefit, score!), and it's also a big help in monitoring your cloud spend later on. Everyone, including the junior dev, should have full access to the dev environment. Fewer people should have prod access (everything devs may need for day-to-day work like logs should be streamed to some other accessible system, like Splunk or Loggly). Assuming a prod context should always require an additional step for those with access, and the separate AWS account provides that bit of friction.

    - The prod RDS security group should only allow traffic from whitelisted security groups that are also in the prod environment. For those really requiring a connection to the prod DB, it is therefore always a two-step process: local -> prod host -> prod db. But carefully consider why you are even doing this in the first place. If you find yourself doing it often, perhaps you need more internal tooling (like an admin interface, again behind a whitelisting SG).

    - Use a discovery service for the prod resources. One of the simplest methods is just to set up a Route 53 Private Hosted Zone in the prod account, which takes about a minute. Create an alias entry like "db.prod.private" pointing to the RDS instance and use that in all configurations. Except for the Route 53 record, the actual address for your DB should not appear anywhere. Even if everything else goes sideways and you've assumed a prod context locally by mistake, a tool pointed at the prod config will fail because the address doesn't resolve in a local context.
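
    A minimal boto3 sketch of that record, for the curious (the zone ID and RDS endpoint below are placeholders, and a CNAME is used since plain RDS endpoints can't be Route 53 alias targets):

        # Sketch: create db.prod.private in the prod account's Private Hosted Zone (placeholders throughout).
        import boto3

        route53 = boto3.client("route53")  # assumes credentials for the prod account
        route53.change_resource_record_sets(
            HostedZoneId="Z0000000000EXAMPLE",  # placeholder: the private zone attached to the prod VPC
            ChangeBatch={
                "Changes": [{
                    "Action": "UPSERT",
                    "ResourceRecordSet": {
                        "Name": "db.prod.private",
                        "Type": "CNAME",
                        "TTL": 300,
                        # placeholder endpoint; the Route 53 record is the only place the real address lives
                        "ResourceRecords": [{"Value": "mydb.xxxxxxxx.us-east-1.rds.amazonaws.com"}],
                    },
                }]
            },
        )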

    • You made a lot of insightful points here, but I'd like to chime in on one important point:

      > - Unless you have full time DBAs, do use a managed db like RDS, so you don't have to worry about whether you've set up the backups correctly.

      The real way to not worry about whether you've set up backups correctly is to set up the backups, then actually exercise, and document, the recovery procedure. Over the last 30 years I've seen countless nasty surprises when people try to restore their backups during an emergency. Hopefully checking the "yes, back this up" checkbox on RDS covers you, but actually following the recovery procedure and checking the results is the only way to not have some lingering worry.

      In this particular example, there might be lingering surprises: part of the data might live in other databases or in storage like S3 whose backups aren't in sync with the primary backup, or there may be caches and queues that need to be reset as part of the recovery procedure.
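
      For the RDS case specifically, the drill itself can be scripted; a rough sketch under those assumptions (instance identifiers are placeholders, and the real value is in the verification queries you run against the restored copy):

          # Sketch: periodic restore drill against RDS (all identifiers are placeholders).
          import boto3

          rds = boto3.client("rds")

          # Restore the latest restorable point into a throwaway instance...
          rds.restore_db_instance_to_point_in_time(
              SourceDBInstanceIdentifier="prod-db",
              TargetDBInstanceIdentifier="restore-drill",
              UseLatestRestorableTime=True,
          )
          rds.get_waiter("db_instance_available").wait(DBInstanceIdentifier="restore-drill")

          # ...then the part people skip: connect to the restored copy, run sanity queries
          # (row counts, freshest timestamps, the S3/cache cross-checks mentioned above),
          # and only delete the throwaway instance once someone has looked at the results.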

      2 replies →

    • And put a firewall between your dev machines and your production database. All production database tasks need to be done by someone who has permission to cross into the production side -- a dev machine shouldn't be allowed to talk to it.

      1 reply →

  • I agree, it's the fault of the CTO. To me, the CTO sounds pretty incompetent. The junior engineer did them a favor. This company seems like an amateur hour operation, since data was deleted so easily by a junior engineer.

    • Yup, I've heard stories of junior engineers causing millions of dollars worth of outages. In those cases the process was drilled into, the control that caused it was fixed, and the engineer was not given a reprimand.

      If you have an engineer who goes through that and shows real remorse, you're going to have someone who's never going to make that mistake (or similar ones) again.

      21 replies →

    • Yep. I had a junior working for me once a few years ago that made a rather unfortunate error in production which deleted all of several customers' data. I could tell he was on pins and needles when he brought it to me, so I let him off the hook right away and showed him the procedures to fix the issue. He said something about being thankful there was a way to fix the problem, and I just smiled and told him A) it would have been my fault if there hadn't been; and B) he wouldn't have had the access he did without safeguards in place. Then I told him a story about the time I managed to accidentally delete an entire database of quarantined email from a spam appliance I was working on several years earlier. Sadly, my CTO at the time did NOT prepare for that.

      I lost a whole weekend of sleep in recovering that one from logs, and that was when I learned some good tricks for ensuring recoverability....

    • Agreed. Also, why didn't they have a backup of some sort? The hard drive on the server could have failed and it would have been just as bad.

      Sounds like an incompetent set of people running the production server.

      10 replies →

  • "It's your first day, we don't understand security so here's the combination to the safe. Have fun!!"

  • If someone on their first day of work can do this much damage, what could a disgruntled veteran do? If Snowden has taught us anything, it's that internal threats are just as dangerous as external threats.

    This shop sounds like a raging tire fire of negligence.

  • He didn't follow the docs exactly. That doesn't matter, though: your first day should be bulletproof, and if it's not, it's on the CTO. The buck does not stop with junior engineers on their first day.

  • Thanks for the Tom Watson quote, I'd never heard it before; it's a good one. Also agree with everything else you said, this is not the junior dev's fault at all.

  • He might be inept, but in this instance the CTO is mainly just covering his own ass.

    • "Yeah the whole site is buggered, and the backups aren't working - but I fired the Junior developer who did it" Is not how you Cover Your Ass ™.

      2 replies →

I was on a production DB once, ran SHOW FULL PROCESSLIST, and saw "delete from events" had been running for 4 seconds. I killed the query and set up that processlist command to run every 2 seconds. Sure enough, the delete kept reappearing shortly after I killed it. I wasn't on a laptop, but I knew the culprit was somewhere on my floor of the building, so I grabbed our HR woman who was walking by, told her to watch the query window, and showed her how to kill the process if she saw the delete again. Then I ran out and searched office to office until I found the culprit -

Our CTO thought he was on his local dev box, and was frustrated that "something" was keeping him from clearing out his testing DB.

Did I get a medal for that? No. Nobody wanted to talk about it ever again.
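
For the curious, that kind of watchdog is only a few lines. A rough sketch, assuming MySQL and pymysql, with placeholder connection details (not the actual script I used):

    # Sketch: kill any "delete from events" seen in the processlist, every 2 seconds.
    import time
    import pymysql

    conn = pymysql.connect(host="prod-db.example.com", user="admin", password="...", autocommit=True)

    while True:
        with conn.cursor() as cur:
            cur.execute("SHOW FULL PROCESSLIST")
            for pid, user, host, db, command, secs, state, info in cur.fetchall():
                if info and info.strip().lower().startswith("delete from events"):
                    cur.execute("KILL %s", (pid,))
                    print(f"killed {pid} from {host}: {info}")
        time.sleep(2)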

  • Actually, the CTO should have mailed the dev team saying:

        Hi,

        Yesterday I thought I was on my local machine and cleared the database, while I was in fact on the production server.
        Luckily knodi123 caught it and killed the delete process. This is a reminder that *anybody* can make mistakes,
        so I want to set up some process to make sure this can't happen again. Meanwhile, I would like to thank knodi123.

        Best,

        CTO

  • Sometimes I get reminded about how awesome some of the tech we use is, in this case, transactions :)

  • Oh god, this is the worst... People make errors, you help them and they don't give you any credit. Hope you are working somewhere else now.

The comment I left there:

Lots of folks here are saying they should have fired the CTO or the DBA or the person who wrote the doc instead of the new dev. Let me offer a counterpoint. Not that it will happen here ;)

They should have run a post mortem. The idea behind it should be to understand the processes that led to a situation where this incident could happen. Gather stories, understand how things came to be.

With this information, folks can then address the issues. Maybe it shows that there is a serially incompetent individual who needs to be let go. Or maybe it shows a house of cards with each card placement making sense at the time and it is time for new, better processes and an audit of other systems.

The point is that this is a massive learning opportunity for all those involved. The dev should not have been fired. The CTO should not have lost his shit. The DB should have regularly tested backups. Permissions and access need to be updated. Docs should be updated to not contain sensitive information. The dev does need to contact the company to arrange surrender of the laptop. The dev should document everything just in case. The dev should have a beer with friends, relax for the weekend, and get back on the job hunt next week. Later, laugh and tell of the time you destroyed prod on your first day (and what you learned from it).

  • The firing order, ranked by theoretical impact on preventing future problems:

    1. The CTO. As the one in charge of the tech, he allowed the loss of critical data. If anyone should be fired, it's the CTO, and firing this guy apparently would have the greatest positive impact on the company, assuming they can hire a better one. Given how stupid this CTO is, that should be straightforward.

    2. The executives who hired the CTO. People hire people similar to themselves, and it seems the executive team is clueless about what kind of skills a CTO should have. These people will continue to fail the dev team by hiring incompetent people, or force them to work in a way that causes problems.

    3. The senior devs on the team. Obviously these people did not test what they wrote. If anyone had ever done a dry run of the onboarding doc, they would have prevented the catastrophe. It's a must-do in today's dev environment. The normal standard is to write automated tests for everything, though.

    This junior dev is the only one who should not be fired...

    • I'm amazed at how quickly everyone is trying to allocate blame, as if there must be someone on whom to heap it all. Commenters on both Reddit and HN are being high and mighty, offering wisdom that they would never have allowed this to take place, while eager to point fingers. I bet far more than half of these commenters have at one time or another worked for at least one company that had this kind of setup, and didn't immediately refuse to work on other tasks until the setup was patched. Hypocrites.

      The fact is, this kind of scenario is extremely common. Most companies I have worked for have the production database accessible from the office. It's a very obvious "no no", but it's typical to see this at small to medium sized companies. The focus is on rushing every new feature under the sun, and infrastructure is only looked at if something blows up.

      Nobody should have been fired. Not the developer, not the senior devs, not the sysadmins, and not the CTO. This should have been nothing more than a wake-up call to improve the infrastructure. That's it. The only blame here lies with the CTO - not for the event having taken place, but only because their immediate reaction was to throw the developer under the bus. A decent CTO would have simply said "oh shit guys, this can't happen again. please fix it". If the other executives can't understand that sometimes shit happens, and that a hammer doesn't need to be dropped on anyone, then they're not qualified to be running a business.

      1 reply →

  • You are right that this is an opportunity to learn, because it is a demonstration of incompetence at many levels. However, this incompetence has consequences that might be fatal for the company. How much time and effort will be required to level up? As a CEO I would request an independent audit of this incident ASAP to see the real extent of the problem.

    As my mother said, if you put a good apple with bad apples, it's not the bad apples that become good.

  • They are in no condition, yet, to run a post mortem. At this moment they're probably still trying to figure out how to get their data back or maybe just close up shop entirely.

    You run a post mortem when you're back and running again. They may never be back and running again.

>"i instead for whatever reason used the values the document had."

>They put full access plaintext credentials for their production database in their tutorial documentation

WHAT THE HELL. Wow. I'd be shocked at that sort of thing being written out in a non-secure setting, like, anywhere, at all, never mind in freaking documentation. Making sure examples in documentation are never real and will hard fail if anyone tries to use them directly is not some new idea; heck, there's an entire IETF RFC (#2606) devoted to reserving TLDs specifically for testing and example usage. Just mind blowing, and yeah there are plenty of WTFs there that have already been commented on in terms of backups, general authentication, etc. But even above all that, if those credentials had full access then "merely" having their entire db deleted might even have been the good-case scenario vs having the entire thing stolen, which seems quite likely if their auth is nothing more than a name/pass and they're letting credentials float around like that.

It's a real bummer this guy had such an utterly awful first day on a first job, particularly since, from the sound of it, he made a huge move and sank quite a bit of personal capital into taking that job. At the same time, that sounds like a pretty shocking place to work, and it might have taught a ton of bad habits. I don't think it's salvageable, but I'm not even sure he should try; they likely had every right to fire him, but threatening him at all with "legal" for that is very unprofessional and dickish. I hope he'll be able to bounce back and actually end up in a much better position a decade down the line, having some unusually strong caution and extra care baked into him at a very junior level.

  • There's also a high chance that document was shared on Slack. In which case, they were one Slack breach away from the entire world having write access to their prod database.

    It's depressing how many companies blindly throw unencrypted credentials around like this.

    • Tell me about it. Fortunately where I work is sane and reasonable about it.

      We have a password sheet. You have to be on the VPN (login/password). Then you can log in: login/password (different from the above), plus a second password + OTP. Then a password-sheet password.

      I'm still rooting out passwords from our repo, with goobers putting creds in source code (yeah, not even config files....grrrrr). But I attack them as I find them. I've only found one root password for a DB in there... and thankfully it's been changed!

      3 replies →

    • Slack getting hacked would definitely be a mess. There are going to be so many cloud credentials, passwords, keys, customer info...

    • The exact same Slack that he remained in for several hours after being fired. An even worse move, practically inviting a response from a disgruntled employee...

Plot twist: the CTO or senior staff needed to cover something up (maybe a previous loss of critical business data) and arranged for this travesty to happen, provided enough junior devs were run through that mockery of a "local db setup guide".

Either that, or this is a "worst fuckup on the first day of the job" fantasy piece - I refuse to acknowledge living in a world where the alternatives have any meaningful non-zero probability of occurring.

  • There are no upper bounds on incompetence. I've seen enough WTFs even in companies that didn't seem particularly dysfunctional, and that had some very competent people.

    And then it takes only one shitty manager, or manager in a bad mood, to get the innocent junior dev fired.

People will screw up, so you have to do simple things to make screwing up hard. The production credentials should never have been in the document. Letting a junior have prod-level access is not that far out of the norm in a small startup environment, but don't make prod credentials part of the setup guide. Sounds like they also have backup issues, which points to overall poor devops knowledge.

Not part of this story, but another pet peeve of mine is when I see scripts checking strings like "if env = 'test' else <runs against prod>". This sets up another WTF situation: if someone typos 'test', the script hits prod.
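
A cheap fix is to make the dangerous branch opt-in instead of the fallback; a hypothetical sketch (the env var names are made up):

    # Sketch: fail closed on unknown environments instead of falling through to prod.
    import os
    import sys

    KNOWN_ENVS = {"test", "staging", "prod"}

    env = os.environ.get("APP_ENV", "")
    if env not in KNOWN_ENVS:
        sys.exit(f"Unknown APP_ENV {env!r}; refusing to run.")  # a typo now stops the script
    if env == "prod" and os.environ.get("I_REALLY_MEAN_PROD") != "yes":
        sys.exit("Refusing to touch prod without I_REALLY_MEAN_PROD=yes.")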

  • Heh, or take a Netflix Chaos Monkey approach and have a new employee attempt to take down the whole system on their first day and fire any engineers who built whatever the new employee is actually able to break!

    • Why fire them? It's valuable experience that you are paying a lot for them to gain. Better: hold a postmortem, figure out what broke, and make the people who screwed it up originally fix it. Keep people who screw things up, as long as they also fix it.

      1 reply →

  • > so you have to do simple things to make screwing up hard

    No one goes out of their way to screw up; I'd recommend making it easier to recognize when you've made a mistake, and recover from it.

    Except for critical business stuff, that needs severe "you cannot fuck this up" safeguards.

Yeah, another case of "blame the person" instead of "blame the lack of systems". A while back, there was a thread here on how Amazon handled their S3 outage, caused by a devops typo. They didn't blame the devops guy, and instead beefed up their tooling.

I wonder whether that single difference, blame the person vs. fix the system/tools, predicts the failure or success of an enterprise.

  • I think it's a major predictor for how pleasant it is for anyone to work at the company, and thus a long term morale and hiring issue.

    This is the sort of situation that makes for a great conference talk on how companies react to disaster, and how the lessons learned can either set the company up for robust dependable systems or a series of cascading failures.

    Unfortunately, the original junior dev was living the failure case. Fortunately, he has learned early in his career that he doesn't want to work for a company that blames the messenger.

Assuming the details are correct, this should be considered a win by the junior dev. It only took a day to realize that this is a company he really, really doesn't want to try to learn his profession at.

  • He should get that laptop back to them IMMEDIATELY. These sound like exactly the sort of douches who would try to charge him with theft. (Edit: Why is it not surprising they don't have a protocol in place for dismissing staff and, like, getting their stuff back?)

    • Well, the customer database with important data just got nuked, so even if there is a protocol, the people who would normally do the steps have other things in mind. The laptop and such is the least of their concerns.

  • Nobody hires you if things are perfect. They hire you because there's a problem. It might be a startup or a company just getting started in tech. Either way, they are in their infancy.

    • This isn't imperfection. This is beyond incompetence into some sort of Dunning-Kruger zen state. The story describes failures so egregious that the principals have no business taking money from customers.

> The CTO told me to leave and never come back. He also informed me that apparently legal would need to get involved due to severity of the data loss.

I don't know if I should laugh or cry here.

Guaranteed the CTO is busily rewriting the developer guide and excising all production DB credentials from the docs so that he can pretend they were never there. While the new guy's mistake was unfortunate in a very small way, the errors made by the CTO and his team were unfortunate in a very big way. The vague threat of legal action is laughable, and the reaction of firing the junior dev who stumbled into their swamp of incompetence on his first day speaks volumes about the quality of the organization and the people who run it. My advice... learn something from the mistake, but otherwise walk away from that joint and never look back. It was a lucky thing that you found out what a mess they are on day 1.

Several years back I worked as a DBA at a managed database services company, and something very similar happened to one of our customers who ran a fairly successful startup. When we first onboarded them I strongly recommended that the first thing we do is get their DB backups happening on a fixed schedule, rather than an ad-hoc basis, as their last backup was several months old. The CEO shuts me down, and instead insists that we focus on finding a subtle bug (you can't nest transactions in MySQL) in one of their massive stored procedures.

It turns out their production and QA database instances shared the same credentials, and one day somebody pointed a script that initializes the QA instances (truncate all tables, insert some dummy data) at the production master. Those TRUNCATE TABLE statements replicated to all their DB replicas, and within a few minutes their entire production DB cluster was completely hosed.

Their data thankfully still existed inside the InnoDB files on disk, but all the organizational metadata was gone. I spent a week of 12-hour days working with folks from Percona to recover the data from the ibdata files. The old backup was of no use to us since it was several months old, but it was helpful in that it provided us a mapping of the old table names to their InnoDB tablespace ids, a mapping destroyed by the TRUNCATE TABLE statements.

No disrespect to the OP, but this sounds pretty fake. If the database in question was important enough to fire someone over immediately, then the creds wouldn't have been floating around in an onboarding PDF. And involving legal? Has anyone here heard of anything similar? I'm just 1 datapoint, but I know I haven't.

  • Yeah, I thought it sounded fake as well. I mean things like this happen, but something about the story just doesn't ring true to me.

  • Realised the user account is 3 weeks old, which is a red flag for me since it has no posts and the events allegedly happened Friday.

It's not the CTO's fault. It's the document's fault! We should never have documentation again, this is what it has done to us! We need to revert to tribal knowledge to protect ourselves. If we didn't document these values, people wouldn't be pasting them in places they shouldn't be!

/s

For some years now I've stopped bothering with database passwords. If technically required, I just make them the same as the username (or the database name, or all three the same if possible). Why? Because the security offered by such passwords is invariably a fiction in practice; I've never seen an org where they couldn't be dug out of docs or a wiki or test code. Instead, database access should be enforced by network architecture: the production database can only be accessed by the production applications, running in the production LAN/VPC. With this setup, no amount of accidental (or malicious) commands run by anyone from their local machine (or any other non-production environment) could possibly damage the production data.
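
On AWS that policy comes down to a single ingress rule; a boto3 sketch (group IDs and the port are placeholders):

    # Sketch: the prod DB security group only accepts traffic from the prod app security group.
    import boto3

    ec2 = boto3.client("ec2")
    ec2.authorize_security_group_ingress(
        GroupId="sg-0prodDBplaceholder",            # placeholder: the DB's security group
        IpPermissions=[{
            "IpProtocol": "tcp",
            "FromPort": 5432,                       # placeholder: PostgreSQL; 3306 for MySQL
            "ToPort": 5432,
            # reference the app tier's group rather than any CIDR, so laptops and VPNs simply can't connect
            "UserIdGroupPairs": [{"GroupId": "sg-0prodAPPplaceholder"}],
        }],
    )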

Side question: as a dev with zero previous ops experience, now the solo techie for a small company and learning ops on the fly, we're obviously in the situation where "all devs have direct, easy access to prod", since I'm the only dev. What steps should I take before bringing on a junior dev?

  • * Local env setup docs should have no production creds in them (EDIT: production creds should always be encrypted at rest)

    * new dev should only have full access to local and dev envs, no write access to prod

    * you're backing up all of your databases, right? Nightly is mandatory, hourly is better

    * if you don't have a DBA, use RDS (a quick check of its backup settings is sketched below)

    That'll prevent the majority of weapons-grade mistakes.

    Source: 15 years in ops
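
    As a quick sanity check on the RDS point, you can confirm automated backups are actually enabled (a boto3 sketch, assuming AWS credentials for the prod account):

        # Sketch: flag any RDS instance with automated backups turned off.
        import boto3

        rds = boto3.client("rds")
        for db in rds.describe_db_instances()["DBInstances"]:
            retention = db["BackupRetentionPeriod"]  # 0 means automated backups are disabled
            status = "OK" if retention > 0 else "NO AUTOMATED BACKUPS"
            print(f'{db["DBInstanceIdentifier"]}: retention={retention} days ({status})')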

  • Do the best you can to "find compute room" (laptop, desktop, spare servers on the rack that aren't being used, ... cloud), and make a Stage.

    Make changes to Stage after doing a "Change Management" process (effectively, document every change you plan to make, so that an average person typing it out would succeed). Test these changes. It's nicer if you have a test suite, but you won't at first.

    Once testing is done and considered good, make the changes on prod in accordance with the CM. Make sure everything has a back-out procedure, even if it is "drive to get backups, and restore". But most of these should be "copy the config to /root/configs/$service/$date, then proceed to edit the live config" (a tiny helper for that is sketched below). Backing out would entail restoring the backed-up config.

    ________________________

    Edit: As an addendum, many places this small usually have insufficient, non-existent, or Schrödinger backups. Having a healthy, living stage environment does 2 things:

    1. You can stage changes so you don't get caught with your pants down on prod, and

    2. It is a hot-swap for prod in the case Prod catches fire.

    In all likelihood, "all" of prod wouldn't DIAF, but maybe the machine that houses the DB has power issues with its PSUs and fries the motherboard. You at least have a hot machine, even if it holds stale data from yesterday's imported snapshot.
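
    For the copy-the-config-first step, even a tiny helper beats doing it by hand; a sketch following the /root/configs/$service/$date convention above:

        # Sketch: snapshot a config file into /root/configs/<service>/<date> before editing it.
        import shutil
        from datetime import date
        from pathlib import Path

        def backup_config(service: str, config_path: str) -> Path:
            dest_dir = Path("/root/configs") / service / date.today().isoformat()
            dest_dir.mkdir(parents=True, exist_ok=True)
            dest = dest_dir / Path(config_path).name
            shutil.copy2(config_path, dest)   # backing out = copying this file back over the live one
            return dest

        backup_config("nginx", "/etc/nginx/nginx.conf")  # placeholder example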

    • You missed one of the really nice points of having a stage there. You use it to test your backups by restoring from live every night/week. By doing that, you discourage developing on staging and you know for sure you have working backups!

      1 reply →

  • As I said in another post, the least you can do is modify your hosts file so you can't access the production database from your local computer. Then you have to log in to a remote computer to access production.

  • As advised elsewhere, before you have a DBA you should consider buying a hosted service like RDS, which would provide, at a minimum, backups and restore points. Even have separate dev and prod accounts on RDS.

    • before you have a DBA

      You never don't have a DBA. If you don't know who it is, it's you! But there will always, always be someone who is held responsible for the security, integrity, and availability of the company's assets.

  • The best rule of thumb whenever you're doing work as a solo dev/ops guy is to always think in terms of being two people: the normal you (with super user privs etc.) and the "junior dev/ops" you who just started his first day. Whatever you're working on needs to support both variants of you (with appropriate safeguards, checks and balances in place for junior you).

    E.g. when deciding how to back up your prod database, if you're thinking as both "personas" you'll come up with a strategy that safely backs up the database but also makes it easy for a non-privileged user to securely suck down an (optionally sanitised) version of the latest snapshot for seeding their dev environment [and then dog-food this by using the process to seed your own dev environment].

    Some other quick & easy things:

    - Design your terraform/ansible/whatever scripts such that anything touching sensitive parts needs out-of-band ssh keys or credentials. E.g. if you have a terraform script that brings up your prod environment on AWS, make sure that the AWS credentials file it needs isn't auto-provisioned alongside the script. Instead, write down on a wiki somewhere that the team member (at the moment, you) who has authority to run that terraform script needs to manually drop his AWS credentials file in directory /x/y/z before running the script (a sketch of this guard follows the list). The same goes for ansible: control and limit which ssh keys can log in to different machines (don't use a single "devops" key that everyone shares and imports into their keychains!). Think about what you'll need to do if a particular key gets compromised or a person leaves the team.

    - Make sure your backups are working, taken regularly, stored in multiple places and encrypted before they leave the box being backed up. Borgbackup and rsync.net are a nice, easy solution for this.

    - Make sure you test your backups!

    - Don't check passwords/credentials in to source code without first encrypting them.

    - Use sane, safe defaults in all scripts. Like another poster mentioned, don't do if env != "test"; do prod_stuff();

    - RTFM and spend the extra 20 minutes to set things up correctly and securely rather than walking away the second you've got something "working" (thinking 'I'll come back later to tidy this up' - you never will).

    - Follow the usual security guidelines: firewall machines (internally and externally), limit privileges, keep packages up to date, layer your defences, use a bastion machine for access to your hosted infrastructure

    - Get in the habit of documenting shit so you can quickly put together a straightforward onboarding/ops wiki in a few days if you suddenly do hire a junior dev (or just to help yourself when you're scratching your head 6 months later wondering why you did something a certain way)
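
    For the out-of-band credentials point above, the guard can be as dumb as a file check; a sketch (the /x/y/z path mirrors the wiki convention mentioned earlier and is otherwise hypothetical):

        # Sketch: refuse to run the prod provisioning script unless the operator has manually
        # dropped their own AWS credentials file in place first (path is hypothetical).
        import os
        import sys

        CREDS_PATH = "/x/y/z/aws_credentials"   # documented on the wiki, never auto-provisioned

        if not os.path.isfile(CREDS_PATH):
            sys.exit(f"No credentials at {CREDS_PATH}; drop your own file there before running this.")

        os.environ["AWS_SHARED_CREDENTIALS_FILE"] = CREDS_PATH  # honoured by the AWS SDKs and CLI
        # ...hand off to the actual provisioning run from here...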

The author should get their own legal in line - does the contract even allow termination on the spot? If not, the employer is just adding to their own pile of ridiculous mistakes.

  • Probably. At-will employment is pretty common in the US.

    • Even in Europe it's pretty lenient for the first period. Different countries obviously have different maximum probation periods, but day 1 would fall within it in most (all?) of them.

    • Are we sure it's the USA? Some of the language used in the poor guy's post on Reddit implies a non-native speaker. I am guessing India, which is known for treating employees "horrifically".

One of the questions I asked my manager during the interview process was how he felt about mistakes.

I knew I was being brought in to rearchitect the entire development process for an IT department and that I would make architectural mistakes no matter how careful I was and that I would probably make mistakes that would have to be explained to CxOs.

Whatever the answer he gave me, I remember being satisfied with it.

Reminds me of my first dev job, when I got a call during lunch:

"The server has been down all day, and you are the only one who hasn't noticed. What did you break?"

"Well, I saw that all the files were moved to `/var/www/`, and figured it was on purpose."

Suffice it to say, I got that business to go from Filezilla as root to BitBucket with git and some update scripts.

Something tells me their production password was nothing like a 20-char random string...

Am I the only one who is surprised that he could get the keys to the kingdom on day 1?

Day 1 is when you set up your desk and get your login. Then go back to HR to do the last of the hiring paperwork.

It should take a good week before a new employee is able to fuck up anything. Really.

  • How long do you need to adjust the height of your chair? Setting up the dev environment often takes ages. Why wouldn't it be the first thing to do? There will be enough progress bars while updating something like Visual Studio to find time to re-adjust the chair.

I did the same thing early on in my career. Shut down several major ski resorts in Sweden for an entire day during booking season by doing what we always did: running untested code in production to put out fires. Luckily, my company and our customers took that as a cue to tighten up the procedures instead of finding someone to blame. I hear this is how it works in aviation as well: no one gets blamed for mistakes, since blame only prevents them from being dealt with properly. Most of us are humans, and humans make mistakes. The goal is to minimize the risk of mistakes.

I stopped believing reddit posts a long time ago

  • Exactly, the post is very clichéd. I have about 75% belief that it's fictional. I guess it could be sort of entertaining to see how easy it is to get a few hundred software engineers on reddit and hacker news worked up into a sympathetic and self-righteous frenzy with a simple and entirely fictional paragraph posted for free from a throwaway account.

    • I am about 101% sure it's fake. "Unfortunately apparently those values were actually for the production database (why they are documented in the dev setup guide i have no idea)" - yeah, no. Had you told me you were able to screw the production db up because it had no su password set, you might have got me. But this is bullshit.

Technical infrastructure is often the ultimate in hostile work environments. Every edge is sharp, and great fire-breathing dragons hide in the most innocuous of places. If it's your shop, then you are going to have a basic understanding of the safety pitfalls, but you're going to have no clue as to the true severity of the situation.

If you introduce a junior dev into this environment, then it's he who is going to discover those pitfalls, in the most catastrophic ways possible. But even experienced developers can blunder into pitfalls. At least twice I've accidentally deployed to production, or otherwise run a powerful command intended for a development environment against production.

Each time, I carefully analyzed the steps that led up to running that command and implemented safety checks to keep it from happening again. I put all of my configuration into a single environment file so I can see at a glance the state of my shop. I make little tweaks to the project all the time to maintain this, which can be difficult because the three devs on the project work in slightly different ways and the codebase has to be able to accommodate all of us.

While this is all well and good, my project has a positively decadent level of funding. I can lavish all the time I want in making my shop nice and pretty and safe.

A growing business concern cannot afford to hire junior devs fresh out of code school / college. That's the real problem here. Not the CTO's incompetence; any new-ish CTO in a startup is going to be incompetent.

The startup simply hired too fast.

  • The same thing could happen to a senior, in particular a tired, overworked senior. It is more likely to happen to a junior, because a junior is likely to be overwhelmed. However, mistakes like this happen to people of all ages and experience levels.

    Seniority is what makes you not put the damn password into the setup document; that was the inexperienced-level mistake. Forgetting to replace it while you are setting up a day-one machine is a mistake that can happen to anyone.

    • True, but a senior engineer, even if he is never able to make architecture decisions, can still be held accountable for knowing better. That is precisely why they are paying him the big bucks.

      If a shop is being held together with duct tape and elbow grease, then you should have known that going in, and developed personal habits to avoid this sort of thing. Being overworked and tired isn't an excuse. Sure, the company and investors have to bear the real consequences, but you as an IC can't disclaim responsibility.

This company has a completely different problem: no separation of duties. Start with talking to the CTO how this could have happened in the first place, re-hire the junior dev.

After all, if the junior dev could do it, so can everybody else (and whoever manages to get their account credentials).

When it comes to backups, there are two types of people: those who do backups, and those who will.

This is purely the fault of the entire leadership stack.

From Sr dev/lead dev, dev manager, architect, ops stack, all the directors, A/S/VPs, and finally the CTO. You could even blame the CEO for not knowing how to manage or qualify a CTO. Even more embarrassing is if your company is a tech company.

I think proper due diligence would find the fault within the existing company.

It is not secure to give production access and passwords to a junior dev. And if you do, you put controls in place. I think that if there is insurance in place, some of its requirements would have to be reasonable access controls.

This company might find itself sued by customers for their prior and obviously premeditated negligence from lack of access controls (the doc, the fact they told you 'how' to handle the doc).

  • The Junior dev does bear a small amount of blame, if you really want to go the blameful route.

    But figuring out who to blame is toxic. You've got to go for a blameless culture and instead focus on post mortems and following new and better processes.

    Things can absolutely always go to shit no matter where you work or how stupidly they went to shit. What differentiates good companies from bad ones is whether they try to maximize the learning from the incident or not.

Ahhhhh haaaa yeah.....I've done that.

It was the second day, and I only wiped out a column from a table, but it was enough to bring business for several hundred people down for a few hours. It was embarrassing all round really. Live and learn though - at least I didn't get fired!

Obviously this is mostly the CTO's screw-up.

But the junior dev is not fully innocent either: he should have been careful about following instructions.

For extra points (to prove that he is a good developer) he should have caught the screw-up with the username/password in the instructions. Here's the approximate line of reasoning:

---

What is that username in the instructions responsible for? The production environment? Then what would happen if I actually ran my setup script against the production environment? The production database would be wiped? Shouldn't we update the setup instructions and some other practices to make sure that can't happen by accident?

---

But it is very unlikely that this junior dev would be held legally responsible for the screw-up.

I destroyed an accounting database at a company during a high school summer job.

A mentor was supervising me and continually told me to work slower, but I was doing great performing some minor maintenance on a Clipper application and didn't even need his "stupid" help ... until I typed 'del *.db' instead of 'del *.bak'. Oooops!

Luckily the woman whose computer we were working on clicked 'Backup my data' every single day before going home, bless her heart, and we could copy the database back from a backup folder. A 16 year old me was left utterly embarrassed and cured of flaunting his 1337 skillz.

Obviously not the new engineer's fault. Unfortunately, aspects of this are incredibly common. At three jobs I've had, I had full production access on day one. By that, I mean everyone had it...

After adding up the number of egregious errors made by the company, I'd almost be inclined to say the employee has grounds for wrongful termination, or at least fraudulent representation to recoup moving expenses.

He/she is better off not working at this place. So many things wrong. Not having a backup is the number 1 thing.

I could see having a backup that is hours old, and losing many hours of data, but not everything.

Even startups have contracts with their customers about protecting the customer's data. If it is consumer data, there are even stricter privacy laws. Leaving the production database password lying around in plain text is probably explicitly prohibited by the contracts, and certainly by privacy laws. The CTO should pay him for the rest of the year and give him a great reference for his next job, in return for him to never, ever, ever, tell anyone where he found the production password.

Here's why I think this is fake:

A company with 40 devs and over 100 employees that lost an entire production db would have surfaced here from the downtime. Other devs would corroborate the story.

  • I'm also skeptical, but this isn't necessarily true. There's plenty of software being written outside the HN bubble that's totally invisible to us. What if this was some shipping logistics company in Texas City? We'd never know about it; they wouldn't have a trendy dev blog on Medium.

I always wonder why IT companies don't test their backups. Even if it's the prod db, it should be tested on a regular basis. No blame to the dev.

We were paying for RDS right from when we were a two-man startup. There's zero reason not to have a DB service that is backed up frequently by a competent team.

He needs to return the laptop asap, like now. They are in full emotional mode and can overreact to what they might perceive as another bad act too.

I don't work in tech but I'm an avid HN reader.

I'm surprised a junior dev on his first day isn't buddied up with an existing team member.

In my line of work, an existing employee who transferred from another location would probably be thrown in at the deep end, but someone who is new would spend some time working alongside someone who is friendly and knowledgeable. This seems the decent thing to do as humans.

Yeah, this infra/config management sounds like land-mine / time-bomb incompetence territory. You were just the unlucky one to trigger it. Luckily this gives you an opportunity to work elsewhere and hopefully be in a better place to learn some good practices - which is really what you're after as a junior dev anyway.

Lucky junior dev! He has figured out a bad company to work for in his first work day. Good luck finding a new job!

  • Also, this is going to look great on their resumé, and be the perfect response to the "tell us a time when you made a mistake" interview question.

Everybody agrees that the instructions shouldn't have even had credentials for the production database, and the lion's share of the blame goes to whoever was responsible for that.

There is still a valuable lesson for the developer here though - double check everything, and don't screw up. Over the course of a programming career, there will be times when you're operating directly on a production environment, and one misstep can spell disaster - meaning you need to follow plans and instructions precisely.

Setting up your development environment on your first day shouldn't be one of those times, but those times do exist. Over the course of a job or career at a stable company, it's generally not the "rockstar" developers and risk-takers that get ahead, it's the slow and steady people who take the extra time and never mess up.

Although firing this guy seems really harsh, especially as he had just moved and everything, the thought process of the company was probably not so much that he messed up the database that day, but that they'd never be able to trust him with actual production work down the line.

  • No, sorry, and it's important to address this line of thinking because it goes strongly against what our top engineering cultures have learned about building robust systems.

    > Over the course of a programming career, there will be times when you're operating directly on a production environment, and one misstep can spell disaster

    These times should be extremely rare, and even in this case, they should've had backups that worked. The idea is to reduce the ability of anyone to destroy the system, not to "just be extra careful when doing something that could destroy the system."

    > Although firing this guy seems really harsh, especially as he had just moved and everything, the thought process of the company was probably not so much that he messed up the database that day, but that they'd never be able to trust him with actual production work down the line.

    Which tells me that this company will have issues again. Look at any high-functioning, high-risk environment and look at the way they handle accidents, especially in manufacturing. You need to look at the overarching system that enabled this behavior, not isolate it down to the single person who happened to be the guy who made the mistake today. If someone has a long track record of constantly fucking up, yeah, sure, maybe it's time for them to move on, but it's very easy to see how anyone could make this mistake, and so the solution needs to be to fix the system, not the individual.

    In fact, I'd even thank the individual in this case for pointing out a disastrous flaw in the processes today rather than tomorrow, when it would be one more day's worth of expense to fix.

    Take a look at this: https://codeascraft.com/2012/05/22/blameless-postmortems/

    • I violently agree with you.

      All I'm saying is that there are times when it is vital to get things right. Maybe it's only once every 5 or 10 years in a DR scenario, but those times do exist. Definitely this company is incompetent, deserves to go out of business, and the developer did himself a favor by not working there long-term, although the mechanism wasn't ideal.

      I'm just saying that the blame is about 99.9% with the company, and 0.1% for the developer - there is still a lesson here for the developer - i.e., take care when executing instructions, and don't rely on other people to have gotten everything right and to have made it impossible for you to mess up. I don't see it as 100% and 0%, and arguing that the developer is 0% responsible denies them a learning opportunity.

      2 replies →

  • While working on AWS, we had data corruption caused by a new feature launch. Deployments took ~6 weeks so the solution was to use GDB to flip a feature flag in memory for about 120k servers.

  • > There is still a valuable lesson for the developer here though - double check everything, and don't screw up.

    "Double check everything" is a good lesson, because we all can and should practice it.

    "Don't screw up" is not useful advice because it's impossible. There's a reason we don't work like that... Who needs backups? Just don't screw anything up! Staging environment? Bah, just don't screw up deployments! Restricted root access? Nah, just never type the wrong command. No, we need systems that mitigate humans screwing up, because it will happen.

  • > the thought process of the company was probably not so much that he messed up the database that day, but that they'd never be able to trust him with actual production work down the line.

    I think that they simply acted emotionally and out of fear, anger, and stress. The vague legal threat and otherwise ignoring this dude both suggest it. The way events unfolded, it does not sound like much rational thinking was involved.

Cool story, but I think this is fake. Since there are 40 people in the company, it seems like at least a few people before him followed the onboarding instructions. I just don't believe there would be that many people who a) didn't do the same thing he did and b) didn't change the document.

Repeat after me, while clicking your heels together three times: "It is not my fault. It is not my fault. It is not my fault." It was obvious as I read your account that you would be fired. A company that allowed this scenario to unfold would not understand that it was their fault.

I was only granted read-only access to the Prod DB last week, after achieving 6 months of seniority.

I would assume this was mocked up to test whether the intern could follow simple instructions, to provide a lecture on the huge consequences of small mistakes, and to have a viable reason to fire him afterwards; but I'm wearing my tinfoil hat right now, too.

It is really unfair to have fired him. The OP is not the one who should have been fired. The guy in charge of the db should be fired, and the manager who fired the OP should be fired too. And, by the way, the guy in charge of the backups too.

I would suggest that, once this is sorted out, you publicly mention the company name so no other engineer will fall into this trap. It would be a lesson for them to properly follow basic practices for data storage.

Unfortunately, software companies like that are everywhere. The guy is learning and screwed up a terribly designed system; the blame is on the "senior" engineers who set up that environment.

My question is, why in the world did they publish someone's production credentials in an onboarding document? That has to be a SOX compliance violation at the very least.

The CTO should be fired immediately!

If I didn't read it wrong, they put production db credentials in the first-day local dev env instructions! WTF.

This CTO sounds to me even worse than this junior developer.

So a script practically set up the machine with the nuclear football by default, and then you were expected to defuse it before using it. That is not your fault.

I have a feeling the CTO was actually one of those "I just graduated bootcamp and started a website, so I can inflate my title 10x" types.

Should have job title changed to Junior Penetration Tester and be rewarded for exposing an outfit of highly questionable competence.

Firing the guy seems drastic but understandable. Implying that they are going to take legal action against him is ridiculous!

So the company's fault. Embarrassing they tried to blame the new guy. So many things wrong with this.

Wow. What a train wreck. This is why the documentation I write contains database URIs like:

USER@HOST:PORT/SCHEMA

It was their fault, plain and simple.

  • How is it the FNG's fault that they have no backups, no DR plan, and production DB details freely available in the setup guide? The company is entirely at fault.

  • Did you read the details? I disagree. The dev is probably better off not working in such a volatile environment. They'll be better off working somewhere they can learn some best practices, possibly somewhere that doesn't have the possibility of wiping a production DB because you ran some tests from a developer's machine. That's inexcusable.

  • > It was their fault, plain and simple.

    <off-topic>A wonderful example of the shortcoming of the singular "they"... :P

I really suggest that the OP send this thread to HR and others. And this isn't sarcasm.

Either the CTO and his dev team are ridiculously stupid, or this was on purpose.

Lots of people in the thread are commenting how surprised they are that a junior dev has access to production db. Both jobs I've had since graduating gave me more or less complete access to production systems from day one. I think in startup land - where devops takes a back seat to product - it's probably very common.

  • I work for a large bank as an ops engineer. The idea that I could even read a production database without password approval from someone else is too crazy to consider. Updating or deleting takes a small committee and a sizeable "paper trail" to approve.

    Sometimes when I read stories like these, I think it's no wonder a company like WhatsApp can have a billion customers with less than 100 employees. And then I make some backups to get that cozy safe feeling again.

  • Which doesn't absolve the CTO of the responsibility to have practices in place that could have prevented this. While I'm not going to hold my breath over the threatened legal claims, being fired for something that any person in the building could have done doesn't sound like a conducive environment to work in.

  • > I think in startup land - where devops takes a back seat to product - it's probably very common.

    It is, but it need not be. It's pretty easy to set up at least some backup system in such a way that whoever can wipe the production systems can't also wipe the backup.

  • It's a pretty gross error to me to have direct db access. Obviously in any stack you could push code that affects the db catastrophically anyway, but in dev mode you should never connect directly to the production database, not only because of this kind of error but for general data integrity.

    The CTO needs to put on some theatre to not get fired, because he is ultimately responsible.

  • As a junior noc analyst at a Fortune 500 company, I had root access to almost the entire corporate infrastructure from day one. Databases, front ends, provisioning tools, everything.

    • It's not about having access to the production database. It's about having an example script that can do catastrophic things and having the production username and password in the example docs.

      I also had production access to everything from day one. The first thing I did was set up the hosts files on the various dev servers, including my own computer, so I couldn't access the databases from them.

      I have to remote into another computer to access them.

  • In that case, don't you think they should have informed him that this was a production environment, so don't fuck it up? Giving your junior dev a front seat on your production systems without proper communication is a disaster waiting to happen.

  • I think in startup land - where devops takes a back seat to product - it's probably very common

    Perhaps with hipster databases like MongoDB that are insecure out of the box, but most grown-up, sensible DBs have the concept of read-only users, and also it is trivial to set up such that you can e.g. DELETE data in a table without being able to DROP that table.

    I'll wager any startup that does what you describe has devs who do SELECT * FROM TABLE; on a 20-col, million+ row table when they actually want one value from one row, and then filter in their own code... Yes, I have seen this more times than I care to count.
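
    For reference, that privilege split is only a couple of statements; a MySQL sketch run through pymysql (user names, passwords, and the schema name are placeholders; note that TRUNCATE requires the DROP privilege, so withholding DROP covers both):

        # Sketch: separate read-only and app users on MySQL (placeholders throughout).
        import pymysql

        admin = pymysql.connect(host="db.prod.private", user="admin", password="***", autocommit=True)
        with admin.cursor() as cur:
            # Reporting/dev user: can read, can't change anything.
            cur.execute("CREATE USER 'readonly'@'%' IDENTIFIED BY 'long-random-password-1'")
            cur.execute("GRANT SELECT ON app_db.* TO 'readonly'@'%'")
            # Application user: can modify rows, but without DROP it can't DROP or TRUNCATE tables.
            cur.execute("CREATE USER 'app'@'%' IDENTIFIED BY 'long-random-password-2'")
            cur.execute("GRANT SELECT, INSERT, UPDATE, DELETE ON app_db.* TO 'app'@'%'")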

  • I agree.

    But not in the commands to run in a local-dev-setup-guide that purges the db it points to.

    If anyone should be fired for that, it's the CTO. He must suck at his job at the very least, and the junior dev should get an apology.

  • > I think in startup land - where devops takes a back seat to product - it's probably very common.

    Not focusing on devops and putting your prod db credentials in plain sight are VERY different things. It's really, really easy to get right, especially if you are using Heroku or something like that. The same goes for database backups. I worked at multiple startups (YC and non), and they all had the basics nailed down, even when they were just 3-5 employees.

  • The problem isn't so much that a junior had access to the production Db. The problem is that the junior's dev setup had access to the production Db and could nuke the whole thing with a few misplaced keystrokes. I'm working on a product currently where I am the only dev. I have a pretty large production Db. I also have a smaller clone of that same Db on my local machine for development purposes. I can only access the production Db by directly shelling into the machine it's running on or by performing management commands on one of the production worker machines (which I also need to shell into). This was not very difficult to set up and ensures that my development environment cannot in any way affect the production environment.

    Also, why even distribute the production credentials at all? Only the most senior DBAs or devs should have access to production credentials.

  • I've done about ~40 or so technology due diligence projects for investors of tech companies. You'd be amazed how many security flaws there are out there. One of the most simple ones - storing production credentials in the git repo.

  • Sure, there exist reasons to do that. It's still a bad idea, but, ok.

    But there is no reason to write the production DB credentials in a document, especially as an "example". That is monumentally dumb. It amounts to asking for this to happen.

  • We give everyone access to production systems, but even if someone deleted everything from production, we can restore everything in ~20 minutes (this has happened), and if that process fails, we have backups on s3 that can be restored in a couple of hours (and this is tested regularly, but thankfully hasn't happened yet), and even if that fails...

    There's a reason why it's called disaster recovery and prevention.

    • Why try to justify stupid behavior and absent security controls with the idea that your downtime is "only" 20 minutes? How silly.

  • I've only had one job after college, and I am still there almost a year after being hired. For the first few months I only had access to my own local copy of the production DB. There are a reason or two why I wouldn't be outright trusted, one of them stemming from me being a junior.

  • Wow, really?!? I've never granted access to prod databases on day one, junior or not.

    I thought that was just SOP.

    • Can you send me your password again, I forgot it. Also, please reply to those emails from emily - or just delete them... they are cluttering up your inbox, and I am tired of having to sort through your guys' personal crap.

      Thanks

  • Same. I had instructions from a competent developer however. I would still blame whoever allowed production access as part of application setup, as well as the fact there isn't a process to back up this production data.

  • It's not that this shouldn't happen, but that it does happen and has to be dealt with as the potential impact scales up. Having production creds on day 1 isn't the same as day 500.

  • Perhaps, but in 2017 it is gross fiduciary malpractice not to have backup systems in place for production data and code. It would be grounds for a shareholder suit against the principals.

  • Common does not mean it's correct, or that it has to be that way. Trump thinks climate change is a hoax; that does not make it convincing...

Name and shame. The CTO stinks of incompetence, and I'm surprised he/she has managed to retain any competent staff (perhaps he/she actually hasn't). What a douchebag. You are not to blame.

I worked with someone who did this, early in my career. His bacon was saved by the fact that a backup had happened very soon before his mistake.

His was worse though, because he had specifically written a script to nuke all the data in the DB, intending it for test DBs of course. But after all that work, he was careless and ran it against the live DB.

It was actually kind of enlightening to watch, because he was considered the "genius" or whatever of my cohort. To wit, there are different kinds of intelligence.

I fucked up a table once by setting a column of every record to true, but I had asked about changing the code to require a manual SQL query a few weeks prior, so it could have been prevented.

This story can't be true. If it is, obviously the junior dev is much better off working elsewhere. Why is this company in business and hiring people?

People are designed to make mistakes. We should learn from them and try to be more understanding.

Small shops aren't always perfect. The engineering team should not allow junior devs to hit the DB directly. If you are that vulnerable to coding mistakes, you shouldn't be hiring junior devs.

Understandable that he got fired. I imagine there was quite an emotional response from the business when this happened, but that doesn't mean it was necessarily the most appropriate response.

--Unfortunately apparently those values were actually for the production database (why they are documented in the dev setup guide i have no idea).--

Someone else should have been fired if this is true.

  • No, it's inappropriate. When the process fails you don't fire the junior employee that showed you just how incompetent the organization is. You fix the problem.

    Firing this guy does nothing, fixing the problem does, but requires those higher up to admit the mistake was theirs to begin with.

    • I should clarify: I don't think it's appropriate, but I see why it happened. The business was panicking, and his firing was kind of on the cards. Probably a blessing in disguise for this guy.