Comment by anaphor

6 years ago

As was already mentioned, replication is not a backup. A great example of this is when the KDE project almost lost all of their Git repos because they were mirroring a corrupted copy of the data. https://www.phoronix.com/scan.php?page=news_item&px=MTMzNTc

Fortunately, git is a DVCS, so anyone who checks out a repo has a complete copy of it.

Now, granted, it'd be a huge pain to track down everyone who had copies of the 1,500 different repos and find the most up-to-date version of each, but I doubt they came anywhere close to losing all their source code.

Incidentally this shows why it's a good idea to sync your repo to GitHub, even if the canonical repo is elsewhere: in addition to the usual reasons of incentivizing some contributors by giving them "GitHub credit", and increasing visibility of your project's code, GitHub can serve as a backup!

Also, on a side-note, 1,500 separate repositories?! That sounds way overkill. I wonder if they'd benefit from having a monorepo.

  • > 1,500 separate repositories?! That sounds way overkill. I wonder if they'd benefit from having a monorepo.

    No, it doesn't. GitHub has at least 20 million public repositories. Would they benefit from being combined into a monorepo?

    • GP is talking about the KDE project, not the entirety of GitHub.

      And yes, a monorepo is usually the best approach for a project or even an entire company.

A backup is a replica of the live dataset, although it is usually kept out of sync so that it is still useful when the main dataset goes bad.

  • You might want to read the Wikipedia definition, because you're technically mistaken.

    https://en.m.wikipedia.org/wiki/Backup

    • That's a long article; please quote the part you're referring to so we're all looking at the same text.

      > a backup, or data backup is a copy of computer data taken and stored elsewhere so that it may be used to restore the original after a data loss event

      Since a "replica" is a copy, that seems technically correct.

  • The out-of-sync part is rather important when something accidentally gets deleted from the live dataset.
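
To make the out-of-sync point concrete, here is a toy sketch in Python. Everything in it is made up for illustration: the `Mirror` and `SnapshotBackup` classes and the `repo-a`/`repo-b` names are hypothetical and do not model how KDE's mirrors or any real backup tool work. It only shows that a replica kept in sync inherits an accidental deletion, while a point-in-time copy does not.

```python
import copy


class Mirror:
    """Hypothetical replica that always matches the live dataset."""

    def __init__(self):
        self.data = {}

    def sync(self, live):
        # A mirror copies whatever the live dataset currently holds,
        # including deletions and corruption.
        self.data = copy.deepcopy(live)


class SnapshotBackup:
    """Hypothetical backup that keeps point-in-time copies."""

    def __init__(self):
        self.snapshots = []

    def take(self, live):
        # Each snapshot is frozen at the moment it was taken.
        self.snapshots.append(copy.deepcopy(live))

    def restore(self, index=-1):
        # Return a copy of the chosen snapshot (latest by default).
        return copy.deepcopy(self.snapshots[index])


live = {"repo-a": "commit 1", "repo-b": "commit 1"}
mirror = Mirror()
backup = SnapshotBackup()

mirror.sync(live)
backup.take(live)

# Something is accidentally deleted from the live dataset...
del live["repo-b"]
# ...and the mirror replicates the deletion on its next sync.
mirror.sync(live)

mirror_has_repo_b = "repo-b" in mirror.data
restored = backup.restore()
backup_has_repo_b = "repo-b" in restored
```

After the deletion, `mirror.data` no longer contains `repo-b`, but the snapshot taken beforehand still does. The flip side is that a snapshot only protects whatever existed when it was taken, so anything written after the last snapshot is not recoverable from it.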