
Comment by mrheosuper

3 days ago

[flagged]

> Microsoft had to collaborate with GitHub to invent the Virtual File System for Git (VFS for Git) just to make this migration possible. Without VFS, a fresh clone of the Office repository (a shallow git clone would take 200 GB of disk space) would take days and consume hundreds of gigabytes.

  • It takes less than an hour on my third world apartment wifi to download the Call of Duty Modern Warfare remake, which is over 200 gigabytes. Since we're not talking about remote work here, I think Microsoft offices and servers (probably on a local network) might have managed similar bandwidth back then.

  • Having had yesterday the dubious pleasure of using MS Word for the first time in a decade, I can safely affirm that they could have just piped the whole Office repo to the Windows equivalent of /dev/null and nothing of value would have been lost.

    • The worst part about Word is that it has been feature complete since Office 97, except they've made the UI worse each and every version since then. I wish I could get excited about a new version of Office or WordPerfect, but neither Microsoft nor Corel has figured out how to innovate in the past three decades. And no, slapping """AI""" in there isn't the solution. There are so many possibilities, but they just sort of do nothing with them now that they make a few billion a month on Microsoft 365 subscriptions.

Don’t do this on a repository with 35+ years of history! That’s all valuable information you want to keep.

  • Anything before Office 2003 you can delete. Anything after Office 2003 you can also delete. There, saved you a few terabytes.

If it were that simple, would 100s of engineers spend so much time and effort? They did what they had to and spent the time and energy to maintain some semblance of commit and change history.

  • GP has a valid point. We had a Git repo managed in Bitbucket that was gigantic because it contained binary files; the team didn’t know about LFS or about storing them in an external tool like Artifactory. Checkouts took forever, even with shallow clones. With a CI/CD system running constantly, tests needing constant full coverage, and hundreds of developers, that eats into developers’ time. And we couldn’t just prune all the branches, because of compliance rules.

    So we ended up removing all the binary artifacts before cloning into a new repo, then making the old repo read-only.

    Microsoft seemed to want to mirror everything rather than keep Source Depot alive.

    We had another case where a Subversion system went out of security compliance; we simply ported it to our Git systems and abandoned the old one.

    So my guess is they wanted everything to look the same rather than just importing the code.
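
    For anyone facing the same binary-bloat cleanup, here is a rough sketch of how one might find the largest blobs in a repo's history before deciding what to strip. It assumes `git` is on the PATH; the helper names (`parse_batch_check`, `largest_blobs`, `list_objects`) are made up for illustration and are not what our team actually ran.

```python
import subprocess
from typing import Iterable, List, Tuple


def parse_batch_check(lines: Iterable[str]) -> List[Tuple[int, str, str]]:
    """Parse '<type> <sha> <size> <path>' lines and keep only blobs."""
    blobs = []
    for line in lines:
        # Split at most 3 times so paths containing spaces stay intact.
        parts = line.rstrip("\n").split(" ", 3)
        if len(parts) < 4 or parts[0] != "blob":
            continue
        _type, sha, size, path = parts
        blobs.append((int(size), sha, path))
    return blobs


def largest_blobs(lines: Iterable[str], top: int = 10) -> List[Tuple[int, str, str]]:
    """Return the `top` largest blobs as (size_bytes, sha, path), biggest first."""
    return sorted(parse_batch_check(lines), reverse=True)[:top]


def list_objects(repo: str = ".") -> List[str]:
    """Hypothetical helper: list every object reachable from any ref, with sizes."""
    rev_list = subprocess.run(
        ["git", "-C", repo, "rev-list", "--objects", "--all"],
        capture_output=True, text=True, check=True,
    )
    batch = subprocess.run(
        ["git", "-C", repo, "cat-file",
         "--batch-check=%(objecttype) %(objectname) %(objectsize) %(rest)"],
        input=rev_list.stdout, capture_output=True, text=True, check=True,
    )
    return batch.stdout.splitlines()


if __name__ == "__main__":
    # Print the ten largest blobs in the current repository's history.
    for size, sha, path in largest_blobs(list_objects(".")):
        print(f"{size:>12}  {sha[:10]}  {path}")
```

    This only reports candidates. Actually removing them means rewriting history with a tool such as git filter-repo, which is why we kept the old repo around as read-only.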

  • > If it were that simple, would 100s of engineers spend so much time and effort?

    Taking into account that they rounded corners in Office, I would say, yes.