Comment by dwaite
2 years ago
First step would be to have them acknowledge that a documented behavior, which was part of their original design 16 years ago, is something that needs to be fixed.
As someone who has used git and GitHub extensively over that time, none of what the author documented was a surprise to me.
However, I also remember when people were trained to do a "Save As" when preparing a final Word document or PowerPoint for sharing with a third party. Skipping that step certainly bit enough business users that Microsoft eventually changed the default behavior.
What about Save As bit people?
It's failing to do Save As that bit people. Think of a .doc file as a bad database format. It gets lots of in-place overwrites, and fragments of old versions stick around.
I can't find a lot that discusses it, but here's one mention: https://news.ycombinator.com/item?id=35252331
Right, OLE documents are comparable to read/write filesystems inside a file. The only mechanism given to make sure it is sparse was to create a new filesystem by having the application walk the existing one, basically a copy-based garbage collection.
PowerPoint files can be megabytes larger due to unused graphic artifacts; Word documents may contain older revisions of the text, including sections that were since deleted. Other things like the MSI installer file format are also OLE documents.
Microsoft eventually made Save in Office apps always create a new sparse filesystem to prevent these problems.
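For anyone who wants to see the failure mode concretely, here is a rough Python sketch of the idea. It is not the real OLE layout: `ToyContainer`, its append-only buffer, and the `compacted` method are all made up for illustration. It only shows why a plain save can leave old bytes in the file and why copying the live streams into a fresh container drops them.

```python
class ToyContainer:
    """Hypothetical stand-in for a filesystem-in-a-file (not the OLE format)."""

    def __init__(self):
        self.data = bytearray()   # raw bytes that would end up on disk
        self.directory = {}       # stream name -> (offset, length) of the live copy

    def write(self, name, payload):
        # Plain save: append the new version and repoint the directory.
        # Bytes of any earlier version stay in the buffer, unreferenced.
        offset = len(self.data)
        self.data += payload
        self.directory[name] = (offset, len(payload))

    def delete(self, name):
        # Only the directory entry goes away; the bytes remain in the buffer.
        del self.directory[name]

    def read(self, name):
        offset, length = self.directory[name]
        return bytes(self.data[offset:offset + length])

    def compacted(self):
        # "Save As": walk the live directory and copy each stream into a
        # brand-new container, i.e. a copy-based garbage collection.
        fresh = ToyContainer()
        for name in self.directory:
            fresh.write(name, self.read(name))
        return fresh


doc = ToyContainer()
doc.write("body", b"Offer: $1M. Internal note: we would accept $500k.")
doc.write("body", b"Offer: $1M.")          # the edit removes the note
doc.write("logo", b"\x89PNG-old-artwork")
doc.delete("logo")                          # the image is "deleted"

clean = doc.compacted()
```

In this sketch `doc.data` still contains the internal note and the old artwork even though neither is reachable through the directory, while `clean.data` holds only the current body text.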