Comment by koolba

2 months ago

A well laid out history of logical changes makes reviewing complicated change sets easier. Rather than one giant wall of changes, you see a series of independent, self contained, changes that can be reviewed on their own.

Having 25 meaningless “wip” commits does not help with that. It’s fine when something is indeed a work in progress. But once it’s ready for review it should be presented as a series of cleaned up changes.

If it is indeed one giant ball of mud, then it should be presented as such. But more often than not, that just shows a lack of discipline on the part of the creator. Variable renames, whitespace changes, and other cosmetic things can be skipped over to focus on the meat of the PR.

From my own experience, people who work in open source and have been on the review side of large PRs understand this the best.

Really the goal is to make things as easy as possible for the reviewer. The simpler the reviews process, the less reviewer time you’re wasting.

> A well laid out history of logical changes makes reviewing complicated change sets easier.

I've been on a maintenance team for years and it's also been a massive help here, in our svn repos where squashing isn't possible. Those intermediate commits with good messages are the only context you get years down the line when the original developers are gone or don't remember reasons for something, and have been a massive help so many times.

I'm fine with manual squashing to clean up those WIP commits, but a blind squash-merge should never be done. It throws away too much for no good reason.

For one quick example, code linting/formatting should always be a separate commit. A couple times I've seen those introduce bugs, and since it wasn't squashed it was trivial to see what should have happened.

  • I agree, in a job where you have no documentation and no CI, and are working on something almost as old or older than you with ancient abandoned tools like svn that stopped being relevant 20 years ago, and in a fundamentally dysfunctional company/organization that hasn't bothered to move off of dead/dying tools in the last 20 years, then you just desperately grab at anything you can possibly find to try to avoid breaking things. But there are far better solutions to all of the problems you are mentioning than trying to make people create little mini feature commits on their way to a feature.

    • It is not possible to manually document everything down to individual lines of code. You'll drive yourself crazy trying to do so (and good luck getting anyone to look at that massive mess), and that's not even counting how documentation easily falls out of date. Meanwhile, we have "git blame" designed to do exactly that with almost no effort - just make good commit messages while the context is in your head.

      CI also doesn't necessarily help here - you have to have tests for all possible edge cases committed from day one for it to prevent these situations. It may be a month or a year or several years later before you hit one of the weird cases no one thought about.

      Calling svn part of the problem is also kind of backwards - it has no bearing on the code quality itself, but I brought it up because it was otherwise forcing good practice because it doesn't allow you to erase context that may be useful later.

      Over the time I've been here we've migrated from Bugzilla to Fogbugz to Jira, from an internal wiki to ReadTheDocs to Confluence, and some of these hundreds of repos we manage started in cvs, not svn, and are now slowly being migrated to git. Guess what? The cvs->svn->git migrations are the only ones that didn't lose any data. None of the Bugzilla cases still exist and only a very small number were migrated from FogBugz to Jira. Some of the internal wiki was migrated directly to Confluence (and lost all formatting and internal links in the process), but ReadTheDocs are all gone. Commit messages are really the only thing you can actually rely on.

      1 reply →

> A well laid out history of logical changes makes reviewing complicated change sets easier. Rather than one giant wall of changes, you see a series of independent, self contained, changes that can be reviewed on their own.

But this would require hand curation? No development proceeds that way, or if it does then I would question whether the person is spending 80% of their day curating PRs unnecessarily.

I think you must be kind of senior and you can get away with just insisting that other people be less efficient and work in a weird way so you can feel more comfortable?

  • > But this would require hand curation? No development proceeds that way, or if it does then I would question whether the person is spending 80% of their day curating PRs unnecessarily.

    If you’re working on something and a piece of it is clearly self contained, you commit it and move on.

    > I think you must be kind of senior and you can get away with just insisting that other people be less efficient and work in a weird way so you can feel more comfortable?

    You can work however you like. But when it’s time to ask someone else to review your work, the onus is on you to clean it up to simplify review. Otherwise you’re saying your time is more valuable than the reviewer’s.

  • > But this would require hand curation? No development proceeds that way, or if it does then I would question whether the person is spending 80% of their day curating PRs unnecessarily.

    It's not really hand curation if you're deliberate about it from the get-go. It's certainly not eating up 80% of anyone's time.

    Structuring code and writing useful commits a skill to develop, just like writing meaningful tests. As a first step, use `git add -p` instead of `git add .` or `git commit -a`. As an analog, many junior devs will just test everything, even stuff that doesn't make a lot of sense, and then jumble them all together. It takes practice to learn how to better structure that stuff and it isn't done by writing a ton of tests and then curating them after the fact.

    > I think you must be kind of senior and you can get away with just insisting that other people be less efficient and work in a weird way so you can feel more comfortable?

    Your personal productivity should only be one consideration. The long-term health of the project (i.e., maintenance) and the impact on other people's efficiency also must be considered. And efficiency isn't limited to how quickly features ship. Someone who ships fast but makes it much harder to debug issues isn't a top performer. At least, in my experience. I'd imagine it's team, company, and segment-dependent. For OSS projects with many part-time contributors, that history becomes really important because you may not have the future ability to ask someone why they did something a particular way.

    • Aha, I see the issue here. What you seem to organize into cute little self contained 'commit's I would put on individual 'branches'.

      It is too hard for you to get someone to look at a PR, so you are packing multiple 'related' but not interdependent changes into one PR as individual commits so you can minimize the number of times you have to get someone to hit "approve", which is the limiting resource.

      In your situation then I believe your way of working is a rational adaptation, but only so far as you lack the influence to address the underlying organizational/behavioral dysfunction. We agree on the underlying need to make good messages, but where I merge 4-5 small branches per day, each squashed to one commit, you are saving them all up to get them (unnecessarily) put into a single merge commit.

      Just as "Structuring code" is a skill to develop, so is building healthy organizations.

      2 replies →

  • > No development proceeds that way,

    I do this. Also I do not spend 80% of my time doing it. It's not hard, nor is it time consuming.