← Back to context

Comment by steveklabnik

21 hours ago

It's totally possible that you aren't the target audience for this sort of feature. It tends to be more useful in very large team and/or monorepo contexts.

This isn't about siloing discussions: it's about focus. You can always see the full stack if you want to go look at the other parts, the key is that you don't have to.

The goal is to get thoroughly reviewed changes. It's much easier to review five 100 line changes than one 500 line one, and it's easier to review five 500 line changes than it is a 2500 line one. Keeping commits small and tightly reviewed leads to better outcomes in the end. Massive PRs lead to rubber stamps of +1.

I agree that that scenario sounds like a nightmare. But I don't think that a PR is the right place to solve that problem: it sounds like something that should have been sorted before any of the code was written in the first place.

> It's much easier to review five 100 line changes than one 500 line one, and it's easier to review five 500 line changes than it is a 2500 line one.

This is true if the changes are orthogonal and are truly independent. One should always favor small independent changes if one can.

But when changes are all actually part of the same unit, and aren’t separable (apart from maybe the first of N of them which may be mergeable independently), proponents always seem to advocate that stacked diffs can somehow change this fact. “Oh if only we had stacked diffs we could break this into smaller changes”, ignoring the fact that no, they’d still be ordered and dependent on one another.

Stacked diffs seem like a UI convenience for reviewers… that’s fine I guess. GitHub is basically what you get when you ask the question “how can we make code review as tedious and unhelpful as possible”, and literally anything would be better than what we have (seriously I could fill a book with how bad GitHub is. I don’t think I could design a worse experience if I tried.) So, maybe I should just be happy they’re trying anything.

  • In stacked diffs systems, the idea is that the base of the stack (once reviewed) can always be merged independently, so you're totally right that like, if you just purely think you can split things up when they shouldn't be split up, that would be bad.

    This is the model that the kernel uses, as well as tons of other projects (any Gerrit user, for example), and so it has gotten real-world use and at scale. That said, everyone is also entitled to their preferences :)

    • > This is the model that the kernel uses

      Nah.

      The kernel uses a mailing list, and a “review” means a mailing list thread. With some nice CLI tools to integrate with git when you want to actually apply the patch (or start a review thread.)

      In that world, “[PATCH 2/5]” (or whatever) in the subject title, and a different CC list for each patch, is a nice way to be able to ensure different subsets of the patch series have different discussions. That’s great.

      But if you’re going to compare this to a GitHub UI, you have to choose the basis for comparison, because the two are so utterly different. Choosing one aspect (can we make sure discussions are kept separate), and saying “therefore the kernel uses stacked diffs” is a huge misrepresentation of how different GitHub’s approach is.

      Because the kernel approach is the platonic ideal of a code review: it’s a simple threaded discussion between stakeholders, centered around a topic (the patch, which is inlined right in the email.) I would wager close zero kernel maintainer actually look at the diffs exclusively via their email client. They probably just check out the changes locally and look at them, and the purpose of the mailing list is to facilitate focused discussion on parts of the change (which is all we really want, in the end.)

      GitHub has so thoroughly shit the bed on actually developing a good model of “threaded discussion about a change”, that you have to change the way you think about git’s model to fix how awful GitHub is at allowing review discussion to stay focused. You shouldn’t need to think about stacked diffs and multiple PR’s. You should use git branches as intended, multiple commits representing changes, and a merge meaning “this branch makes it or not.” That GitHub’s UI for discussing subsets of a change is so abysmal, does not mean the model is wrong. It means their discussion system is so abysmal that a mailing list TUI can run circles around it. Fixing this is GitHub’s problem, and doesn’t require any changes to how PR’s should be split up.

      If you have a 2500-line PR with 5 500-line commits, GitHub should not require you to split things up further in any way, just to unfuck their discussion system.

      Random idea I spent 10 seconds thinking about: let me start a “here’s a thread discussing the UI changes” and add folks to it, and “here’s a thread discussing the backend changes”, and add folks to that. I can then say “let’s not merge this until both threads are green”. You still see the whole change in the UI. (You can click directories to drill into the changes, that solves the “but the diff is too big” issue.) Discussion on a chunk of the diff is scoped to a discussion thread, which you select when sending the message. Thus, all discussion on any part of the diff is still scoped to a “discussion thread” of arbitrary subsets of stakeholders.

      None of this needs me to change how I split up my git branches, an entire logical change is still either “merged” or “not-merged” (seriously who cares about the Pyrrhic victory of merging only change 1/N), and if we want to limit scopes of discussion to subsets of a change, we can just… do that.

      4 replies →