Comment by nickjj

6 days ago

For #4 (divide and conquer), I've found `git bisect` helps a lot. If you have a known good commit and one of dozens or hundreds of commits after that is bad, this can help you identify the bad commit / code in a few steps.

Here's a walkthrough on using it: https://nickjanetakis.com/blog/using-git-bisect-to-help-find...
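
For reference, a minimal session looks roughly like this (the tag name and the build/test steps are placeholders for whatever your project uses):

    git bisect start
    git bisect bad                 # the commit you're on is broken
    git bisect good v1.2.0         # a commit or tag known to work
    # git checks out a commit roughly halfway through the range;
    # build and test it, then report the result:
    git bisect good                # or: git bisect bad
    # repeat until git prints the first bad commit, then clean up:
    git bisect reset

If the check can be scripted, `git bisect run ./check.sh` automates the whole loop.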

I jumped into a pretty big unknown code base in a live consulting call and we found the problem pretty quickly using this method. Without that, the scope of where things could be broken was too big given the context (unfamiliar code base, multiple people working on it, only able to chat with 1 developer on the project, etc.).

"git bisect" is why I maintain the discipline that all commits to the "real" branch, however you define that term, should all individually build and pass all (known-at-the-time) tests and generally be deployable in the sense that they would "work" to the best of your knowledge, even if you do not actually want to deploy that literal release. I use this as my #1 principle, above "I should be able to see every keystroke ever written" or "I want every last 'Fixes.' commit" that is sometimes advocated for here, because those principles make bisect useless.

The thing is, I don't even bisect that often; the discipline necessary to maintain that in your source code heavily overlaps with the discipline needed to prevent regressions and bugs in the first place. But when I do finally use it, it can pay for itself in literally one shot once a year, because we get bisect out for the biggest, most mysterious bugs, the ones that I know from experience can have devs staring at code for weeks. And while I've yet to have a bisect point at a one-line commit, I've definitely had it hand us multiple days' worth of clues in one shot.

If I were maintaining that discipline just for bisect, we might quibble over the cost/benefit, but since there are a lot of other reasons to maintain that discipline anyhow, it's a big win.

  • Sometimes you'll find a repo where that isn't true. Fortunately, git bisect has a way to deal with failed builds, etc.: three-value logic. The test program that git bisect runs can return one exit value meaning the failure didn't happen, a different value meaning it did, or a third meaning the commit neither failed nor succeeded (e.g. it didn't build). I wrote up an example here:

    https://speechcode.com/blog/git-bisect
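
    A sketch of what such a test script can look like (the build and test commands here are placeholders); with `git bisect run`, exit code 0 means "good", 125 means "can't tell, skip this commit", and other codes from 1 to 127 mean "bad":

      #!/bin/sh
      # if the commit doesn't even build, we can't judge it either
      # way, so ask bisect to skip it
      make build || exit 125
      # otherwise the test's own exit code (0 = good, non-zero = bad)
      # is reported back to bisect
      ./reproduce_bug.sh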

  • I do bisecting almost as a last resort; I've only used it a few times, when all else failed, especially as I've never worked on code where it was easy to just build and deploy a working debug system from a random past commit.

    Edit to add: I will study old diffs when there is a bug, particularly for bugs that seem correlated with a new release. Asking "what has changed since this used to work?" often leads to an obvious cause or at least helps narrow where to look. Also, asking the person who made those changes for help looking at the bug can be useful, as the code may be fresher in their mind than in yours.

  • > why I maintain the discipline that all commits to the "real" branch, however you define that term, should all individually build and pass all (known-at-the-time) tests and generally be deployable in the sense that they would "work" to the best of your knowledge, even if you do not actually want to deploy that literal release

    You’re spot on.

    However, it’s clearly a missing feature that Git/Mercurial can’t tag diffs as “passes” or “bisectable”.

    This is especially annoying when you want to merge a stack of commits and the top passes all tests but the middle does not. It’s a monumental and valueless waste of time to fix the middle of the stack. But it’s required if you want to maintain bisectability.

    It’s very annoying and wasteful. :(

  • Same. Every branch apart from the “real” one and release snapshots is transient and WIP. They don’t get merged back unless tests pass.

Back in the 1990s, while debugging some network configuration issue, a wiser, older colleague taught me the more general concept that lies behind git bisect: "compare the broken system to a working system and systematically eliminate differences to find the fault." This can apply to things other than software or computer hardware. Back in the 90s my friend and I had identical jet-skis on a trailer we shared. When working on one of them, it was nice to have its twin right there to compare it to.

The principle here, bisection, is a lot more general than just "git bisect" for narrowing down ranges of commits. It can also be used to partition the space of possible causes. For instance, if a workflow with 10 steps is broken, can you perform some tests to confirm that 5 of the steps functioned correctly? Can you figure out that it's definitely not a hardware issue (or definitely a hardware issue) somewhere?

This is critical to apply in cases where the problem might not even be caused by a code commit in the repo you're bisecting!

Not to complain about bisect, which is great. But IMHO it's really important to distinguish the philosophy and mindspace aspect of this book (the "rules") from the practical advice (the "tools").

Someone who thinks about a problem via "which tool do I want?" (cf. "git bisect helps a lot"[1]) is going to be at a huge disadvantage to someone else coming at the same decisions via "didn't this used to work?"[2]

The world is filled to the brim with tools. Trying to file away all the tools in your head just leads to madness. Embrace philosophy first.

[1] Also things like "use a time travel debugger", "enable logging", etc...

[2] e.g. "This state is illegal, where did it go wrong?", "What command are we trying to process here?"

  • I've spent the past two decades working on a time travel debugger so obviously I'm massively biased, but IMO most programmers are not nearly as proficient with the available debug tooling as they should be. Consider how long it takes to pick up a tool well enough to have at least a vague understanding of what it can do, and compare that to how much time a programmer spends debugging. Too many just spend hour after hour hammering out printf's.

    • I find the tools annoyingly hard to use, particularly when a program uses a build system you aren't familiar with. I love time-travelling debuggers, but I've also lost hours to getting large Java or C++ programs into any working debugger, along with their debugging symbols (for C++).

      This is one area where I've been disappointed by Rust: they cleaned up testing and fetching dependencies by bringing them into the core tooling, but debugging is still a mess of several poorly supported cargo extensions, none of which seems to work consistently for me (no insult to their authors, who are providing something better than nothing!).

    • Different mindsets: I find many problems easier to reason about from traces.

      It's very rare for me to use debuggers.

You can also use divide and conquer when dealing with a complex system.

Like, traffic going from A to B can turn ... complicated with VPNs and such. You kinda have source firewalls, source routing, connectivity of the source to a router, routing on the router, firewalls on the router, various VPN configs that can go wrong, and all of that on the destination side as well. There can easily be 15+ things that can cause the traffic to disappear.

That's why our runbook recommends starting troubleshooting by dumping traffic on the VPN nodes. That's a very low-effort, quick step to figure out which of the six-ish legs of the journey drops the traffic - to the VPN node, through the VPN, to the destination, back to the VPN node, back through the VPN, back to the source. Then you realize that, say, traffic back to the VPN node disappears, and you can dig into that.
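
A rough sketch of that first step (interface names, addresses, and ports are placeholders for whatever your environment uses):

    # on the VPN node closest to the source: does the traffic arrive at all?
    tcpdump -ni eth0 host 10.0.1.5 and port 443
    # same node: does it leave through the tunnel interface?
    tcpdump -ni tun0 host 10.0.1.5 and port 443
    # repeat on the far-side VPN node and the destination until one leg goes quiet

Whichever leg the packets vanish on is the one worth digging into.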

And this is a powerful concept to think through in system troubleshooting: Can I understand my system as a number of connected tubes, so that I have a simple, low-effort way to pinpoint one tube to look further into?

As another example, for many services the answer here is to look at the requests on the load balancer. This quickly isolates which services are throwing errors or blowing up requests, so you can start looking at those. Or system metrics can help - which services/servers are burning CPU and thus doing something, and which aren't? Does that pattern make sense? Sometimes this can tell you which step in a pipeline of steps across different systems is failing.

git bisect is an absolute power feature everybody should be aware of. I use it maybe once or twice a year at most, but it's the difference between fixing a bug in an hour vs. spending days or weeks spinning your wheels.

Bisection is also useful when debugging CSS.

When you don't know what is breaking that specific scroll or layout somewhere in the page, you can just remove half the DOM in the dev tools and check if the problem is still there.

Rinse and repeat; it's a basic binary search.

I am often surprised that leetcode black belts are absolutely unable to apply what they learn in the real world, neither in code nor in debugging, which always reminds me what a useless metric it is for hiring engineers.

Binary search rules. Being systematic about dividing the problem in half, determining which half the issue is in, and then repeating applies to non-software problems quite well. I use the strategy all the time while troubleshooting issues with cars, etc.