
Comment by iamcreasy

7 hours ago

> At that point, I reached for an age-old tool that has gotten more useful in the modern age: binary search. That is, you explain the symptom to your coding agent. Then you have it repeatedly remove stuff from your code that might be causing the problem

Can someone give me some high-level pointers on how to set up this scaffolding?

When I read the first sentence, I expected the author to use `git bisect`.

However, what the author seems to have done is use a prompt with Claude that probably looked something like this:

"Some piece of code is causing the page to load very slowly. To debug this, I'd like to use binary search, where we keep commenting/uncommenting 50% of the remaining code, and then I manually check if the page is still very slow. Let's start now; Comment out a component (or parts of a component) that you estimate is 50% of the page, and I will tell you if the page is still slow."
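The loop that prompt describes is an ordinary binary search over code regions rather than commits. A minimal sketch (the component names and the `is_slow` check are hypothetical; in practice "is_slow" is you reloading the page after the agent comments half the code out):

```python
def find_culprit(suspects, is_slow):
    """Binary-search for the single component causing a slow page.

    `is_slow(enabled)` stands in for "re-run the page with only the
    `enabled` components active" and reports whether the symptom persists.
    """
    candidates = list(suspects)
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        if is_slow(half):  # symptom persists with only this half enabled
            candidates = half
        else:              # culprit must be in the other half
            candidates = candidates[len(candidates) // 2 :]
    return candidates[0]

# Hypothetical component list; "Comments" plays the culprit here.
components = ["Header", "Sidebar", "Comments", "Footer"]
print(find_culprit(components, lambda enabled: "Comments" in enabled))
```

Each round halves the search space, so even a page with dozens of components converges in a handful of reload-and-check iterations.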

  • `git bisect` is an interesting option. I hadn't heard of it before. Thanks for the info. Still learning something ;)

    I'm old school. I used to do "manual bisection" on git history by just running `git checkout <commit_id>` until I found the first commit that introduced the bug.

    Then another "bisection" on that commit's changes until the minimal change was found.

    Deterministic bugs are quite "fine". For me personally, the worst are bugs that occur randomly under specific conditions, e.g. some race conditions.
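For anyone wanting to automate the manual bisection described above: `git bisect run <script>` will drive the search itself, using the script's exit code to classify each commit (0 = good, 1 = bad, 125 = untestable, skip). A hedged sketch of such a script, assuming a hypothetical `make build` step and `./load_page.sh` load test exist in your repo:

```python
# check.py -- a hypothetical script for `git bisect run python check.py`
import subprocess
import sys
import time

SLOW_THRESHOLD = 2.0  # seconds; tune this for your app

def bisect_exit_code(build_ok: bool, load_seconds: float) -> int:
    """Map one test run onto git-bisect exit codes:
    0 = good commit, 1 = bad commit, 125 = untestable (skip)."""
    if not build_ok:
        return 125  # broken build: skip, don't blame this commit
    return 0 if load_seconds < SLOW_THRESHOLD else 1

if __name__ == "__main__":
    # Assumption: these commands exist in your project; adjust as needed.
    build_ok = subprocess.run(["make", "build"]).returncode == 0
    start = time.monotonic()
    subprocess.run(["./load_page.sh"])
    sys.exit(bisect_exit_code(build_ok, time.monotonic() - start))
```

You'd kick it off with `git bisect start`, mark one bad and one good commit, then hand it the script; git checks out the midpoint commit each round, just like the manual `git checkout` loop, but without the typing.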

I do this all the time in a dumb but effective way: add logging statements to code paths that drop timing info. Another dumb but effective approach, instead of using a step-through debugger, is to drop `"here, value is {val}"` statements. Telling Claude to do this is trivial, it's quick, and it can read its own output and self-solve the problem with just the code itself.
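The timing-log version of this can be a few lines. A minimal sketch (the `load_comments` label and the `time.sleep` stand-in are hypothetical; wrap your actual suspect code paths):

```python
import logging
import time
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("timing")

@contextmanager
def timed(label):
    """Log how long the wrapped block took -- the 'dumb but effective'
    way to find the slow path without a step-through debugger."""
    start = time.perf_counter()
    try:
        yield
    finally:
        log.info("%s took %.3f s", label, time.perf_counter() - start)

# Hypothetical usage: wrap each suspect code path.
with timed("load_comments"):
    time.sleep(0.05)  # stand-in for the real work
```

The agent can then grep its own log output for the biggest number and narrow in from there.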

IMHO git bisect is slower, especially depending on the reload/hot-reload/compile/whatever process your actual app is using.

How well agents can do this is mostly proportional to how well they can understand and navigate your codebase broadly.

There are various contributing factors, but they include clear docs, notes, and refactors that clear up the parts the agent commonly gets confused by; choosing boring technology (so your dependencies are well understood); and access to command-line tools that let it lint, typecheck, and test the code. A lot of the scaffolding and wiring necessary is built into Cursor and Claude Code themselves now. Hope that helps!