Posted by _spaceatom 11/2/2025
I've debugged things like this: a bug was introduced, but with no manifestation. Eventually, many commits later, some unrelated change triggers it. But this comes and goes. Some changes make the manifestation go away, and some changes make it reappear.
Git bisect is predicated on the bug not existing at the "good" end point and making a single appearance between that and the "bad" end of the range. It has allowance for commits not being testable; you can skip those. If the bad commit is one of the skipped ones, I think it tells you.
Try not to have any other kind of bug. :)
Well, the git bisect example tells you that you should know the concept of binary search, but it's not a good argument for having to learn how to implement one.
Also, just about any language worth using has binary search in the standard library (or as a third-party library) these days. That's saner than writing your own, because getting all the corner cases right is tricky, as is writing tests that keep them right even as people make small changes to the code over time.
The most problematic line that most people seem to miss is in the calculation of the midpoint index. Using `mid = (low + high) / 2` has an overflow bug if you're not using infinite precision, but there are several other potential problems even in the simplest algorithm.
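As a sketch, here is the classic search written with the overflow-safe midpoint. Python's integers are arbitrary-precision, so the overflow can't actually bite here; the `low + (high - low) // 2` form is the pattern that matters in fixed-width languages like C or Java:

```python
def binary_search(arr, target):
    """Return an index of target in sorted arr, or -1 if absent."""
    low, high = 0, len(arr) - 1
    while low <= high:
        # Equivalent to (low + high) // 2, but in a fixed-width integer
        # language this form cannot overflow, since high - low fits.
        mid = low + (high - low) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1
```

Other classic corner cases live in the loop condition (`<=` vs `<`) and the `mid + 1` / `mid - 1` updates, which is exactly why off-the-shelf implementations are preferable.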
To be fair, in most computing environments, either indices don't overflow (Smalltalk, most Lisps) or arrays can never be big enough for the addition of two valid array indices to overflow, unless they are arrays of characters, which it would be sort of stupid to binary search. It only became a significant problem with LP64 and 64-bit Java.
Your comment is mostly true when you're doing binary search on something like an array, yes.
But you can also do binary search over any monotonically increasing function.
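A minimal sketch of that generalization: find the smallest non-negative integer x with f(x) >= target, for a monotone f (the function name and bounds here are just for illustration):

```python
def lower_bound(f, target, low, high):
    """Smallest x in [low, high) with f(x) >= target, assuming f is
    monotonically non-decreasing. Returns high if no such x exists."""
    while low < high:
        mid = low + (high - low) // 2
        if f(mid) >= target:
            high = mid    # answer is mid or earlier
        else:
            low = mid + 1  # answer is strictly after mid
    return low

# e.g. integer square root: the smallest x with x*x >= 10 is 4
```

No array is materialized; the "indices" can range over anything you can evaluate f at, which is also how bisecting a commit range works conceptually.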
> Using `mid = (low + high) / 2` has an overflow bug if you're not using infinite precision, but there are several other potential problems even in the simplest algorithm.
Well, if you are doing binary search on eg items you actually hold in memory (or even disk) storage somewhere, like items in a sorted array (or git commits), then these days with 64 bit integers the overflow isn't a problem: there's just not enough storage to get anywhere close to overflow territory.
A back of the envelope calculation estimates that we as humanity have produced enough memory and disk storage in total that we'd need around 75 bits to address each byte independently. But for a single calculation on a single computer 63 bits are probably enough for the foreseeable future. (I didn't go all the way to 64 bits, because you need a bit of headroom, so you don't run into the overflow issues.)
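As a quick check of that arithmetic (the total-storage figure is a rough assumption, not a sourced number):

```python
import math

# Assume on the order of a few times 10**22 bytes of storage
# ever manufactured -- a rough back-of-the-envelope estimate.
total_bytes = 3e22
bits_needed = math.ceil(math.log2(total_bytes))  # bits to address each byte
```

With that assumption, `bits_needed` comes out to 75, so 64-bit indices are comfortably beyond any single machine's reach.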
Property based testing is really useful for finding corner cases in your binary search. See eg https://fsharpforfunandprofit.com/series/property-based-test... for one introduction.
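As a sketch of the idea in Python, here is a hand-rolled property check against a naive oracle, using only the standard library (dedicated tools like Hypothesis additionally generate, shrink, and replay failing cases for you; the `binary_search` under test is a stand-in built on `bisect`):

```python
import random
from bisect import bisect_left

def binary_search(arr, target):
    # Stand-in implementation under test: index of target, or -1.
    i = bisect_left(arr, target)
    return i if i < len(arr) and arr[i] == target else -1

def check_property(trials=1000):
    for _ in range(trials):
        xs = sorted(random.choices(range(-50, 50), k=random.randrange(0, 20)))
        target = random.randrange(-60, 60)
        i = binary_search(xs, target)
        # Property: the result agrees with a naive linear search.
        assert (i == -1 and target not in xs) or xs[i] == target

check_property()
```

Random small lists with duplicates and out-of-range targets exercise exactly the corner cases (empty input, first/last element, absent element) that hand-written example tests tend to miss.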
You may have to further bisect the space to find the culprit.
The following situation can occur in a compiler: something is changed (e.g. in the optimizer), and the build breaks as a result, but you have no idea where. Obviously something is wrong with the optimization change, but it's not obvious from looking at the code. The compiler is self-hosting, so it compiles itself as well as the language's library, and then tests are run with everything. The optimization change miscompiled something in the compiler. That something then broke the compiler in a way that miscompiled something in a run-time function, which then broke something else.
To find the first miscompile, you can do the following binary search.
Have the compiler hash the names of all top-level definitions that it is compiling across the system, say to a 64 bit value.
Then, in the code where the change was introduced, put an if/else switch: if the low N bits of the hash code are equal to a certain value, optimize with the new logic; otherwise use the old logic.
We start with N=1 (1 bit) and the value 0. All definitions whose hash code ends in a 0 bit are subject to the broken new logic; those with 1 are subject to the old logic.
Say the bug reproduces. Then we keep the 0 and raise N to 2 for a two-bit mask. We try 00: all hashes ending in 00 use the new optimization; those ending in 10 use the old. This time the problem doesn't reproduce, so we settle on 10 (and can validate that the problem now reproduces). We raise N to 3 and follow the same procedure.
By doing this we reveal enough bits of the hash code to narrow it down to one definition (e.g. function): when that one specific function is subject to the compiler change, all hell breaks loose, otherwise not.
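The dispatch described above can be sketched like this (the names and hash choice are illustrative; in a real compiler the guard would sit wherever the new optimization is applied):

```python
import hashlib

def name_hash(name: str) -> int:
    """Stable 64-bit hash of a top-level definition's name."""
    return int.from_bytes(hashlib.sha256(name.encode()).digest()[:8], "big")

# Bisection state: widen MASK_BITS by one each round, keeping VALUE
# set to whichever half still reproduces the bug.
MASK_BITS = 2
VALUE = 0b10

def use_new_optimization(defn_name: str) -> bool:
    mask = (1 << MASK_BITS) - 1
    return (name_hash(defn_name) & mask) == VALUE
```

Each round halves the set of definitions compiled with the new logic, so the number of rounds is only about log2 of the number of top-level definitions in the system.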
From there we can analyze it: see how that definition is compiled before and after the mod, and what makes it break. From there, it is still hard work, but much easier to deduce why the optimization change is wrong and possibly how to fix it.
I've successfully employed this technique several times on the TXR Lisp project to debug compiler issues.
I also think that, typically, if you have to resort to bisect, you are probably in a bad place. You should have found the bug earlier, so if you don't even know where the bug came from:
- your test coverage isn't sufficient
- your tests are probably not actually testing what you believe they do
- your architecture is too complex for you
To be clear though I do include myself in this abstract "you".
But few of us have the privilege of working on such codebases, and with people who have that kind of discipline and quality standards.
In reality, most codebases have statement coverage that rarely exceeds 50%, if coverage is tracked at all; tests are brittle, flaky, difficult to maintain, and likely have bugs themselves; and architecture is an afterthought for a system that grew organically under deadline pressure, where refactors are seen as a waste of time.
So given that, bisect can be very useful. Yet in practice it likely won't be, since usually the same teams that would benefit from it don't have the discipline to maintain a clean history with atomic commits, which is crucial for bisect to work. If the result is a 2000-line commit, you still have to dig through the code to find the root cause.
It’s still very satisfying to watch run though, especially if you write a script that it can run automatically (based on the existing code) to determine if it’s a good or bad commit.
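Such a script only needs to honor the exit-code contract of `git bisect run` (0 = good, 1-124 = bad, 125 = skip this commit). A sketch of that mapping in Python; the build and test commands it would wrap are project-specific placeholders:

```python
def classify(build_ok: bool, tests_ok: bool) -> int:
    """Map a commit's build/test outcome to the exit codes that
    `git bisect run` understands:
      0   -> this commit is good
      1   -> this commit is bad
      125 -> this commit can't be tested, skip it
    """
    if not build_ok:
        return 125  # e.g. the commit doesn't even compile
    return 0 if tests_ok else 1

# In the real script (check.py, a hypothetical name) you'd do roughly:
#   import subprocess, sys
#   build_ok = subprocess.run(["make"]).returncode == 0
#   tests_ok = build_ok and subprocess.run(["./run_tests.sh"]).returncode == 0
#   sys.exit(classify(build_ok, tests_ok))
# and drive it with:
#   git bisect start <bad> <good> && git bisect run python check.py
```

Returning 125 for unbuildable commits is what lets bisect skip them instead of mislabeling them as bad.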
1. Start with working code
2. Introduce bug
3. Identify bug
4. Write a regression test
5. Bisect with new test
In many cases you can skip the bisect because the description of the bug makes it clear where the issue is, but not always.
Tests can't prove the absence of bugs, only their presence. Bugs or regressions will slip through.
Bisect is for when that happens and the cause is not obvious.
You can go back and efficiently run a new test across old commits.
Yet, once you've identified a hole, you can write a script to test for it, run `git bisect` to identify which commit introduced the hole, and then triage the possible fallout.
To sum up: bisect and tests are not on opposing sides; they complement each other.
"What if the unit tests have bugs?"
I get that author learned a new-to-him technique and is excited to share with the world. But to this dev, with a rapidly greying beard, the article has the vibe of "Hey bro! You're not gonna believe this. But I just learned the Pope is catholic."
Binary search is one of the first things you learn in algorithms, and in a well-managed branch the commit history is already a sorted straight line, so it's just obvious as hell, whether you use your VCS to run the bisect or do it by hand yourself.
"Hey guys, check it out! Water is wet!"
I mean, do you really not know this XKCD?