2003/01/28

Binary Search Debugging

Something we had done since the last release of CityDesk somehow caused our publish times to increase by about 100%; on a particular large site we use for stress testing it had gone from about a minute to about two minutes.

The first thing I tried was a profiler: Compuware DevPartner Studio. Indeed this showed me where a lot of bottlenecks are; that data will be useful to speed up our publish times even more, but I really wanted to find the specific bug that I thought we had introduced which was slowing us down.

The next thing I tried was a method I learned from Gabi at Juno: the old binary search method. Before we started work on this release, publishing took 1’04”. Today it takes 1’57”. So I started checking out old versions of the source from CVS by date, rebuilding, and timing how long publishing took with each day’s build. Here’s what I found:

As of May 1: 1’57”
As of April 1: 1’05”
As of April 15: 1’05”
As of April 22: 1’06”
As of April 26: 1’58”
As of April 24: 1’05”
As of April 25: 1’05”

Aha! Now all I had to do was run WinDiff to compare the source tree from April 25th and April 26th, and I discovered four things that were changed that day, one of which was a function that DevPartner had told me was kind of slow, anyway. Within minutes I found the culprit — that function was originally written to cache its results because it’s often called with the same inputs, and I had inadvertently changed the cache key in one place and not another, so we were getting 100% misses instead of 99% hits. Solved! Total elapsed time to find this bug: about an hour. If your source code is much bigger than CityDesk, builds and checkouts may be slow. This is as good a reason as any to keep all your old daily builds around.

About the author.

I’m Joel Spolsky, co-founder of Trello and Fog Creek Software, and CEO of Stack Overflow. More about me.