Avoiding common binary search mistakes

Aug 5

As an interview coach, I see a lot of people really struggle with binary search. Most SWEs struggle to implement it in a bug-free way and it's usually because they don’t walk through a robust set of test cases.

Here is a little framework on how I think about it and go about testing binary search.

Missing a Test Case

It is so easy to mess up binary search by missing a test case. This usually leads to an out-of-bounds error. This is the most common mistake in binary search questions, so take your time and test that the code works. The most common test cases that get missed are:

Empty Array: An empty array is a basic edge case. Your function should handle this gracefully and return an appropriate value, such as -1 for a function expected to return an index, or False for a function determining the presence of an element.
Single Element Array: This is the smallest non-empty array. Make sure your function can handle this, whether the single element matches the target or not.
Array with Two Elements: This scenario tests if your function correctly moves the left or right pointer when there are just two elements. Both cases, where the target is and is not in the array, should be considered.
Array with Duplicates: Duplicates can complicate a binary search, especially if you are supposed to find the first or last occurrence of the target. Test your function with an array that has duplicate values.
Target is the First or Last Element: This tests whether your function correctly handles cases where the target is at the boundaries of the array.
Target is the Middle Element: This tests where you calculate the middle element in your algorithm to make sure you don't skip it on the first iteration.
Target is Not in the Array: Check how your function behaves when the target is not in the array. This could include targets that are smaller than all array elements, larger than all array elements, or fall between array elements.
Array with Negative Numbers: If the problem statement doesn't restrict the array values to be positive, test your function with negative numbers to ensure it handles them correctly.
Array with All Identical Elements: This tests whether your function can correctly report the position of an element when all elements in the array are identical, and also if it correctly reports that a target is not found when the target differs from the repeated element.
Array is Even/Odd: This ensures you land on the target and correctly calculate the midpoint regardless of even or odd array sizes.
Large Arrays: Test your function with a large array to ensure that it performs well and does not lead to any issues, such as stack overflow for recursive implementations.

Holy smokes, that's a lot of tests! At interviewing.io we have found that even after interviewees know they might be missing a test case, they aren't sure which one it is and definitely don't have time to go down such an exhaustive checklist like this, so we've come up with an easy to remember acronym that will help you cover your bases. Just ask yourself, "Is my solution FINDABLE?" This reminds you to check these key edge cases to ensure your binary search solution is robust.

First/Final: The target is the first element or last (final) element in the array.
Identical: The array contains multiple identical targets.
Non-existent: The target is not in the array.
Delete: The array contains only duplicate elements.
Alone: The array has a single element.
Binary: The array has two elements.
Large: The array is quite large.
Empty: The array is empty.

I don't explicitly call out every test case in this acronym, but when it is combined with the testing process shown below you end up covering tests that are not explicitly mentioned (i.e. we don't have a letter in the acronym that calls out testing even/odd elements or that we successfully return the middle element on the first iteration, but in following the below process we do end up testing these cases in tandem with other test cases).

At a glance this may seem like a lot of tests to run through, but we can make the process very fast by building up our tests in a specific order. Some tests may also not apply in a problem so they are also easy to skip over entirely. The ideal order to check your solution would be to build up from an empty array and work towards a large (well, large for a test by-hand case) number of four total elements like below:

*An optimal testing order for binary search question "integration" tests*

This order allows us to double up testing multiple things at the same time. We can test two elements in the array while testing the final element boundary. We test identical elements (duplicates) while testing for the target being in the middle and testing a non-trivial odd number of elements. And we can test a large four-element array while testing an even number of elements.

The first four test cases should be able to be breezed through in 30 seconds or less with the remaining four taking slightly longer than that. While this isn't an exhaustive set of test cases, they are the ones most likely to catch bugs, so you can spend less than 2 minutes on this process in total and still feel very confident that your answer is correct.

Forgetting Sorting Adds to the Time Complexity

Most candidates are familiar with the fact that standard comparison-based sorting takes O(N log N) time complexity, but forget to account for it in their solution. In the Three Sum question above, sorting the array takes just one line in the solution and candidates end up missing it in their overall time complexity analysis because it is so easy to skip over.

Logically Deduce Binary Search with Big-O

An advanced strategy for discovering the optimal solution to a problem is to discuss the time complexity with the interviewer. If you suggest a linear solution to a problem and try to confirm if it is optimal with the interviewer, you might get pushback on it or be told that "it's decent for now, maybe we can optimize later." This is actually a large hint towards binary search.

Walking through the time complexities, if we are told we can optimize a linear time algorithm, this points directly to a logarithmic algorithm and binary search is the most common algorithm with this time complexity. Being able to logically deduce binary search given the constraints of the problem shows mastery over the topic.

Discuss Edge Cases Proactively

Above we outline many edge cases for binary search. While you could go through them all in your head, you demonstrate mastery by communicating these edge cases. Checking these test cases quietly will bore your interviewer, but explaining the many test cases you are doing simultaneously shows mastery in the interview.

Michael Mroczka

Avoiding common binary search mistakes

Missing a Test Case

Forgetting Sorting Adds to the Time Complexity

Logically Deduce Binary Search with Big-O

Discuss Edge Cases Proactively

A quick thought on testing your code in interviews

Acing technical interviews - part two

Michael Mroczka

mike<at>mroczkas.com