Bug Free的二分查找

你能写出Bug Free的二分查找算法吗?

简介

在连续(且往往是有序的)的序列中搜索指定元素。一般流程是定义起始搜索区间,找到区间的中值,通过不断比较区间中值和目标元素,对半划分调整下一轮的搜索区间,直到找到目标元素或者区间不能被进一步划分。

尽管二分查找原理很容易理解,但需要考虑到许多corner cases,例如搜索区间的终止条件,区间中值的溢出问题等等,因此写出bug-free的二分查找有一定难度。此外,二分查找的思想在不同的问题情景下有相应变种的使用方法,需要具体问题具体分析。

在计算区间中值的时候,始终要考虑溢出的情况。此处的溢出指的是采用mid = (first + last) / 2运算。最常见处理溢出的手段就是直接将这个等式进行变形:

mid = (first + last) / 2
    = (2 * first + last - first) / 2
    = first + (last - first) / 2

此处的除号代表整除。在有的场景下,如果想让中值取值更偏左一些,可以考虑求下位中位数,即mid = first + (last - first - 1) / 2。不过在下文的模型介绍中,我们仍然统一采用标准的中位数取值。

模型1

介绍

最常使用的模型,条件清晰,不易出错。如果问题场景中可以访问数组中的每一个下标所对应的元素,并且侧重于判断“某个元素是否存在于数组中”,但对于具体存在的下标位置不那么关注,就可以使用它:

public int search(int[] nums, int target) {
    if (nums == null || nums.length == 0) {
        return -1;
    }

    int left = 0, right = nums.length - 1;
    while (left <= right) { 
        int mid = left + (right - left) / 2;

        if (nums[mid] == target) {
            return mid;
        } else if (nums[mid] < target) {
            left = mid + 1;
        } else if (nums[mid] > target) {
            right = mid - 1;
        }
    }

    return -1;
}

这个模型有三个特点:

  • target值和nums[mid]的比较分为了三种情况,严格一一对应了target小于nums[mid]target等于nums[mid]target大于nums[mid]三种区间处理方式,最符合人的直觉。换言之,这个模型采用了 “左闭右闭”的原则,mid的两侧的区间分别是[left, mid - 1][mid + 1, right]
  • 区间的左、右边界是允许重合的,即有可能出现left <= right的情况,此时搜索区间中只有一个数
  • 循环结束时,一定有left > right,这意味着一个无效的搜索区间。如果在此之前没有找到target值,那么可以说明target一定不在数组中

sqrt(x)

Given a non-negative integer x, compute and return the square root of x.

Since the return type is an integer, the decimal digits are truncated, and only the integer part of the result is returned.

Example 1:
Input: x = 4
Output: 2


Example 2:
Input: x = 8
Output: 2
Explanation: The square root of 8 is 2.82842..., and since the decimal part is truncated, 2 is returned.

相当于给定0到x的数组,找到sqrt(x)或者是小于sqrt(x)的最大整数。直接套用模型,如果sqrt(x)存在的话,将其找出;如果不存在的话,由于最后left > right,因此right是最接近且小于sqrt(x)的整数。

public int mySqrt(int x) {
    long left = 0, right = x, target = x;
    while (left <= right) {
        long mid = left + (right - left) / 2;
        if (mid * mid == target) {
            return (int)mid;
        } else if (mid * mid < target) {
            left = mid + 1;
        } else {
            right = mid - 1;
        }
    }
    return (int)right;
}

Guess Number Higher or Lower

We are playing the Guess Game. The game is as follows: I pick a number from 1 to n. You have to guess which number I picked. Every time you guess wrong, I will tell you whether the number I picked is higher or lower than your guess. You call a pre-defined API int guess(int num), which returns 3 possible results:

  • -1: The number I picked is lower than your guess (i.e. pick < num).
  • 1: The number I picked is higher than your guess (i.e. pick > num).
  • 0: The number I picked is equal to your guess (i.e. pick == num).

Return the number that I picked.

Example 1:
Input: n = 10, pick = 6
Output: 6


Example 2:
Input: n = 1, pick = 1
Output: 1


Example 3:
Input: n = 2, pick = 1
Output: 1


Example 4:
Input: n = 2, pick = 2
Output: 2

题干关于guess API的描述,几乎直接对应了模型的三种情形。需要注意,此题的设定使得一定能找到pick的数,换言之代码一定能在while循环中返回到最终结果,所以while循环之后的后续处理中(找不到target数的情况),随便返回任意数都无所谓。

public class Solution extends GuessGame {
    public int guessNumber(int n) {
        int left = 1, right = n;
        while (left <= right) {
            int mid = left + (right - left) / 2;
            if (guess(mid) == 0) {
                return mid;
            } else if (guess(mid) == 1) {
                left = mid + 1;
            } else {
                right = mid - 1;
            }
        }
        
        return -1;
    }
}

Search in Rotated Sorted Array

There is an integer array nums sorted in ascending order (with distinct values).

Prior to being passed to your function, nums is rotated at an unknown pivot index k (0 <= k < nums.length) such that the resulting array is [nums[k], nums[k+1], ..., nums[n-1], nums[0], nums[1], ..., nums[k-1]] (0-indexed). For example, [0,1,2,4,5,6,7] might be rotated at pivot index 3 and become [4,5,6,7,0,1,2].

Given the array nums after the rotation and an integer target, return the index of target if it is in nums, or -1 if it is not in nums.

Example 1:
Input: nums = [4,5,6,7,0,1,2], target = 0
Output: 4


Example 2:
Input: nums = [4,5,6,7,0,1,2], target = 3
Output: -1


Example 3:
Input: nums = [1], target = 0
Output: -1

这一题需要将数组对半划分来看,其中一半一定是有序的,而另一半则在中间某处破坏了升序。因此在使用模型1时,我们无法简单通过比较nums[mid]target确定下一轮搜索区间,而是应该比较nums[left]nums[mid]nums[right]三者的值,先判断哪一半数组是有序的;之后将target放进有序的这一侧进行比较,如果target的值介于其中则把区间定为有序的这一侧,反之定为另一侧。

class Solution {
    public int search(int[] nums, int target) {
        if (nums == null || nums.length == 0) {
            return -1;
        }

        int left = 0, right = nums.length - 1;
        while (left <= right) {
            int mid = left + (right - left) / 2;

            // nums[mid] matches with target, return
            if (nums[mid] == target) {
                return mid;
            }

            // left half is ordered
            if (nums[left] <= nums[mid]) {
                // target is in left half
                if (target >= nums[left] && target <= nums[mid]) {
                    right = mid - 1;
                } else {
                    left = mid + 1;
                }
            }

            // right half is ordered
            if (nums[mid] <= nums[right]) {
                // target is in right half
                if (target >= nums[mid] && target <= nums[right]) {
                    left = mid + 1;
                } else {
                    right = mid - 1;
                }
            }
        }
        
        return -1;
    }
}

模型2

介绍

模型1的写法在某些场景下有缺陷。例如给定nums = [1,2,2,2,3]target = 2,通过模型1得出的target下标就是2。然而在一些场合下,我们更关心的是targetnums中“第一次”,或者“从左到右最初”出现的下标位置,那么下标应该返回1才对。因此,我们需要对模型1进行修改,不仅能适应普通的target查找问题,还能适应这种边界场景下的二分查找:

public int search(int[] nums, int target) {
    if (nums == null || nums.length == 0) {
        return -1;
    }

    int left = 0, right = nums.length;
    while (left < right) { 
        int mid = left + (right - left) / 2;
        if (nums[mid] < target) {
            left = mid + 1;
        } else if (nums[mid] >= target) {
            right = mid;
        }
    }
	
    // return left;
    
    if (left != nums.length && nums[left] == target) {
        return left;
    }
    return -1;
}

这个模型有几个特点:

  • target值和nums[mid]的比较分为了两种情况,其中left = mid + 1的情况与模型1相同;但right = mid与此前不同,其原因是搜索的策略发生了变化,我们不止希望“找到target”,更希望在找到target之后“尽可能扩张区间”,即我们假定target可能多次出现,所以不立即返回当前下标,而是缩小搜索区间的上界,向左扩张
  • 搜索区间的原则使用“左闭右开”,即[left, right)。这是为了更好契合“边界上第一次出现”的概念,顺便也让代码中可以更少出现+1或-1的边界值检查。根据左闭右开,mid两侧的区间应该写成[left, mid)[mid+1, right);同理初始的right值也应该是nums.length而非nums.length - 1
  • 循环结束时,一定有left == right。这也是因为采取了左闭右开,因为当left == right时,搜索区间变成了[left, left),区间的实际长度变成0,搜索也就终止了
  • 如果需要找出target第一次出现的位置,直接返回left即可(right也行,反正此时二者相等);而如果需要判断target是否在nums中出现,如果在此之前没有找到target值,则需要额外判断最后这个nums[left]是否与target值相等

类似地,如果要进行右边界场景下的二分查找,修改代码如下:

public int search(int[] nums, int target) {
    if (nums == null || nums.length == 0) {
        return -1;
    }

    int left = 0, right = nums.length;
    while (left < right) { 
        int mid = left + (right - left) / 2;
        if (nums[mid] <= target) {
            left = mid + 1;
        } else if (nums[mid] > target) {
            right = mid;
        }
    }
	
    // return left - 1;
    
    if (left != 0 && nums[left - 1] == target) {
        return left - 1;
    }
    return -1;
}

和左边相比,主要的两处差异在于:

  • 找到target时,不急于返回,而是向右扩大搜索区间,因此当nums[mid] == target时,要增大搜索区间的下界,向右扩张
  • 最后返回的下标是left - 1right - 1也行,反正此时二者相等),这是因为我们的区间更新策略是让left = mid + 1,最后循环结束时nums[left]一定已经不等于target了,需要倒退一位

First Bad Version

You are a product manager and currently leading a team to develop a new product. Unfortunately, the latest version of your product fails the quality check. Since each version is developed based on the previous version, all the versions after a bad version are also bad.

Suppose you have n versions [1, 2, ..., n] and you want to find out the first bad one, which causes all the following ones to be bad.

You are given an API bool isBadVersion(version) which returns whether version is bad. Implement a function to find the first bad version. You should minimize the number of calls to the API.

Example 1:
Input: n = 5, bad = 4
Output: 4
Explanation:
call isBadVersion(3) -> false
call isBadVersion(5) -> true
call isBadVersion(4) -> true
Then 4 is the first bad version.


Example 2:
Input: n = 1, bad = 1
Output: 1

问题需要找出最小的bad version,正好对应左边界场景下的二分查找问题。如果某个版本号不是bad version,就向上搜索版本号;反之向下搜索,看是否还有更小的版本号也是bad version。

/* The isBadVersion API is defined in the parent class VersionControl.
      boolean isBadVersion(int version); */

public class Solution extends VersionControl {
    public int firstBadVersion(int n) {
        int left = 1, right = n;
        
        while (left < right) {
            int mid = left + (right - left) / 2;
            if (isBadVersion(mid) == false) {
                left = mid + 1;
            } else {
                right = mid;
            }
        }
        
        return left;
    }
}

Find First and Last Position of Element in Sorted Array

Given an array of integers nums sorted in ascending order, find the starting and ending position of a given target value. If target is not found in the array, return [-1, -1].

Follow up: Could you write an algorithm with O(log n) runtime complexity?

Example 1:
Input: nums = [5,7,7,8,8,10], target = 8
Output: [3,4]


Example 2:
Input: nums = [5,7,7,8,8,10], target = 6
Output: [-1,-1]


Example 3:
Input: nums = [], target = 0
Output: [-1,-1]

分别进行左边界搜索和右边界搜索即可。

class Solution {
    public int[] searchRange(int[] nums, int target) {
        int[] result = new int[] {-1, -1};
        if (nums == null || nums.length == 0) {
            return result;
        }

        // left
        int left = 0, right = nums.length;
        while (left < right) { 
            int mid = left + (right - left) / 2;
            if (nums[mid] < target) {
                left = mid + 1;
            } else if (nums[mid] >= target) {
                right = mid;
            }
        }
        
        if (left != nums.length && nums[left] == target) {
            result[0] = left;
        }

        // right
        left = 0;
        right = nums.length;
        while (left < right) { 
            int mid = left + (right - left) / 2;
            if (nums[mid] <= target) {
                left = mid + 1;
            } else if (nums[mid] > target) {
                right = mid;
            }
        }
        
        if (left != 0 && nums[left - 1] == target) {
            result[1] = left - 1;
        }

        return result;
    }
}

Find Minimum in Rotated Sorted Array

Suppose an array of length n sorted in ascending order is rotated between 1 and n times. For example, the array nums = [0,1,2,4,5,6,7] might become:

  • [4,5,6,7,0,1,2] if it was rotated 4 times.
  • [0,1,2,4,5,6,7] if it was rotated 7 times.

Notice that rotating an array [a[0], a[1], a[2], ..., a[n-1]] 1 time results in the array [a[n-1], a[0], a[1], a[2], ..., a[n-2]]. Given the sorted rotated array nums of unique elements, return the minimum element of this array.

Example 1:
Input: nums = [3,4,5,1,2]
Output: 1
Explanation: The original array was [1,2,3,4,5] rotated 3 times.


Example 2:
Input: nums = [4,5,6,7,0,1,2]
Output: 0
Explanation: The original array was [0,1,2,4,5,6,7] and it was rotated 4 times.


Example 3:
Input: nums = [11,13,15,17]
Output: 11
Explanation: The original array was [11,13,15,17] and it was rotated 4 times.

这一题借鉴了此前Search in Rotated Sorted Array的数组对半划分思想,通过比较nums[left]nums[mid]nums[right]确定下一轮的搜索区间。如果nums[mid]不小于边界两个值,说明[left, mid]是一段递增序列,最小值在[mid + 1, right]中;反之,最小值在[left, mid]中。因为我们希望找到数组最小值,根据左边界查找思想,在确定最小值所在范围后我们仍然希望尽量向左扩展范围,试图能在范围中找到更小的值符合要求。代码主干可以写成:

public int findMin(int[] nums) {
    if (nums == null || nums.length == 0) {
        return 0;
    }
    if (nums.length == 1) {
        return nums[0];
    }

    int left = 0, right = nums.length - 1;
    while (left < right) {
        int mid = left + (right - left) / 2;

        // if (mid > 0 && nums[mid] < nums[mid-1]) {
        //     return nums[mid];
        // }

        if (nums[left] <= nums[mid] && nums[right] < nums[mid]) {
            left = mid + 1;
        } else {
            right = mid;
        }
    }

    return nums[left];
}

注意代码注释的部分实际上与二分查找算法本身无关,只是用于辅助加速的手段,快速检查mid值是否就是最小值。