数据结构和算法 -- 搜索

两种搜索方式,对 unordered array 用 linear search,对 ordered array 用 binary search。

二分查找 (Binary search)

概念

对于已排序的有序线性容器而言(比如数组,vector),二分查找(Binary search)几乎总是最优的搜索方案。二分查找将容器等分为两部分,再根据中间节点与待搜索数据的相对大小关系,进一步搜索其中某一部分。二分查找的算法复杂度为O(logn)。
对于局部有序的数据,也可以根据其局部有序的特性,尽可能地利用逼近、剪枝,使用二分查找的变种进行搜索。

二分寻找要注意的问题是:

  • Which way should middle pointer go next
  • Avoid infinite loop in the code

算法

Compare the number in the middle of the array with x. If it is equal, we are done. If the number is greater, we know to look in the second half of the array. If it is smaller, we know to look in the first half. We can repeat the search on the appropriate half of the array by comparing the middle element of that array with x, once again narrowing our search by a factor of 2. We repeat this process until we find x. This algorithm takes O(log n) time.

Complexity

Time complexity: O(logN)
Worst case: O(logN+1) -> O(logN)

模板

Recursive

1
2
3
4
5
6
7
8
9
10
def binarySearch(data, start, end, key):
if start > end:
return -1
mid = start + (end - start) / 2
if data[mid] == key:
return mid
if data[mid] < key:
return binarySearch(data, mid + 1, end, key)
else:
return binarySearch(data, start, mid - 1, key)

Non-recursive

1
2
3
4
5
6
7
8
9
10
11
12
13
def binarySearch(data, key):
start = 0
end = len(data) - 1
while True:
if start > end:
return -1
mid = start + (end - start) / 2
if data[mid] == key:
return mid
if data[mid] < key:
start = mid + 1
else:
end = mid - 1

Find closest value

1
2
3
4
5
6
7
8
9
def binarySearch(house, heaters):
left, right = 0, len(heaters)
while left < right:
mid = left + (right - left) / 2
if heaters[mid] < house:
left = mid + 1
else:
right = mid
return left

例题

69. Sqrt(x)

Problem

Implement int sqrt(int x).
Compute and return the square root of x.

Solution

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
'''
Search n from 0 to x, every time increment 1, maintain a global varible pre to record the last n whose square is smaller than x, for each n, check if n*n>x, if so, return pre. Takes O(n) time.
Followup:
Use binary search to find correct n. Takes O(logn) time.
About overflow:
use mid=start+(end-start)/2
mid * mid will overflow when mid > sqrt(INT_MAX)
'''
class Solution(object):
'''
O(n)
def mySqrt(self, x):
"""
:type x: int
:rtype: int
"""
if x<2: return x
pre=0
for n in range(2,x):
if n*n==x: return n
if n*n<x: pre=n
else: return pre
'''
'''
binary search O(logn)
'''
def mySqrt(self, x):
"""
:type x: int
:rtype: int
"""
if not x or x < 2:
return x
start = 0
end = x
while start <= end:
mid = start + (end - start) / 2
if mid <= x / mid and x / (mid + 1) < mid + 1:
return mid
elif mid * mid < x:
start = mid + 1
else:
end = mid - 1
'''
Integer Newton
def mySqrt(self, x):
"""
:type x: int
:rtype: int
"""
if x<2: return x
r = x
while r > x/r:
r = (r + x/r) / 2
return r
'''

Find first bad version

You are a product manager and currently leading a team to develop a new product. Unfortunately, the latest version of your product fails the quality check. Since each version is developed based on the previous version, all the versions after a bad version are also bad.
Suppose you have n versions [1, 2, …, n] and you want to find out the first bad one, which causes all the following ones to be bad.
You are given an API bool isBadVersion(version) which will return whether version is bad. Implement a function to find the first bad version. You should minimize the number of calls to the API.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
# The isBadVersion API is already defined for you.
# @param version, an integer
# @return a bool
# def isBadVersion(version):
class Solution(object):
def firstBadVersion(self, n):
"""
:type n: int
:rtype: int
"""
if n==0:
return 0
start = 0
end = n
while start < end-1:
mid = start + (start - end)/2
if isBadVersion(mid):
end = mid
else:
start = mid
return start if isBadVersion(start) else end

Search a 2D Matrix

Write an efficient algorithm that searches for a value in an m x n matrix. This matrix has the following properties:
Integers in each row are sorted from left to right.
The first integer of each row is greater than the last integer of the previous row.
For example,
Consider the following matrix:
[
[1, 3, 5, 7],
[10, 11, 16, 20],
[23, 30, 34, 50]
]
Given target = 3, return true.

把这个 2D 数组拉平成 1D 就是一个单调递增的(sorted)的数组,这种情况下找一个数用 binary search 就好,时间复杂度是 O(log(mn))=O(logN),要注意的就是 index 之间怎么转换,观察发现:
2D -> 1D (i,j) -> i*n+j
1D -> 2D index -> (index/n,index%n)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
class Solution(object):
def searchMatrix(self, matrix, target):
"""
:type matrix: List[List[int]]
:type target: int
:rtype: bool
"""
if not matrix:
return False
m = len(matrix) # row
n = len(matrix[0]) # column
start = 0
end = m*n-1
while start <= end:
mid = start + (start - end) / 2
if matrix[mid/n][mid%n] == target:
return True
if matrix[mid/n][mid%n] > target:
end = mid-1
else:
start = mid+1
return False

Search a 2D Matrix II

问题再变难一点。

Search a 2D Matrix II QuestionEditorial Solution My Submissions
Total Accepted: 50080
Total Submissions: 136871
Difficulty: Medium
Write an efficient algorithm that searches for a value in an m x n matrix. This matrix has the following properties:
Integers in each row are sorted in ascending from left to right.
Integers in each column are sorted in ascending from top to bottom.
For example,
Consider the following matrix:
[
[1, 4, 7, 11, 15],
[2, 5, 8, 12, 19],
[3, 6, 9, 16, 22],
[10, 13, 14, 17, 24],
[18, 21, 23, 26, 30]
]
Given target = 5, return true.
Given target = 20, return false.

当然还是可以用 binary search 做,需要非常仔细。时间复杂度 max(O(mlogn,nlogm)),由 T(n)=2T(n/2)+cn 推导而来

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
class Solution(object):
def searchMatrix(self, matrix, target):
"""
:type matrix: List[List[int]]
:type target: int
:rtype: bool
"""
if not matrix:
return False
return self.helper(matrix,target,0,0,len(matrix)-1,len(matrix[0])-1)
def helper(self,matrix,target,startX,startY,endX,endY):
if startX>endX or startY>endY:
return False
midX=startX+(startX-endX)/2
midY=startY+(startX-endY)/2
mid = matrix[midX][midY]
if mid == target:
return True
if mid > target:
return self.helper(matrix,target,startX,midY,midX-1,endY) or self.helper(matrix,target,startX,startY,endX,midY-1)
else:
return self.helper(matrix,target,midX+1,startY,endX,midY) or self.helper(matrix,target,startX,midY+1,endX,endY)
return False

或者,不用 binary search,用比较通用的方法,很好理解,从 top rightmost 开始,比较与 target 的大小,if curr>target,往左,if currlen(matrix)-1。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
class Solution(object):
def searchMatrix(self, matrix, target):
"""
:type matrix: List[List[int]]
:type target: int
:rtype: bool
"""
if not matrix:
return False
i,j = 0,len(matrix[0])-1
while True:
if j<0 or i>len(matrix)-1:
return False
curr = matrix[i][j]
if curr == target:
return True
if curr > target:
j -= 1
else:
i += 1

475. Heaters

Problem

Winter is coming! Your first job during the contest is to design a standard heater with fixed warm radius to warm all the houses.

Now, you are given positions of houses and heaters on a horizontal line, find out minimum radius of heaters so that all houses could be covered by those heaters.

So, your input will be the positions of houses and heaters seperately, and your expected output will be the minimum radius standard of heaters.

Note:
Numbers of houses and heaters you are given are non-negative and will not exceed 25000.
Positions of houses and heaters you are given are non-negative and will not exceed 10^9.
As long as a house is in the heaters’ warm radius range, it can be warmed.
All the heaters follow your radius standard and the warm radius will the same.

Example 1:

1
2
3
Input: [1,2,3],[2]
Output: 1
Explanation: The only heater was placed in the position 2, and if we use the radius 1 standard, then all the houses can be warmed.

Example 2:

1
2
3
Input: [1,2,3,4],[1,4]
Output: 1
Explanation: The two heater was placed in the position 1 and 4. We need to use radius 1 standard, then all the houses can be warmed.

Solution

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
'''
Solution:
Sort the heater
For each house, find out the positions of two heaters where the house is between, calculate the minimum value, update global radius
Be careful about corner case
Followup:
use binary search
'''
class Solution(object):
def findRadius(self, houses, heaters):
"""
:type houses: List[int]
:type heaters: List[int]
:rtype: int
"""
houses.sort()
heaters.sort()
def binarySearch(house, heaters):
left, right = 0, len(heaters)
while left < right:
mid = left + (right - left) / 2
if heaters[mid] < house:
left = mid + 1
else:
right = mid
return left
r = 0
for house in houses:
i = binarySearch(house, heaters)
if i == 0:
cur = abs(house - heaters[0])
elif i == len(heaters):
cur = abs(heaters[-1] - house)
else:
cur = min(abs(house - heaters[i]), abs(heaters[i - 1] - house))
r = max(r, cur)
return r
'''
class Solution(object):
def findRadius(self, houses, heaters):
"""
:type houses: List[int]
:type heaters: List[int]
:rtype: int
"""
houses.sort()
heaters.sort()
heaters=[float('-inf')]+heaters+[float('inf')] # add 2 fake heaters
r,i = 0,0
for house in houses:
while house > heaters[i+1]: # search to put house between heaters
i +=1
cur = min (house - heaters[i], heaters[i+1]- house)
r = max(r,cur)
return r
'''
徐阿衡 wechat
欢迎关注:徐阿衡的微信公众号
客官,打个赏呗~