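Introduction to Algorithms (2): Sorting and Trees

Contents

Insertion sort

Merge sort

Priority queue

Heap

Binary search tree

AVL tree

Lower bound for comparison sorts

Counting sort

Radix sort

To be added
Other sorting algorithms, such as quicksort and Shell sort.
Other balanced trees, such as red-black trees and B-trees.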

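Insertion sort#

For an array A, an index key walks through A. After each step of key, the prefix A[0:key + 1] must be sorted; this is achieved by moving the element at key toward the front into its correct position. When key reaches the end of the array, A[0:n] is sorted, i.e. the whole array is sorted.

Insertion step#

(Figure: insertion sort)

Time complexity
Moving elements costs O(n^2) in total. The cost of the comparisons depends on how the element at key is moved forward:

If it is done by comparing and swapping adjacent elements, the comparisons cost $O(n^2)$.

If the position is found by binary search, the comparisons cost $O(n\log n)$.

If a comparison is much more expensive than a move, the binary-search version is the better choice. If comparisons and moves cost about the same, both versions are O(n^2) overall.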
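To make the comparison-versus-move trade-off concrete, here is a minimal sketch that counts comparisons for both variants. The `Counted` wrapper and the helper names are hypothetical, introduced only for this illustration; the binary variant relies on the standard library's `bisect_right`.

```python
from bisect import bisect_right


class Counted:
    """Wraps a value and counts every `<` comparison (illustrative helper)."""
    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        Counted.comparisons += 1
        return self.value < other.value


def linear_insertion_sort(items):
    out = []
    for x in items:
        i = len(out)
        # Scan backward over the sorted prefix, one comparison per step
        while i > 0 and x < out[i - 1]:
            i -= 1
        out.insert(i, x)
    return out


def binary_insertion_sort(items):
    out = []
    for x in items:
        # O(log n) comparisons to find the slot, O(n) moves to insert
        out.insert(bisect_right(out, x), x)
    return out


def count_comparisons(sort, values):
    """Return (sorted plain values, number of comparisons performed)."""
    Counted.comparisons = 0
    result = sort([Counted(v) for v in values])
    return [c.value for c in result], Counted.comparisons


values = list(range(200, 0, -1))  # reversed input: worst case for the linear scan
print(count_comparisons(linear_insertion_sort, values)[1])
print(count_comparisons(binary_insertion_sort, values)[1])
```

Both variants still move O(n^2) elements in the worst case; only the number of comparisons differs.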

Implementation#

def insertion_sort_swap(A):
    """
    Insertion sort on array A, moving elements by shifting neighbors.
    An O(n^2) in-place, stable sort.
    """
    for key in range(len(A)):
        val = A[key]
        # Insert A[key] into its correct position within the sorted prefix A[0:key]
        i = key - 1
        while i > -1 and A[i] > val:
            A[i + 1] = A[i]
            i = i - 1
        A[i + 1] = val
    return A

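def insertion_sort_binary(A):
    """
    Insertion sort on array A, locating the position by binary search.
    An O(n^2) in-place, stable sort.
    Performs better when comparisons are much more expensive than moves.
    """
    # n iterations in total
    for key in range(len(A)):
        val = A[key]
        del A[key]
        # Insert val into its correct position within the sorted prefix A[0:key].
        # Invariant: A[l] <= val < A[r], with sentinels l = -1 and r = key.
        l, r = -1, key
        # Binary search, O(log n) per iteration
        while r != l + 1:
            mid = (l + r) // 2
            if A[mid] <= val:
                l = mid
            else:
                r = mid
        # Insertion, O(n) per iteration; placing val after equal keys keeps the sort stable
        A.insert(r, val)
    return A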

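Merge sort#

Divide and conquer.

Base case:#

If the array has only one element, return it unchanged.

Recursive case:#

Split the array into a left array L and a right array R.
Sort L and R separately.
Merge the sorted L and R.

Merge step:#

(Figure: merge step)

Complexity analysis:#

Splitting halves the array at each level, so the recursion tree has $O(\log n)$ levels.
The recursion makes $O(n)$ calls in total.
Merging costs:
1. The first level costs $O(n)$.
2. The second level costs $2O(n/2)$, i.e. $O(n)$.
3. The third level costs $4O(n/4)$, i.e. $O(n)$.
4. … So merging costs n times the number of levels in total, i.e. $O(n\log n)$.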

Implementation#

def merge(A, B):
    """
    The merge step of merge sort.
    Costs O(n).
    """
    L = []
    key_a = key_b = 0
    while True:
        # One input is exhausted: append the rest of the other and finish
        if key_a >= len(A):
            L.extend(B[key_b:])
            return L
        elif key_b >= len(B):
            L.extend(A[key_a:])
            return L

        # Merge
        if A[key_a] > B[key_b]:
            L.append(B[key_b])
            key_b += 1
        else:                   # take from A on ties to keep the sort stable
            L.append(A[key_a])
            key_a += 1


def merge_sort(A):
    """
    Merge sort on array A.
    A stable, non-in-place sort with O(n log n) time complexity.
    Uses O(n) extra space.
    """
    # base case
    if len(A) <= 1:
        return A

    # recursion
    mid = len(A) // 2
    L = merge_sort(A[:mid])
    R = merge_sort(A[mid:])
    return merge(L, R)

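Time complexity of tree recursion

Each call does $O(n)$ work outside its recursive calls:

$T(n) = 2T(n/2) + cn$
which solves to
$T(n) = O(n\log n)$
(the time complexity of merge sort)

Each call does $O(1)$ work outside its recursive calls:

$T(n) = 2T(n/2) + c$
which solves to
$T(n) = O(n)$

Each call does $O(n^2)$ work outside its recursive calls:

$T(n) = 2T(n/2) + cn^2$
which solves to
$T(n) = O(n^2)$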
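The three recurrences can be checked numerically. The sketch below evaluates them exactly for a power of two, assuming c = 1 and T(1) = 1 (both choices are assumptions made for the illustration, not part of the analysis above), and compares each result with the claimed growth rate.

```python
from functools import lru_cache
from math import log2


def solve(extra_work, n):
    """Evaluate T(n) = 2*T(n/2) + extra_work(n) with T(1) = 1, for n a power of two."""
    @lru_cache(maxsize=None)
    def T(m):
        if m <= 1:
            return 1
        return 2 * T(m // 2) + extra_work(m)
    return T(n)


n = 2 ** 16
linear = solve(lambda m: m, n)         # T(n) = 2T(n/2) + n
constant = solve(lambda m: 1, n)       # T(n) = 2T(n/2) + 1
quadratic = solve(lambda m: m * m, n)  # T(n) = 2T(n/2) + n^2

# Each ratio stays bounded by a small constant as n grows
print(linear / (n * log2(n)))
print(constant / n)
print(quadratic / n ** 2)
```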

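Priority queue#

A priority queue maintains a set $S$ in which every element has a priority (different elements may share the same priority). $S$ should support the following operations:

Insert a new element.
Get (and optionally remove) the maximum.
Change the priority of an element.

A heap is one implementation of a priority queue.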
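As an illustration of these three operations, here is a minimal sketch built on Python's standard `heapq` module. `heapq` is a min-heap, so priorities are stored negated; changing a priority is done by lazy invalidation of the old entry. The class and method names are hypothetical, and items are assumed to be hashable.

```python
import heapq
import itertools


class MaxPriorityQueue:
    """Illustrative max-priority queue; not the hand-written heap from these notes."""

    def __init__(self):
        self._heap = []                    # entries: [-priority, tie_breaker, item, alive]
        self._entries = {}                 # item -> its live entry
        self._counter = itertools.count()  # unique tie breaker, so items are never compared

    def insert(self, item, priority):
        if item in self._entries:
            self._entries[item][3] = False  # mark the old entry as dead
        entry = [-priority, next(self._counter), item, True]
        self._entries[item] = entry
        heapq.heappush(self._heap, entry)

    def change_priority(self, item, priority):
        # Re-inserting replaces the old priority via lazy invalidation
        self.insert(item, priority)

    def pop_max(self):
        while self._heap:
            neg_priority, _, item, alive = heapq.heappop(self._heap)
            if alive:
                del self._entries[item]
                return item, -neg_priority
        raise KeyError("pop from an empty priority queue")


pq = MaxPriorityQueue()
pq.insert("write notes", 1)
pq.insert("fix bug", 5)
pq.change_priority("write notes", 10)
print(pq.pop_max())
```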

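Heap#

A heap is an array that can also be viewed as a complete binary tree; the root of heap h is h[1]. For a node x:

Parent: none when x = 1; when x > 1 the parent is x // 2, or x >> 1.
Left child: 2 * x, or x << 1.
Right child: 2 * x + 1, or (x << 1) + 1. The parentheses matter: in Python, x << 1 + 1 parses as x << 2.
Max-heap: every node is greater than or equal to its children (if any). A min-heap is defined analogously.

Properties of a binary heap

Leaves always make up half of the nodes (ceil(n/2)), and internal nodes make up the other floor(n/2). This also means that the index of the last leaf, integer-divided by 2, is the last internal node.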
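The index arithmetic and the leaf-count property can be checked with a few lines. This is a small sketch for a 1-indexed heap; the helper names are chosen for illustration only.

```python
def parent(x):
    return x >> 1


def left(x):
    return x << 1


def right(x):
    # Parentheses are required: `x << 1 + 1` would be parsed as `x << 2`
    return (x << 1) + 1


def count_leaves(n):
    """Count nodes in 1..n that have no left child, i.e. the leaves of a heap of size n."""
    return sum(1 for x in range(1, n + 1) if left(x) > n)


for n in (1, 2, 7, 10):
    # The second value is ceil(n / 2), written with integer arithmetic
    print(n, count_leaves(n), (n + 1) // 2)
```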

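Heap operations#

Build a max-heap: build a max-heap from an unsorted array.
Max-heapify (max_heapify)
Insert
Get (and optionally remove) the maximum
Heap sort.

Max-heapify (max_heapify)#

Move a single node that violates the heap order to its correct position. If the subheap rooted at that node has n nodes (and therefore height $O(\log n)$), max-heapify takes $O(\log n)$ time.

In short, for a node x that needs max-heapify, keep comparing x with its left and right children and swapping it with the larger one, until either:

x is no smaller than both its left and right children, or
x has no children left.

(Figure: max-heapify)

Building a max-heap#

Iterate from position n/2 down to 1 and max-heapify each node. (We start at n/2 because every node past n/2 is a leaf and needs no heapify; we go in reverse so that both subtrees of a node are already heaps by the time the node is processed.)

Time complexity: $T = c(\frac{n}{4} + \frac{2n}{8} + \frac{3n}{16} + \frac{4n}{32} + ... + \log n)$. Summing the series gives a time complexity of $O(n)$.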
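The sum can be worked out as follows. This is a sketch under the usual approximation that about $n/2^{h+1}$ nodes sit at height $h$ and each costs at most $ch$ to heapify; it uses the identity $\sum_{h \ge 1} h x^h = x/(1-x)^2$ at $x = 1/2$.

```latex
\begin{align*}
T(n) &\approx c \sum_{h=1}^{\lfloor \log_2 n \rfloor} h \cdot \frac{n}{2^{h+1}}
      = \frac{cn}{2} \sum_{h=1}^{\lfloor \log_2 n \rfloor} \frac{h}{2^{h}} \\
     &\le \frac{cn}{2} \sum_{h=1}^{\infty} \frac{h}{2^{h}}
      = \frac{cn}{2} \cdot 2 = cn = O(n).
\end{align*}
```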

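(Figure: building a max-heap)

Heap sort#

Build a max-heap A.
Pop A[1], which is done by the next two steps:
Swap A[n] with A[1], and decrease the size of A by 1.
Max-heapify A[1].
Repeat steps 2 to 4 until the size of A is 0.

Time complexity: $O(n\log n)$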

Implementation#

def max_heapify(A, x):
    """
    Max-heapify node x of heap A when x violates the heap order.
    A[0] stores the number of heap elements n; actual indices start at 1.
    """
    lt = 2 * x
    rt = 2 * x + 1
    # Find the largest among x, lt and rt
    if lt <= A[0] and A[x] < A[lt]:
        largest = lt
    else:
        largest = x

    if rt <= A[0] and A[rt] > A[largest]:
        largest = rt

    # If x is already the largest, it is in the correct position
    if x == largest:
        return

    # Otherwise swap x down to position largest and recurse
    A[x], A[largest] = A[largest], A[x]
    max_heapify(A, largest)


def build_max_heap(A):
    """
    Build a heap from array A (whose indices start at 0).
    """
    # Make heap indices start at 1; A[0] stores the number of heap elements n.
    A.insert(0, len(A))

    # Iterate from n // 2 down to 1.
    i = A[0] // 2
    while i > 0:
        max_heapify(A, i)
        i -= 1


def insert(A, val):
    """
    Insert val into heap A (A[0] stores the element count n; indices start at 1).
    """
    # Append if the list has no spare slot, otherwise overwrite the spare slot
    idx = A[0] + 1
    if idx == len(A):
        A.append(val)
    else:
        A[idx] = val
    A[0] += 1

    # Sift up: compare the node with its parent each time
    while idx != 1 and A[idx] > A[idx // 2]:
        A[idx], A[idx // 2] = A[idx // 2], A[idx]
        idx = idx // 2


def top(A):
    """
    Return the top of the heap without removing it.
    """
    return A[1]


def pop(A):
    """
    Remove the top of the heap (move it to the end and decrease the size by 1) and return it.
    """
    if A[0] == 0:
        raise IndexError("pop from an empty heap")
    last = A[0]
    A[1], A[last] = A[last], A[1]
    A[0] -= 1
    max_heapify(A, 1)
    return A[last]


def heap_sort(A):
    """
    Sort array A (indices start at 0) in ascending order with heap sort.
    An O(n log n) in-place, unstable sort.
    """
    build_max_heap(A)
    while A[0] > 0:
        pop(A)      # each pop moves the current maximum to the end of the heap
    del A[0]        # turn A back into a plain array

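Binary search tree (BST)#

Binary search on an array takes $O(\log n)$, but insertion takes $O(n)$.

A linked list supports $O(1)$ insertion, but it cannot be binary searched, so search takes $O(n)$.

Linked storage is not inherently unsuited to binary search, though. Recall how binary search on an array works:

We find the midpoint of the array and use it to split the array into two halves.
We continue the search in the left half or the right half, so the number of elements is halved.
We keep going until the array has a single element.

Notice that the nodes visited in this process form a directed graph. The midpoint of the original array points to the midpoint of the left half and the midpoint of the right half, and the midpoint of the left half in turn points to the midpoints of the first and second quarters… Every node points to the midpoints of its left and right halves, which forms a binary tree.

This binary tree is a binary search tree (BST), a linked data structure that supports binary search. In the ideal case, search takes $O(\log n)$. In exchange, insertion becomes slower than on a linked list: $O(\log n)$.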
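The midpoint construction described above can be written down directly. This is a minimal sketch that builds a balanced tree from a sorted array and counts how many nodes a lookup visits; `Node`, `bst_from_sorted` and `contains` are illustrative names, not part of the implementation later in these notes.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def bst_from_sorted(arr, lo=0, hi=None):
    """Build a tree whose root is the midpoint of arr[lo:hi], recursively."""
    if hi is None:
        hi = len(arr)
    if lo >= hi:
        return None
    mid = (lo + hi) // 2
    node = Node(arr[mid])
    node.left = bst_from_sorted(arr, lo, mid)        # midpoint of the left half
    node.right = bst_from_sorted(arr, mid + 1, hi)   # midpoint of the right half
    return node


def contains(root, val):
    """Return (found, number of nodes visited)."""
    steps = 0
    while root is not None:
        steps += 1
        if val == root.key:
            return True, steps
        root = root.left if val < root.key else root.right
    return False, steps


root = bst_from_sorted(list(range(0, 200, 2)))   # 100 sorted keys
print(contains(root, 42))
print(contains(root, 43))
```

Every lookup in this 100-key tree visits at most 7 nodes, matching the halving argument.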

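A binary search tree is defined recursively as follows:

The empty tree is a binary search tree.
Every subtree of a binary search tree is a binary search tree.
For any node $x$, $x$ is greater than every node in its left subtree and smaller than every node in its right subtree.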
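One way to read this definition is as a checker. The sketch below validates the ordering condition by passing down the open interval each key must fall in; comparing a node only with its direct children is not enough. `Node` and `is_bst` are hypothetical names used for illustration.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def is_bst(node, low=None, high=None):
    """True if every key in the subtree lies strictly between low and high."""
    if node is None:
        return True                      # the empty tree is a BST
    if low is not None and node.key <= low:
        return False
    if high is not None and node.key >= high:
        return False
    # Keys on the left must stay below node.key, keys on the right above it
    return is_bst(node.left, low, node.key) and is_bst(node.right, node.key, high)


good = Node(5, Node(3, Node(1), Node(4)), Node(8))
bad = Node(5, Node(3, None, Node(6)), Node(8))   # 6 sits in the left subtree of 5
print(is_bst(good), is_bst(bad))
```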

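BST operations#

Insert: insert(root, val)
Search: search(root, val)
Find maximum/minimum: find_max(root), find_min(root)
Delete: delete(root, val)
Reduce: reduce(root, val), which decreases the count of a value by 1

BST sort#

Note that an in-order traversal of a BST (left subtree, root, right subtree) visits the keys in sorted order. The traversal itself costs $O(n)$.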

class TreeNode:
    """
    :key: the key (or value) of this node
    :num: how many times this value occurs
    :left: pointer to the left subtree
    :right: pointer to the right subtree
    """
    def __init__(self, key):
        self.key, self.num = key, 1
        self.left = None
        self.right = None

    def __repr__(self):
        return str_of_tree(self)


def str_of_tree(root: TreeNode, h=0) -> str:
    """
    Take the root of a binary tree and return a text rendering of the whole tree.
    :param root: root of the tree
    :param h: depth of root, used for indentation
    :return: a string showing the structure of the whole tree
    """
    if root is None:
        return ''
    s = ''
    s += f"  Key:{root.key}({root.num})\n"
    s += h * '      ' + ' Left:'
    s += str_of_tree(root.left, h + 1)
    s += '\n'
    s += h * '      ' + 'Right:'
    s += str_of_tree(root.right, h + 1)
    return s


def insert(root: TreeNode, val: int) -> TreeNode:
    """
    Insert an element with myroot = insert(myroot, val); always returns the root.
    :param root: the root node
    :param val: the value to insert
    :return: the root node
    """
    if root is None:
        return TreeNode(val)

    current = root
    while True:
        if current.key == val:
            # Duplicate value: just increase its count
            current.num += 1
            return root
        elif val < current.key:
            if current.left is None:
                current.left = TreeNode(val)
                return root
            current = current.left
        else:
            if current.right is None:
                current.right = TreeNode(val)
                return root
            current = current.right


def search(root: TreeNode, val: int) -> TreeNode | None:
    """
    Search the tree for a value; return its node if found, otherwise None.
    :param root: the root node
    :param val: the value to search for
    :return: a node or None
    """
    current = root
    while current is not None:
        if val == current.key:
            return current
        elif val > current.key:
            current = current.right
        else:
            current = current.left
    return None


def find_max(root: TreeNode) -> TreeNode | None:
    """
    Find the maximum value in the tree.
    :param root: the root node
    :return: None for an empty tree, otherwise the node holding the maximum
    """
    if root is None:
        return None

    current = root
    while current.right is not None:
        current = current.right
    return current


def find_min(root: TreeNode) -> TreeNode | None:
    """
    Find the minimum value in the tree.
    :param root: the root node
    :return: None for an empty tree, otherwise the node holding the minimum
    """
    if root is None:
        return None

    current = root
    while current.left is not None:
        current = current.left
    return current


def delete(root: TreeNode, val: int) -> TreeNode | None:
    """
    Delete every occurrence of a value from the BST.
    :param root: root of the BST
    :param val: the value to delete
    :return: the root after deletion
    """
    if root is None:
        return None

    if val > root.key:
        root.right = delete(root.right, val)
    elif val < root.key:
        root.left = delete(root.left, val)
    else:
        if root.left is None:
            root = root.right
        elif root.right is None:
            root = root.left
        else:
            # Two children: swap contents with the in-order successor
            # (the minimum of the right subtree), then delete val from there
            successor = find_min(root.right)
            root.key, successor.key = successor.key, root.key
            root.num, successor.num = successor.num, root.num
            root.right = delete(root.right, val)
    return root


def reduce(root: TreeNode, val: int) -> TreeNode | None:
    """
    Decrease the count of a value in the BST by 1.
    :param root: root of the BST
    :param val: the value
    :return: the root after the reduction
    """
    if root is None:
        return None

    if val > root.key:
        root.right = reduce(root.right, val)
    elif val < root.key:
        root.left = reduce(root.left, val)
    else:
        if root.num >= 2:
            root.num -= 1
        else:
            # Last occurrence: remove the node, exactly as in delete
            if root.left is None:
                root = root.right
            elif root.right is None:
                root = root.left
            else:
                successor = find_min(root.right)
                root.key, successor.key = successor.key, root.key
                root.num, successor.num = successor.num, root.num
                root.right = delete(root.right, val)
    return root


def build_BST(arr: list) -> TreeNode:
    """
    Build a BST from an array: O(n log n) in the ideal case, O(n^2) in the worst case.
    :param arr: the array
    :return: root of the BST
    """
    bst = None
    for i in arr:
        bst = insert(bst, i)
    return bst


def BST_sort(arr: list) -> list:
    """
    Sort using a BST: O(n log n) in the ideal case, O(n^2) in the worst case.
    :param arr: the array
    :return: the sorted array
    """
    ans = []

    # Recursive helper: in-order traversal
    def helper(root: TreeNode) -> None:
        if root is None:
            return
        helper(root.left)
        ans.extend([root.key] * root.num)   # emit the key once per occurrence
        helper(root.right)

    helper(build_BST(arr))
    return ans

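AVL tree#

Whoever invented the rotate operation is a genius! - ykindred

With unfavorable input, BST operations degrade from O(log n) to O(n). The main cause is that the heights of subtrees become unbalanced.

The earliest self-balancing binary search tree (also called a height-balanced tree, or simply a balanced tree) is the AVL tree, named after its inventors G. M. Adelson-Velsky and E. M. Landis.

An AVL tree satisfies the following properties:

An AVL tree is itself a binary search tree.
For every node, the absolute value of the difference between the heights of its left and right subtrees (the balance factor, BF) is at most 1.

Because subtree height differences must be checked, every AVL node additionally stores its height, which is updated bottom-up on every insertion or deletion.

An AVL tree keeps insertion, deletion, search and similar operations at O(log n).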
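Why the balance condition forces logarithmic height can be seen by counting the fewest nodes an AVL tree of a given height can contain: the sparsest tree of height h has one root, a sparsest subtree of height h - 1, and a sparsest subtree of height h - 2. The sketch below evaluates that recurrence; `min_nodes` is an illustrative name, and height is counted in nodes (a single node has height 1).

```python
from math import log2


def min_nodes(h):
    """Fewest nodes in an AVL tree of height h: N(h) = N(h-1) + N(h-2) + 1."""
    if h == 0:
        return 0
    a, b = 0, 1                  # N(0), N(1)
    for _ in range(h - 1):
        a, b = b, a + b + 1
    return b


for h in (1, 2, 3, 4, 5, 10, 20):
    n = min_nodes(h)
    # Even in the sparsest tree, the height stays within a constant factor of log2(n)
    print(h, n, round(h / log2(n + 2), 3))
```

Since the minimum node count grows exponentially with the height (like the Fibonacci numbers), the height of any AVL tree with n nodes is O(log n).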

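Rotation (Rotate)#

Left rotate: when BF = -2 (BF being the left height minus the right height), do the following:
1. The right child becomes the root.
2. The right child's left child becomes the old root's right child.
3. The old root becomes the left child of the new root (the former right child).
Right rotate is the mirror image. When BF = 2, do:
1. The left child becomes the root.
2. The left child's right child becomes the old root's left child.
3. The old root becomes the right child of the new root (the former left child).

Depending on the path along which the imbalance occurs, there are four cases:

LL: the new node is in the left subtree of the left subtree of some node z; right-rotate z once.
RR: the new node is in the right subtree of the right subtree of some node z; left-rotate z once.
LR: the new node is in the right subtree of the left subtree of some node z; first left-rotate z's left subtree (this turns the LR case into an LL case), then right-rotate z.
RL: the new node is in the left subtree of the right subtree of some node z; first right-rotate z's right subtree (this turns the RL case into an RR case), then left-rotate z.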
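A small worked LR case may help. The sketch below uses a stripped-down node without stored heights (so it is not the AVL implementation itself; all names are illustrative) and shows that the double rotation lowers the height while leaving the in-order sequence unchanged.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def rotate_right(z):
    y = z.left
    z.left, y.right = y.right, z    # y's right subtree moves under z; z moves under y
    return y


def rotate_left(z):
    y = z.right
    z.right, y.left = y.left, z     # y's left subtree moves under z; z moves under y
    return y


def inorder(node):
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)


def height(node):
    return 0 if node is None else 1 + max(height(node.left), height(node.right))


# LR case: z = 3, its left child is 1, and the new node 2 is 1's right child
z = Node(3, left=Node(1, right=Node(2)))
print(inorder(z), height(z))

z.left = rotate_left(z.left)        # LR becomes LL
root = rotate_right(z)              # LL is fixed by one right rotation
print(root.key, inorder(root), height(root))
```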

class TreeNode:
    """
    :key: the key (or value) of this node
    :num: how many times this value occurs
    :left: pointer to the left subtree
    :right: pointer to the right subtree
    :height: height of this node (a leaf has height 1)
    """
    def __init__(self, key):
        self.key = key
        self.num = 1
        self.left = None
        self.right = None
        self.height = 1

    def __repr__(self):
        return str_of_tree(self)


def str_of_tree(root: TreeNode, h=0) -> str:
    """
    Take the root of a binary tree and return a text rendering of the whole tree.
    :param root: root of the tree
    :param h: depth of root, used for indentation
    :return: a string showing the structure of the whole tree
    """
    if root is None:
        return ''
    s = ''
    s += f"  Key:{root.key}({root.num})\n"
    s += h * '      ' + ' Left:'
    s += str_of_tree(root.left, h + 1)
    s += '\n'
    s += h * '      ' + 'Right:'
    s += str_of_tree(root.right, h + 1)
    return s


def find_max(root: TreeNode) -> TreeNode | None:
    """
    Find the maximum value in the tree.
    :param root: the root node
    :return: None for an empty tree, otherwise the node holding the maximum
    """
    if root is None:
        return None

    current = root
    while current.right is not None:
        current = current.right
    return current


def find_min(root: TreeNode) -> TreeNode | None:
    """
    Find the minimum value in the tree.
    :param root: the root node
    :return: None for an empty tree, otherwise the node holding the minimum
    """
    if root is None:
        return None

    current = root
    while current.left is not None:
        current = current.left
    return current


def get_height(node: TreeNode) -> int:
    """
    Return the height of a node (0 for an empty subtree).
    :param node: a node
    """
    if node is None:
        return 0
    return node.height


def get_BF(node: TreeNode) -> int:
    """
    Return the balance factor of a node: left height minus right height.
    :param node: a node
    :return: the balance factor, an integer
    """
    if node is None:
        return 0
    return get_height(node.left) - get_height(node.right)


def right_rotate(node: TreeNode) -> TreeNode:
    """
    Right-rotate a subtree and return its new root.
    :param node: root of the subtree
    :return: the root after the rotation
    """
    # A right rotation requires a left child
    assert node.left is not None, "left child shouldn't be None"

    # Rotate: the left child becomes the new root, and its former right
    # subtree becomes the old root's left subtree
    new_root = node.left
    node.left = new_root.right
    new_root.right = node

    # Update heights bottom-up: the old root first, then the new root
    node.height = 1 + max(get_height(node.left), get_height(node.right))
    new_root.height = 1 + max(get_height(new_root.left), get_height(new_root.right))

    return new_root


def left_rotate(node: TreeNode) -> TreeNode:
    """
    Left-rotate a subtree and return its new root.
    :param node: root of the subtree
    :return: the root after the rotation
    """
    # A left rotation requires a right child
    assert node.right is not None, "right child shouldn't be None"

    # Rotate: the right child becomes the new root, and its former left
    # subtree becomes the old root's right subtree
    new_root = node.right
    node.right = new_root.left
    new_root.left = node

    # Update heights bottom-up: the old root first, then the new root
    node.height = 1 + max(get_height(node.left), get_height(node.right))
    new_root.height = 1 + max(get_height(new_root.left), get_height(new_root.right))
    return new_root


def rotate(root: TreeNode) -> TreeNode:
    """
    Check the balance of a node and rotate if it is unbalanced.
    :param root: a node
    :return: the root of the subtree after the check
    """
    BF = get_BF(root)

    if BF >= 2:
        if get_BF(root.left) >= 0:
            # LL (a left BF of 0 can occur after a deletion)
            root = right_rotate(root)
        else:
            # LR
            root.left = left_rotate(root.left)
            root = right_rotate(root)

    elif BF <= -2:
        if get_BF(root.right) <= 0:
            # RR (a right BF of 0 can occur after a deletion)
            root = left_rotate(root)
        else:
            # RL
            root.right = right_rotate(root.right)
            root = left_rotate(root)
    return root


def insert(root: TreeNode, val: int) -> TreeNode:
    """
    Insert a value into the AVL tree.
    :param root: root of the AVL tree
    :param val: the value
    :return: the root after insertion
    """
    # Standard BST insertion
    if root is None:
        return TreeNode(val)

    if val == root.key:
        root.num += 1
    elif val > root.key:
        root.right = insert(root.right, val)
    else:
        root.left = insert(root.left, val)

    # Update the height
    root.height = 1 + max(get_height(root.left), get_height(root.right))

    # Rebalance if needed
    root = rotate(root)

    return root


def delete(root: TreeNode, val: int) -> TreeNode | None:
    """
    Delete every occurrence of a value from the AVL tree.
    :param root: root of the AVL tree
    :param val: the value to delete
    :return: the root after deletion
    """
    # Standard BST deletion
    if root is None:
        return None

    if val > root.key:
        root.right = delete(root.right, val)
    elif val < root.key:
        root.left = delete(root.left, val)
    else:
        if root.left is None:
            root = root.right
        elif root.right is None:
            root = root.left
        else:
            # Two children: swap contents with the in-order successor
            # (the minimum of the right subtree), then delete val from there
            successor = find_min(root.right)
            root.key, successor.key = successor.key, root.key
            root.num, successor.num = successor.num, root.num
            root.right = delete(root.right, val)

    if root is not None:
        # Update the height
        root.height = 1 + max(get_height(root.left), get_height(root.right))

        # Rebalance if needed
        root = rotate(root)

    return root


def search(root: TreeNode, val: int) -> TreeNode | None:
    """
    Search the tree for a value; return its node if found, otherwise None.
    :param root: the root node
    :param val: the value to search for
    :return: a node or None
    """
    current = root
    while current is not None:
        if val == current.key:
            return current
        elif val > current.key:
            current = current.right
        else:
            current = current.left
    return None


def reduce(root: TreeNode, val: int) -> TreeNode | None:
    """
    Decrease the count of a value in the AVL tree by 1.
    :param root: root of the AVL tree
    :param val: the value
    :return: the root after the reduction
    """
    # Standard BST reduction
    if root is None:
        return None

    if val > root.key:
        root.right = reduce(root.right, val)
    elif val < root.key:
        root.left = reduce(root.left, val)
    else:
        if root.num >= 2:
            # The tree shape is unchanged, so no rebalancing is needed
            root.num -= 1
            return root
        else:
            # Last occurrence: remove the node, exactly as in delete
            if root.left is None:
                root = root.right
            elif root.right is None:
                root = root.left
            else:
                successor = find_min(root.right)
                root.key, successor.key = successor.key, root.key
                root.num, successor.num = successor.num, root.num
                root.right = delete(root.right, val)

    if root is not None:
        # Update the height
        root.height = 1 + max(get_height(root.left), get_height(root.right))

        # Rebalance if needed
        root = rotate(root)

    return root


def build_AVL(arr: list) -> TreeNode:
    """
    Build an AVL tree from an array in O(n log n).
    :param arr: the array
    :return: root of the AVL tree
    """
    avl = None
    for i in arr:
        avl = insert(avl, i)
    return avl


def AVL_sort(arr: list) -> list:
    """
    Sort using an AVL tree in O(n log n).
    :param arr: the array
    :return: the sorted array
    """
    ans = []

    # Recursive helper: in-order traversal
    def helper(root: TreeNode) -> None:
        if root is None:
            return
        helper(root.left)
        ans.extend([root.key] * root.num)   # emit the key once per occurrence
        helper(root.right)

    helper(build_AVL(arr))
    return ans

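Lower bound for comparison sorts#

The lower bound is $\Omega(n\log n)$, which can be shown in 3 steps:

The execution of a comparison sort can be represented as a decision tree.
Each permutation of the input must lead to its own distinct path (outcome), so there are at least $n!$ paths (outcomes); that is, the decision tree has at least $n!$ leaves.
A binary tree with $n!$ leaves has height at least $\log (n!)$, and one can show mathematically that $\log(n!) = \Theta(n\log n)$.

Hence any comparison-based sorting algorithm has a worst-case time complexity of at least $\Omega(n\log n)$. One can likewise show that the average-case lower bound is also $\Omega(n\log n)$.

Only the idea is presented here; the detailed proof is omitted.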
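The decision-tree bound can be evaluated for concrete sizes. The sketch below computes ceil(log2(n!)), the minimum worst-case number of comparisons for n distinct keys, and compares it with n log2 n; `min_comparisons` is an illustrative name.

```python
from math import factorial, log2


def min_comparisons(n):
    """ceil(log2(n!)): the least possible height of a decision tree with n! leaves."""
    # For an integer x >= 1, (x - 1).bit_length() equals ceil(log2(x)) exactly
    return (factorial(n) - 1).bit_length()


for n in (3, 5, 10, 100, 1000):
    print(n, min_comparisons(n), round(n * log2(n)))
```

The two columns grow at the same rate, which reflects $\log(n!) = \Theta(n\log n)$.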

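Counting sort#

We have shown that the lower bound for comparison sorts is $\Omega(n\log n)$. This is the "wall" of the sorting world: any sorting algorithm that reaches this complexity can be considered "not bad" in terms of time. So is there a faster sorting algorithm that is not based on comparisons?

Yes, there is: non-comparison sorts. We mainly cover counting sort and radix sort.

Counting sort: for small non-negative integers it runs in linear time, $O(n)$ (more precisely $O(n + k)$ when the keys lie in the range 0 to k).

Keep one counter per integer value to record how many times it occurs, then output the values from smallest to largest.

Note that this is a non-in-place sort that can be made stable, and that it places requirements on the data (the keys must be small non-negative integers, or be mappable to small non-negative integers).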
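The remark that counting sort can be made stable is worth a sketch. The version below sorts whole records by an integer key and uses prefix sums to compute each key's first output position, so records with equal keys keep their input order. The function name and its `key` and `k` parameters are assumptions made for this illustration.

```python
def counting_sort_stable(items, key, k):
    """Sort items by key(item), an integer in range(k); equal keys keep input order."""
    count = [0] * (k + 1)
    for item in items:
        count[key(item) + 1] += 1
    # Prefix sums: count[v] becomes the first output index for key v
    for v in range(k):
        count[v + 1] += count[v]
    out = [None] * len(items)
    for item in items:                 # scanning in input order preserves stability
        out[count[key(item)]] = item
        count[key(item)] += 1
    return out


records = [("b", 2), ("a", 1), ("c", 2), ("d", 0), ("e", 1)]
print(counting_sort_stable(records, key=lambda r: r[1], k=3))
```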

def counting_sort(arr: list, maxx: int = 300) -> list:
    """
    An O(n + maxx) non-in-place sort for small non-negative integers only.
    It rebuilds the output from the counts, so it handles bare keys and is not a stable sort of records.
    :param arr: the array to sort
    :param maxx: inclusive upper bound on the values
    :return: the sorted array
    >>> a = [12,5,6,8,9,1,2,0,2,6,5,8,7,4,5,6,4,6]
    >>> counting_sort(a, 200)
    [0, 1, 2, 2, 4, 4, 5, 5, 5, 6, 6, 6, 6, 7, 8, 8, 9, 12]
    >>> counting_sort(a)
    [0, 1, 2, 2, 4, 4, 5, 5, 5, 6, 6, 6, 6, 7, 8, 8, 9, 12]
    """
    # One counter per possible value 0..maxx
    cnt = [0] * (maxx + 1)
    ans = []
    for i in arr:
        cnt[i] += 1
    # Emit each value as many times as it occurred, from smallest to largest
    for value, count in enumerate(cnt):
        ans.extend([value] * count)
    return ans

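Radix sort#

Counting sort has several problems:

It can only sort integers.
When the range of values is large, its space complexity is too high.

Radix sort solves both problems at the cost of a larger constant factor (as is generally believed).

The basic idea of radix sort: split each integer into digits and run a stable counting sort on each digit position.

Radix sort can also sort strings. It comes in two variants, LSD (from the least significant digit to the most significant) and MSD (from the most significant digit to the least significant). Only LSD is demonstrated here.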
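To illustrate the claim about strings, here is a minimal LSD sketch for strings of equal length. It assumes every string has exactly `width` characters with code points below 256; the function name and parameters are hypothetical.

```python
def radix_sort_strings(words, width):
    """LSD radix sort for strings of exactly `width` characters (code points < 256)."""
    # Process positions from the last character to the first
    for pos in range(width - 1, -1, -1):
        buckets = [[] for _ in range(256)]
        for w in words:
            buckets[ord(w[pos])].append(w)
        # Concatenating buckets in order is a stable pass on this position
        words = [w for bucket in buckets for w in bucket]
    return words


print(radix_sort_strings(["dab", "cab", "add", "bad", "ace", "bee"], 3))
```

Each pass must be stable; otherwise the order established by the later character positions would be destroyed by the earlier ones.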

def radix_sort_LSD(arr: list, maxx: int = 10000):
    """
    A stable, non-in-place sort that runs in linear time for a fixed number of digits.
    Sorts from the lowest digit to the highest.
    :param arr: the array (non-negative integers)
    :param maxx: inclusive upper bound on the values
    :return: the sorted array
    >>> a = [12,5,6,8,9,1,2,0,2,6,5,8,7,4,5,6,4,6]
    >>> radix_sort_LSD(a, 12)
    [0, 1, 2, 2, 4, 4, 5, 5, 5, 6, 6, 6, 6, 7, 8, 8, 9, 12]
    >>> radix_sort_LSD(a)
    [0, 1, 2, 2, 4, 4, 5, 5, 5, 6, 6, 6, 6, 7, 8, 8, 9, 12]
    """
    # Number of decimal digits needed to represent maxx
    k = len(str(maxx))
    base = 1
    for _ in range(k):
        bucket = [[] for _ in range(10)]
        for i in arr:
            idx = i // base % 10        # the digit at the current position
            bucket[idx].append(i)
        # Concatenating the buckets in order keeps each pass stable
        arr = [x for b in bucket for x in b]
        base *= 10
    return arr
