1. Binary tree
Binary tree: each node has at most two child nodes, as shown in the figure below:
Related concepts: node, path, root, edge, children, parent, sibling, subtree, leaf node, level, height

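Node: a basic unit of the tree that stores a value (key).
Path: the sequence of edges that connects one node to another.
Root: the only node with no parent; the top of the tree.
Edge: the connection between two nodes.
Children: the nodes directly below a given node.
Parent: the node directly above a given node.
Sibling: nodes that share the same parent.
Subtree: a node together with all of its descendants.
Leaf node: a node with no children.
Level: the number of edges on the path from the current node to the root.
Height: the maximum level of any node in the tree.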
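A binary tree can be implemented as a nested list. In the form below, the root node r has two children a and b, and neither a nor b has children of its own.
mytree = [ r,
           [ a, [ ], [ ] ],
           [ b, [ ], [ ] ]
         ]
The Python implementation is as follows: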

# Binary tree implemented with nested lists
def binaryTree(r):
    return [r, [], []]

# root[0] is the node value, root[1] the left subtree, root[2] the right subtree
def insertLeftTree(root, newbranch):
    t = root.pop(1)
    if len(t) > 1:
        # Push the existing left subtree down one level
        root.insert(1, [newbranch, t, []])
    else:
        root.insert(1, [newbranch, [], []])
    return root

def insertRightTree(root, newbranch):
    t = root.pop(2)
    if len(t) > 1:
        # Push the existing right subtree down one level
        root.insert(2, [newbranch, [], t])
    else:
        root.insert(2, [newbranch, [], []])
    return root

def getRootVal(root):
    return root[0]

def setRootVal(root, val):
    root[0] = val

def getLeftChildren(root):
    return root[1]

def getRightChildren(root):
    return root[2]

r = binaryTree(3)
insertLeftTree(r, 4)
insertLeftTree(r, 5)
insertRightTree(r, 6)
insertRightTree(r, 7)
l = getLeftChildren(r)
print(l)
setRootVal(l, 9)
print(r)
insertLeftTree(l, 11)
print(r)
print(getRightChildren(getRightChildren(r)))
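A binary tree can also be implemented with node objects that hold references to their children, as shown below:
The Python implementation is as follows: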

class BinaryTree(object):
    def __init__(self, value):
        self.key = value
        self.leftChild = None
        self.rightChild = None

    def insertLeft(self, newNode):
        if self.leftChild is not None:
            # Push the existing left child down one level
            temp = BinaryTree(newNode)
            temp.leftChild = self.leftChild
            self.leftChild = temp
        else:
            self.leftChild = BinaryTree(newNode)

    def insertRight(self, newNode):
        if self.rightChild is not None:
            # Push the existing right child down one level
            temp = BinaryTree(newNode)
            temp.rightChild = self.rightChild
            self.rightChild = temp
        else:
            self.rightChild = BinaryTree(newNode)

    def getRootVal(self):
        return self.key

    def setRootVal(self, value):
        self.key = value

    def getLeftChild(self):
        return self.leftChild

    def getRightChild(self):
        return self.rightChild
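2. Applications of binary trees
2.1 Parse tree
A parse tree represents the structure of real-world constructs such as sentences and mathematical expressions. The figure below shows the parse tree for ((7+3)*(5-2)). Evaluating the tree from the bottom up follows its hierarchy, so the tree structure takes over the role that the parentheses play in the written expression.
A fully parenthesized mathematical expression is converted to a parse tree as follows.
Scan the expression token by token:
1. On "(", add a left child to the current node and move down to it.
2. On +, -, * or /, set the current node's value to that operator, add a right child to the current node, and move down to it.
3. On a number, set the current node's value to that number and move back up to its parent.
4. On ")", move up to the parent of the current node.
The Python implementation is as follows (for Stack, see the companion article on stacks):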

from stackDemo import Stack  # see the companion article on stacks

def buildParseTree(expstr):
    explist = expstr.split()
    s = Stack()
    t = BinaryTree('')
    s.push(t)
    current = t
    for token in explist:
        if token == '(':
            current.insertLeft('')
            s.push(current)
            current = current.getLeftChild()
        elif token in ['*', '/', '+', '-']:
            current.setRootVal(token)
            current.insertRight('')
            s.push(current)
            current = current.getRightChild()
        elif token == ')':
            current = s.pop()
        else:
            # Anything else must be an integer operand; int() raises ValueError otherwise
            current.setRootVal(int(token))
            current = s.pop()
    return t

t = buildParseTree("( ( 10 + 5 ) * 3 )")
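Evaluating a parse tree: once a mathematical expression has been converted to a parse tree, it can be evaluated. The Python code is as follows: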

import operator

def evaluate(parseTree):
    operators = {'+': operator.add, '-': operator.sub,
                 '*': operator.mul, '/': operator.truediv}
    left = parseTree.getLeftChild()
    right = parseTree.getRightChild()
    if left and right:
        # Internal node: apply the operator to the values of both subtrees
        fn = operators[parseTree.getRootVal()]
        return fn(evaluate(left), evaluate(right))
    else:
        # Leaf node: an operand (converted in case it was stored as a string)
        return int(parseTree.getRootVal())
An in-order traversal of the parse tree restores the fully parenthesized expression. The Python code is as follows:

# Convert a parse tree back to a fully parenthesized expression
def printexp(tree):
    val = ''
    if tree:
        if tree.getLeftChild() is None and tree.getRightChild() is None:
            # Operands are printed without surrounding parentheses
            return str(tree.getRootVal())
        val = '(' + printexp(tree.getLeftChild())
        val = val + str(tree.getRootVal())
        val = val + printexp(tree.getRightChild()) + ')'
    return val

t = buildParseTree("( ( 10 + 5 ) * 3 )")
exp = printexp(t)
print(exp)
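3. Tree traversal
Tree traversals include pre-order, in-order and post-order traversal.
Pre-order traversal: visit the root first, then the left subtree, and finally the right subtree (recursively). The Python implementation is as follows: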

def preorder(tree):
    if tree:
        print(tree.getRootVal())
        preorder(tree.getLeftChild())
        preorder(tree.getRightChild())

# Pre-order traversal defined as a method of the class
# def preorder(self):
#     print(self.key)
#     if self.leftChild:
#         self.leftChild.preorder()
#     if self.rightChild:
#         self.rightChild.preorder()
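In-order traversal: visit the left subtree first, then the root, and finally the right subtree (recursively). The Python implementation is as follows: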

# In-order traversal
def inorder(tree):
    if tree:
        inorder(tree.getLeftChild())
        print(tree.getRootVal())
        inorder(tree.getRightChild())
Post-order traversal: visit the left subtree first, then the right subtree, and finally the root. The Python implementation is as follows:

def postorder(tree):
    if tree:
        postorder(tree.getLeftChild())
        postorder(tree.getRightChild())
        print(tree.getRootVal())
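Level-order traversal, tree depth, building a tree from its pre-order and in-order traversals, and checking whether two trees are identical: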

class TreeNode(object):
    def __init__(self, data, leftchild=None, rightchild=None):
        self.data = data
        self.leftchild = leftchild
        self.rightchild = rightchild

    def preorder(self):
        print(self.data)
        if self.leftchild:
            self.leftchild.preorder()
        if self.rightchild:
            self.rightchild.preorder()

    def midorder(self):
        if self.leftchild:
            self.leftchild.midorder()
        print(self.data)
        if self.rightchild:
            self.rightchild.midorder()

t1 = TreeNode(4, TreeNode(3, TreeNode(5, TreeNode(10)), TreeNode(8)), TreeNode(9, TreeNode(7), TreeNode(12)))

# Level-order traversal: print the tree one row at a time
def lookup(root):
    row = [root]
    while row:
        print([x.data for x in row])
        temp = []
        for item in row:
            if item.leftchild:
                temp.append(item.leftchild)
            if item.rightchild:
                temp.append(item.rightchild)
        row = temp

lookup(t1)

# Depth of the tree, counted in nodes
def get_height(root):
    if root is None:
        return 0
    return max(get_height(root.leftchild), get_height(root.rightchild)) + 1

print(get_height(t1))

# Build a tree from its pre-order and in-order traversals
pre = [4, 3, 5, 10, 8, 9, 7, 12]   # output of t1.preorder()
mid = [10, 5, 3, 8, 4, 7, 9, 12]   # output of t1.midorder()

def build(pre, mid):
    if not pre:
        return None
    node = TreeNode(pre[0])
    # Everything left of the root in the in-order list belongs to the left subtree
    index = mid.index(pre[0])
    node.leftchild = build(pre[1:index+1], mid[:index])
    node.rightchild = build(pre[index+1:], mid[index+1:])
    return node

tt = build(pre, mid)
tt.preorder()

# Check whether two trees are identical
t1 = TreeNode(4, TreeNode(3, TreeNode(5, TreeNode(10)), TreeNode(8)), TreeNode(9, TreeNode(7), TreeNode(12)))
t2 = TreeNode(4, TreeNode(3, TreeNode(5, TreeNode(10)), TreeNode(8)), TreeNode(9, TreeNode(7), TreeNode(12)))
t3 = TreeNode(4, TreeNode(3, TreeNode(8, TreeNode(40)), TreeNode(13)), TreeNode(9, TreeNode(7), TreeNode(12)))

def is_same_tree(t1, t2):
    if t1 is None and t2 is None:
        return True
    elif t1 and t2:
        return (is_same_tree(t1.leftchild, t2.leftchild)
                and t1.data == t2.data
                and is_same_tree(t1.rightchild, t2.rightchild))
    else:
        return False

print(is_same_tree(t1, t2))
print(is_same_tree(t1, t3))
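Morris traversal: the pre-order, in-order and post-order traversals above all use recursion, which needs extra stack space proportional to the height of the tree. Morris traversal is non-recursive and has O(1) space complexity, which makes it better suited to very large binary trees.
The steps of the Morris algorithm (in-order traversal) are:
1. If the current node has no left child, print the current node and move to its right child.
2. Otherwise find the predecessor of the current node. If the predecessor's right child is empty, point it at the current node and move to the current node's left child.
3. If the predecessor's right child already points at the current node, reset it to empty, print the current node and move to its right child.
Predecessor: for a given node, the node that comes immediately before it in the in-order traversal.
Finding the predecessor:
If the node has a left child, start at that left child and keep following right-child pointers as far as possible; the node reached is the predecessor.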

class TreeNode(object):
    def __init__(self, data, leftchild=None, rightchild=None):
        self.data = data
        self.leftchild = leftchild
        self.rightchild = rightchild

    def preorder(self):
        print(self.data)
        if self.leftchild:
            self.leftchild.preorder()
        if self.rightchild:
            self.rightchild.preorder()

    def midorder(self):
        if self.leftchild:
            self.leftchild.midorder()
        print(self.data)
        if self.rightchild:
            self.rightchild.midorder()

t1 = TreeNode(4, TreeNode(3, TreeNode(5, TreeNode(10)), TreeNode(8)), TreeNode(9, TreeNode(7), TreeNode(12)))

# Morris in-order traversal
def morris(root):
    cur = root
    while cur is not None:
        if cur.leftchild is None:
            print(cur.data)
            cur = cur.rightchild
        else:
            pre = get_predecessor(cur)
            if pre.rightchild is None:
                # First visit: leave a temporary link back to cur, then go left
                pre.rightchild = cur
                cur = cur.leftchild
            elif pre.rightchild is cur:
                # Second visit: remove the temporary link, print cur, then go right
                pre.rightchild = None
                print(cur.data)
                cur = cur.rightchild

def get_predecessor(node):
    pre = node
    if pre.leftchild is not None:
        pre = pre.leftchild
        # Stop at the rightmost node, or at a temporary link that points back to node
        while pre.rightchild is not None and pre.rightchild is not node:
            pre = pre.rightchild
    return pre

t1.midorder()
print("=" * 20)
morris(t1)
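4. Priority queue and binary heap
Priority queue: a priority queue resembles a queue in that enqueue adds an element and dequeue removes the element at the front. The difference is that every element has a priority and the front element is always the one with the highest (or lowest) priority, so enqueue must also move the new element to its proper position according to its priority. Priority queues are usually implemented with a binary heap, which makes both enqueue and dequeue O(log n). (A plain list also works, but inserting into a list is O(n), and re-sorting the list after each insertion is O(n log n).)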
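As a quick illustration of the behaviour described above, here is a minimal sketch of a priority queue built on Python's standard `heapq` module rather than on the BinaryHeap class developed in this section. The class name `PriorityQueue`, its method names and the sample tasks are illustrative assumptions, not part of the original article.

```python
import heapq

class PriorityQueue:
    """Min-priority queue: the smallest priority value is dequeued first."""

    def __init__(self):
        self._heap = []
        self._count = 0  # tie-breaker so equal priorities leave in insertion order

    def enqueue(self, priority, item):
        # heappush keeps the heap property in O(log n)
        heapq.heappush(self._heap, (priority, self._count, item))
        self._count += 1

    def dequeue(self):
        # heappop removes and returns the smallest entry in O(log n)
        priority, _, item = heapq.heappop(self._heap)
        return item

    def __len__(self):
        return len(self._heap)

pq = PriorityQueue()
pq.enqueue(3, "write report")
pq.enqueue(1, "fix outage")
pq.enqueue(2, "review patch")
order = [pq.dequeue() for _ in range(len(pq))]
print(order)
```

The elements come out ordered by priority regardless of the order in which they were enqueued.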
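Binary heap: a binary heap is a complete binary tree. If the key of every parent is greater than or equal to the keys of its children, it is a max heap; if the key of every parent is less than or equal to the keys of its children, it is a min heap. (Complete binary tree: every level except the last is completely filled, and the last level is missing nodes only on its right side. Full binary tree, in the sense used here and also called a perfect binary tree: every node other than the leaves has two children, all leaves are on the same level, and the number of nodes is the maximum possible.)
An example min heap and its operations are shown below (the value of a parent is always less than or equal to the values of its children):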

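BinaryHeap()     # create an empty binary heap
insert(k)        # insert a new element
findMin()        # return the minimum without removing it
delMin()         # return the minimum and remove it
isEmpty()        # whether the heap is empty
size()           # number of elements in the heap
buildHeap(list)  # build a binary heap from a list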
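In a complete binary tree, if a node has index p, its left and right children have indices 2p and 2p+1. Combined with the figure above, this means a binary heap can be stored in a list whose first element is a placeholder 0. The Python implementation of a min heap is as follows (the first element of heapList is 0 and is never used; it only makes the heap indices start at 1, so that the parent and child positions can be computed with p//2, 2p and 2p+1):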
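The index arithmetic can be tried out on its own. The following sketch checks the min-heap property on a 1-based list using only the 2p and 2p+1 rule; the function name `is_min_heap` and the sample lists are assumptions made for illustration.

```python
def is_min_heap(heap_list):
    """heap_list[0] is an unused placeholder; the heap occupies indices 1..n."""
    n = len(heap_list) - 1
    # Only indices 1..n//2 have at least one child
    for p in range(1, n // 2 + 1):
        left, right = 2 * p, 2 * p + 1
        if heap_list[p] > heap_list[left]:
            return False
        if right <= n and heap_list[p] > heap_list[right]:
            return False
    return True

valid = [0, 5, 9, 11, 14, 18, 19, 21, 33, 17, 27]
broken = [0, 5, 9, 11, 14, 8, 19]   # 8 at index 5 is smaller than its parent 9 at index 2
print(is_min_heap(valid))
print(is_min_heap(broken))
```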

class BinaryHeap(object):
    def __init__(self):
        self.heapList = [0]
        self.size = 0

    # Append the element at the end of the complete binary tree, then move it up as needed
    def insert(self, k):
        self.heapList.append(k)
        self.size = self.size + 1
        self._percUp(self.size)

    # While the current node is smaller than its parent, swap them and continue upward
    def _percUp(self, size):
        i = size
        while i // 2 > 0:   # stop at the root so the placeholder at index 0 is never compared
            if self.heapList[i] < self.heapList[i // 2]:
                self.heapList[i], self.heapList[i // 2] = self.heapList[i // 2], self.heapList[i]
            i = i // 2

    # Return the root, move the last element to the root so the tree stays complete,
    # then move the new root down to its proper position
    def delMin(self):
        temp = self.heapList[1]
        self.heapList[1] = self.heapList[self.size]
        self.size = self.size - 1
        self.heapList.pop()
        self._percDown(1)
        return temp

    # While the current node is larger than its smaller child, swap them and continue downward
    def _percDown(self, i):
        while (2 * i) <= self.size:
            mc = self._minChild(i)
            if self.heapList[i] > self.heapList[mc]:
                self.heapList[i], self.heapList[mc] = self.heapList[mc], self.heapList[i]
            i = mc

    # Return the index of the smaller of the two children
    def _minChild(self, i):
        if (2 * i + 1) > self.size:
            return 2 * i
        else:
            if self.heapList[2 * i] < self.heapList[2 * i + 1]:
                return 2 * i
            else:
                return 2 * i + 1

    # Build a binary heap from a list
    def buildHeap(self, alist):
        i = len(alist) // 2
        self.heapList = [0] + alist[:]
        self.size = len(alist)
        while i > 0:
            self._percDown(i)
            i = i - 1
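The figure below illustrates insert(): the element is appended at the end of the complete binary tree and then moved to its proper position according to its value.
The figure below illustrates delMin(): the root is returned, the last element is moved to the root so that the tree stays complete, and the new root is then moved down to its proper position according to its value.
insert and delMin are both O(log n), and buildHeap is O(n). Sorting a list with a binary heap is O(n log n). The code is as follows: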

# Build a binary heap from the list, then repeatedly remove the top element to get a sorted list
alist = [54, 26, 93, 17, 98, 77, 31, 44, 55, 20]
h = BinaryHeap()
h.buildHeap(alist)
s = []
while h.size > 0:
    s.append(h.delMin())
print(s)

# Heap sort with a min heap stored in a 1-based list
def build_min_heap(alist):
    size = len(alist)
    hq = [0] + alist
    i = len(alist) // 2
    while i > 0:
        movedown(hq, i, size)
        i = i - 1
    return hq

def movedown(hq, i, size):
    while (2 * i) <= size:
        small = 2 * i
        if 2 * i + 1 <= size and hq[2 * i] > hq[2 * i + 1]:
            small = 2 * i + 1
        if hq[i] > hq[small]:
            hq[i], hq[small] = hq[small], hq[i]
        i = small

def heappop(hq):
    temp = hq[1]
    hq[1] = hq[-1]
    hq.pop()
    movedown(hq, 1, len(hq) - 1)
    return temp

alist = [2, 4, 6, 7, 1, 2, 5, 25, 15, 20, 1, 21, 33, 18, 29]
q = build_min_heap(alist)
t = []
for i in range(len(alist)):
    t.append(heappop(q))
print(t)

# In-place heap sort with a max heap stored in a 0-based list
def build_max_heap(alist):
    length = len(alist)
    # Start at the last node that has a child and work back to the root
    for i in range(length // 2 - 1, -1, -1):
        heapify(alist, i, length)

def heapify(alist, i, length):
    left = 2 * i + 1
    right = 2 * i + 2
    largest = i
    if left < length and alist[left] > alist[largest]:
        largest = left
    if right < length and alist[right] > alist[largest]:
        largest = right
    if largest != i:
        swap(alist, i, largest)
        heapify(alist, largest, length)

def swap(alist, i, j):
    alist[i], alist[j] = alist[j], alist[i]

def heapsort(alist):
    length = len(alist)
    build_max_heap(alist)
    for i in range(len(alist) - 1, 0, -1):
        # Move the current maximum to the end, then restore the heap on the remaining prefix
        swap(alist, 0, i)
        length = length - 1
        heapify(alist, 0, length)
    return alist

alist = [2, 4, 6, 7, 1, 2, 5, 80, 10, 9, 25, 15, 20, 1, 21, 33, 18, 29]
print(heapsort(alist))
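5. Binary search tree (BST)
Binary search tree: every key in a node's left subtree is smaller than the node's key, and every key in its right subtree is larger than the node's key (the BST property). This is shown in the figure below: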
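The BST property can be checked mechanically. The sketch below is a hedged illustration using its own small `Node` class (an assumption, separate from the TreeNode class used in this article); it passes lower and upper bounds down the tree so that each value is compared against all of its ancestors, not only its parent.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def is_bst(node, low=None, high=None):
    """Return True if every value lies strictly between the bounds set by its ancestors."""
    if node is None:
        return True
    if low is not None and node.value <= low:
        return False
    if high is not None and node.value >= high:
        return False
    return (is_bst(node.left, low, node.value)
            and is_bst(node.right, node.value, high))

good = Node(18, Node(13, Node(8), Node(16)), Node(28, Node(20), Node(38)))
# 19 is larger than its parent 13, but it sits in the left subtree of 18
bad = Node(18, Node(13, Node(8), Node(19)), Node(28))
print(is_bst(good), is_bst(bad))
```

The second tree shows why comparing each node only with its direct children is not enough.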
The Python implementation of a binary search tree is as follows:

# Binary search tree
class TreeNode(object):
    def __init__(self, value, leftchild=None, rightchild=None, parent=None):
        self.value = value
        self.leftchild = leftchild
        self.rightchild = rightchild
        self.parent = parent

    def is_leaf(self):
        return not self.leftchild and not self.rightchild

    def is_leftchild(self):
        return self.parent is not None and self.parent.leftchild is self

    def is_rightchild(self):
        return self.parent is not None and self.parent.rightchild is self

    def has_both_children(self):
        return self.leftchild and self.rightchild

    def has_left_child(self):
        return self.leftchild

    def has_right_child(self):
        return self.rightchild

    def delete(self):
        # Detach this node from its parent
        if self.is_leftchild():
            self.parent.leftchild = None
        elif self.is_rightchild():
            self.parent.rightchild = None


class BinarySearchTree(object):
    def __init__(self, node=None):
        self.root = node
        self.size = 0

    def length(self):
        return self.size

    def insert(self, value):
        if self.root is None:
            self.root = TreeNode(value)
            self.size += 1
        else:
            self._insert(self.root, value)

    def _insert(self, node, value):
        if node.value > value:
            if node.leftchild:
                self._insert(node.leftchild, value)
            else:
                node.leftchild = TreeNode(value, parent=node)
                self.size += 1
        elif node.value < value:
            if node.rightchild:
                self._insert(node.rightchild, value)
            else:
                node.rightchild = TreeNode(value, parent=node)
                self.size += 1
        else:
            print("%s already exists" % value)

    def search(self, value):
        return self._search(self.root, value)

    def _search(self, node, value):
        if node is None:
            return None
        if node.value > value:
            return self._search(node.leftchild, value)
        elif node.value < value:
            return self._search(node.rightchild, value)
        else:
            return node

    def _replace(self, node, child):
        # Put child (possibly None) where node used to be; also handles the root
        if child:
            child.parent = node.parent
        if node.is_leftchild():
            node.parent.leftchild = child
        elif node.is_rightchild():
            node.parent.rightchild = child
        else:
            self.root = child

    def delete(self, value):
        node = self._search(self.root, value)
        if node is None:
            return None
        if node.is_leaf():
            # The node is a leaf
            self._replace(node, None)
        elif node.has_both_children():
            # The node has two children: copy the successor's value, then remove the successor
            successor = self.find_min(node)
            node.value = successor.value
            # The successor has no left child, so it has at most a right child
            self._replace(successor, successor.rightchild)
        else:
            # The node has exactly one child, which takes its place
            self._replace(node, node.leftchild or node.rightchild)
        self.size -= 1

    def find_min(self, node):
        # Minimum of the right subtree, i.e. the in-order successor of node
        cur = node.rightchild
        while cur.leftchild:
            cur = cur.leftchild
        return cur

    def traverse(self):
        # Level-order traversal, printing one row per line
        row = [self.root] if self.root else []
        while row:
            print([i.value for i in row])
            temp = []
            for node in row:
                if node.leftchild:
                    temp.append(node.leftchild)
                if node.rightchild:
                    temp.append(node.rightchild)
            row = temp


if __name__ == '__main__':
    root = BinarySearchTree()
    root.insert(18)
    root.insert(13)
    root.insert(8)
    root.insert(16)
    root.insert(28)
    root.insert(20)
    root.insert(38)
    root.traverse()
    root.insert(17)
    root.insert(10)
    print(root.search(16))
    print(root.search(12))
    print("*" * 30)
    root.traverse()
    # print("delete leaf")
    # root.delete(10)
    # root.traverse()
    # print("delete node with one child")
    # root.delete(16)
    # root.traverse()
    print("delete node with two children")
    root.delete(13)
    root.traverse()
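Note that node deletion in the code above has three cases:
The node is a leaf: remove it by setting its parent's left or right child reference to None.
The node has one child: the child takes the deleted node's place.
The node has two children: find the node's successor (the leftmost node of its right subtree; the predecessor, the rightmost node of its left subtree, would work equally well) and put it in the deleted node's place.
A binary search tree can be used to implement a map (dictionary). The common operations are: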

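Map()          # create a map
put(key,val)   # insert a key-value pair
get(key)       # get the value stored for a key
del            # delete a key
len()          # number of entries
in             # membership test
The Python implementation of the map is as follows: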

class TreeNode(object):
    def __init__(self, key, value, leftChild=None, rightChild=None, parent=None):
        self.key = key
        self.value = value
        self.leftChild = leftChild
        self.rightChild = rightChild
        self.parent = parent
        self.balanceFactor = 0

    def hasLeftChild(self):
        return self.leftChild

    def hasRightChild(self):
        return self.rightChild

    def isLeftChild(self):
        return self.parent and self.parent.leftChild == self

    def isRightChild(self):
        return self.parent and self.parent.rightChild == self

    def isRoot(self):
        return not self.parent

    def isLeaf(self):
        return not (self.leftChild or self.rightChild)

    def hasAnyChildren(self):
        return self.leftChild or self.rightChild

    def hasBothChildren(self):
        return self.leftChild and self.rightChild

    def replaceNodeData(self, key, value, lc=None, rc=None):
        self.key = key
        self.value = value
        self.leftChild = lc
        self.rightChild = rc
        if self.hasLeftChild():
            self.leftChild.parent = self
        if self.hasRightChild():
            self.rightChild.parent = self

    def __iter__(self):
        # In-order traversal; iterating over a child calls its __iter__, so this is recursive
        if self.hasLeftChild():
            for elem in self.leftChild:
                yield elem
        yield self.key, self.value, self.balanceFactor
        if self.hasRightChild():
            for elem in self.rightChild:
                yield elem

    def findSuccessor(self):
        # Find the in-order successor
        succ = None
        if self.hasRightChild():
            succ = self.rightChild._findMin()
        else:
            if self.parent:
                if self.isLeftChild():
                    succ = self.parent
                else:
                    self.parent.rightChild = None
                    succ = self.parent.findSuccessor()
                    self.parent.rightChild = self
        return succ

    def _findMin(self):
        current = self
        while current.hasLeftChild():
            current = current.leftChild
        return current

    def spliceOut(self):
        # Remove this node, which has at most one child, from the tree
        if self.isLeaf():
            if self.isLeftChild():
                self.parent.leftChild = None
            else:
                self.parent.rightChild = None
        elif self.hasAnyChildren():
            if self.hasLeftChild():
                if self.isLeftChild():
                    self.parent.leftChild = self.leftChild
                else:
                    self.parent.rightChild = self.leftChild
                self.leftChild.parent = self.parent
            else:
                if self.isLeftChild():
                    self.parent.leftChild = self.rightChild
                else:
                    self.parent.rightChild = self.rightChild
                self.rightChild.parent = self.parent


class BinarySearchTree(object):
    def __init__(self):
        self.root = None
        self.size = 0

    def length(self):
        return self.size

    def __len__(self):
        return self.size

    def __iter__(self):
        return iter(self.root) if self.root else iter([])

    # Insert a key-value pair
    def put(self, key, value):
        if self.root:
            node = self._get(key, self.root)
            if node:
                # Existing key: only update the value, the size does not change
                node.value = value
                return
            self._put(key, value, self.root)
        else:
            self.root = TreeNode(key, value)
        self.size = self.size + 1

    def _put(self, key, value, currentNode):
        if currentNode.key < key:
            if currentNode.hasRightChild():
                self._put(key, value, currentNode.rightChild)
            else:
                currentNode.rightChild = TreeNode(key, value, parent=currentNode)
        elif currentNode.key > key:
            if currentNode.hasLeftChild():
                self._put(key, value, currentNode.leftChild)
            else:
                currentNode.leftChild = TreeNode(key, value, parent=currentNode)
        else:
            currentNode.value = value

    def __setitem__(self, key, value):
        self.put(key, value)

    # Get the value stored for a key
    def get(self, key):
        if self.root:
            node = self._get(key, self.root)
            if node:
                return node.value
            else:
                return None
        else:
            return None

    def _get(self, key, currentNode):
        if not currentNode:
            return None
        if currentNode.key == key:
            return currentNode
        elif currentNode.key < key:
            return self._get(key, currentNode.rightChild)   # rightChild may be None
        else:
            return self._get(key, currentNode.leftChild)    # leftChild may be None

    # def _get(self, key, currentNode):
    #     if currentNode.key == key:
    #         return currentNode
    #     elif currentNode.key < key:
    #         if currentNode.hasRightChild():
    #             return self._get(key, currentNode.rightChild)
    #         else:
    #             return None
    #     else:
    #         if currentNode.hasLeftChild():
    #             return self._get(key, currentNode.leftChild)
    #         else:
    #             return None

    def __getitem__(self, key):
        return self.get(key)

    def __contains__(self, key):
        # Implements the in operator
        if self._get(key, self.root):
            return True
        else:
            return False

    def delete(self, key):
        if self.size > 1:
            node = self._get(key, self.root)
            if node:
                self._del(node)
                self.size = self.size - 1
            else:
                raise KeyError('Error, key not in tree')
        elif self.size == 1 and self.root.key == key:
            self.root = None
            self.size = self.size - 1
        else:
            raise KeyError('Error, key not in tree')

    def _del(self, currentNode):
        if currentNode.isLeaf():
            if currentNode.isLeftChild():
                currentNode.parent.leftChild = None
            elif currentNode.isRightChild():
                currentNode.parent.rightChild = None
        elif currentNode.hasBothChildren():
            # The successor is the minimum of the right subtree, i.e. its leftmost node
            successor = currentNode.findSuccessor()
            successor.spliceOut()
            currentNode.key = successor.key
            currentNode.value = successor.value
        elif currentNode.hasAnyChildren():
            if currentNode.hasLeftChild():
                if currentNode.isLeftChild():
                    currentNode.parent.leftChild = currentNode.leftChild
                    currentNode.leftChild.parent = currentNode.parent
                elif currentNode.isRightChild():
                    currentNode.parent.rightChild = currentNode.leftChild
                    currentNode.leftChild.parent = currentNode.parent
                else:
                    # currentNode has no parent (is root)
                    currentNode.replaceNodeData(currentNode.leftChild.key,
                                                currentNode.leftChild.value,
                                                currentNode.leftChild.leftChild,
                                                currentNode.leftChild.rightChild)
            elif currentNode.hasRightChild():
                if currentNode.isLeftChild():
                    currentNode.parent.leftChild = currentNode.rightChild
                    currentNode.rightChild.parent = currentNode.parent
                elif currentNode.isRightChild():
                    currentNode.parent.rightChild = currentNode.rightChild
                    currentNode.rightChild.parent = currentNode.parent
                else:
                    # currentNode has no parent (is root)
                    currentNode.replaceNodeData(currentNode.rightChild.key,
                                                currentNode.rightChild.value,
                                                currentNode.rightChild.leftChild,
                                                currentNode.rightChild.rightChild)

    def __delitem__(self, key):
        self.delete(key)


if __name__ == '__main__':
    mytree = BinarySearchTree()
    mytree[8] = "red"
    mytree[4] = "blue"
    mytree[6] = "yellow"
    mytree[5] = "at"
    mytree[9] = "cat"
    mytree[11] = "mat"
    print(mytree[6])
    print(mytree[5])
    for x in mytree:
        print(x)
    del mytree[6]
    print('-' * 12)
    for x in mytree:
        print(x)
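The most complicated part of the code above is deletion, which has three cases: the node is a leaf, the node has two children, or the node has one child. When a node with two children is deleted, it should be replaced by the minimum of its right subtree (the leftmost node of the right subtree).
A complexity analysis of the map shows that put and get depend on the height of the tree. When keys arrive in random order the complexity is O(log n), but when the tree becomes unbalanced the complexity degrades to O(n), as shown in the figure below: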
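The effect of insertion order on height can be observed with a small experiment. This is a minimal sketch that uses plain dictionaries as nodes instead of the TreeNode class above; the helper names `insert` and `height`, the key range and the random seed are all assumptions chosen for illustration.

```python
import random

def insert(tree, key):
    """tree is a nested dict {'key', 'left', 'right'} or None; returns the root."""
    if tree is None:
        return {'key': key, 'left': None, 'right': None}
    node = tree
    while True:
        side = 'left' if key < node['key'] else 'right'
        if node[side] is None:
            node[side] = {'key': key, 'left': None, 'right': None}
            return tree
        node = node[side]

def height(tree):
    """Number of levels in the tree, computed row by row to avoid deep recursion."""
    levels = 0
    row = [tree] if tree else []
    while row:
        levels += 1
        row = [c for n in row for c in (n['left'], n['right']) if c]
    return levels

keys = list(range(1, 201))
sorted_tree = None
for k in keys:                 # keys arrive in ascending order
    sorted_tree = insert(sorted_tree, k)

random.seed(0)
random.shuffle(keys)
shuffled_tree = None
for k in keys:                 # the same keys in random order
    shuffled_tree = insert(shuffled_tree, k)

print(height(sorted_tree))     # 200: every node has only a right child
print(height(shuffled_tree))   # far smaller than 200
```

With sorted input the tree degenerates into a linked list, which is exactly the O(n) case described above.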
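6. Balanced binary search tree (AVL tree)
Balanced binary search tree: also called an AVL tree, named after its inventors G.M. Adelson-Velskii and E.M. Landis. It adds a balance factor to the binary search tree and keeps the tree balanced on every insertion and deletion, which avoids the O(n) behaviour of an unbalanced binary search tree described above. The balance factor of a node is the height of its left subtree minus the height of its right subtree: balanceFactor = height(leftSubTree) - height(rightSubTree).
A tree is balanced when the balance factor of every node is -1, 0 or 1. When a balance factor is greater than 1 or less than -1 the tree is unbalanced and must be adjusted. The figure below shows the balance factor of each node in a tree. (1 means left-heavy, 0 means perfectly balanced, -1 means right-heavy.)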
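The definition can be turned into a direct, if inefficient, computation. The following sketch recomputes heights from scratch instead of storing a balanceFactor attribute as the implementation later in this section does; the `Node` class and function names are assumptions made for illustration.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def height(node):
    """Height counted in edges; an empty subtree has height -1."""
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))

def balance_factor(node):
    # Height of the left subtree minus height of the right subtree
    return height(node.left) - height(node.right)

def is_balanced(node):
    """True if every node in the tree has a balance factor of -1, 0 or 1."""
    if node is None:
        return True
    return (abs(balance_factor(node)) <= 1
            and is_balanced(node.left) and is_balanced(node.right))

chain = Node('A', right=Node('B', right=Node('C')))   # right-leaning chain
balanced = Node('B', Node('A'), Node('C'))
print(balance_factor(chain), is_balanced(chain))        # -2 False
print(balance_factor(balanced), is_balanced(balanced))  # 0 True
```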
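Compared with a plain binary search tree, an AVL tree must update the balance factors after every put and delete. If a node becomes unbalanced, it has to be rebalanced, which is done with left rotations and right rotations.
Left rotation: in the figure, node A has a balance factor of -2 (right-heavy) and is unbalanced. It is rotated to the left, that is, with A as the pivot, the edge AB is rotated counterclockwise.
In detail: 1. A's right child B becomes the new root of the subtree.
2. A becomes B's left child. If B already had a left child, that child becomes A's right child (A's right child used to be B, so that position is now free).
Right rotation: in the figure, node E has a balance factor of 2 (left-heavy) and is unbalanced. It is rotated to the right, that is, with E as the pivot, the edge EC is rotated clockwise.
In detail: 1. E's left child C becomes the new root of the subtree.
2. E becomes C's right child. If C already had a right child, that child becomes E's left child (E's left child used to be C, so that position is now free).
Special case: in the situation shown in the figure below, A is still right-heavy, but a left rotation alone leaves the subtree left-heavy, so it cannot restore balance. A check is therefore needed before every left or right rotation:
1. If a node needs a left rotation (it is right-heavy), check the balance factor of its right child. If the right child is left-heavy, first rotate the right child to the right, then rotate the node to the left.
2. If a node needs a right rotation (it is left-heavy), check the balance factor of its left child. If the left child is right-heavy, first rotate the left child to the left, then rotate the node to the right.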
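The rotation rules and the special case can be sketched on a stripped-down node without parent pointers or stored balance factors. This is an illustrative simplification, not the implementation used in this article; the names `rotate_left`, `rotate_right` and `rebalance`, and the node labels A, B, C in the example, are assumptions.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def height(node):
    return -1 if node is None else 1 + max(height(node.left), height(node.right))

def balance_factor(node):
    return height(node.left) - height(node.right)

def rotate_left(node):
    """Promote node.right to subtree root; returns the new root."""
    new_root = node.right
    node.right = new_root.left    # new_root's old left child moves under node
    new_root.left = node
    return new_root

def rotate_right(node):
    """Promote node.left to subtree root; returns the new root."""
    new_root = node.left
    node.left = new_root.right    # new_root's old right child moves under node
    new_root.right = node
    return new_root

def rebalance(node):
    """Apply the single or double rotation rule; returns the new subtree root."""
    if balance_factor(node) < -1:             # right-heavy
        if balance_factor(node.right) > 0:    # right child is left-heavy
            node.right = rotate_right(node.right)
        return rotate_left(node)
    if balance_factor(node) > 1:              # left-heavy
        if balance_factor(node.left) < 0:     # left child is right-heavy
            node.left = rotate_left(node.left)
        return rotate_right(node)
    return node

# Zig-zag shape: A is right-heavy, but its right child C is left-heavy
single = rotate_left(Node('A', right=Node('C', left=Node('B'))))
fixed = rebalance(Node('A', right=Node('C', left=Node('B'))))
print(single.key, balance_factor(single))   # C 2: still unbalanced, now left-heavy
print(fixed.key, balance_factor(fixed))     # B 0: the double rotation restores balance
```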
The Python implementation of the AVL tree is as follows:

from binarySearchTree import TreeNode, BinarySearchTree

# class AVLTreeNode(TreeNode):
#
#     def __init__(self, *args, **kwargs):
#         self.balanceFactor = 0
#         super(AVLTreeNode, self).__init__(*args, **kwargs)

class AVLTree(BinarySearchTree):

    def _put(self, key, value, currentNode):
        if currentNode.key < key:
            if currentNode.hasRightChild():
                self._put(key, value, currentNode.rightChild)
            else:
                currentNode.rightChild = TreeNode(key, value, parent=currentNode)
                self.updateBalance(currentNode.rightChild)
        elif currentNode.key > key:
            if currentNode.hasLeftChild():
                self._put(key, value, currentNode.leftChild)
            else:
                currentNode.leftChild = TreeNode(key, value, parent=currentNode)
                self.updateBalance(currentNode.leftChild)
        else:
            currentNode.value = value

    def _del(self, currentNode):
        # Note: only the balance factor of the removed node's parent is updated and checked here
        if currentNode.isLeaf():
            if currentNode.isLeftChild():
                currentNode.parent.leftChild = None
                currentNode.parent.balanceFactor -= 1
            elif currentNode.isRightChild():
                currentNode.parent.rightChild = None
                currentNode.parent.balanceFactor += 1
            if currentNode.parent.balanceFactor > 1 or currentNode.parent.balanceFactor < -1:
                self.reblance(currentNode.parent)
        elif currentNode.hasBothChildren():
            # The successor is the minimum of the right subtree, i.e. its leftmost node
            successor = currentNode.findSuccessor()
            # Update the balance factor of the successor's parent first
            if successor.isLeftChild():
                successor.parent.balanceFactor -= 1
            elif successor.isRightChild():
                successor.parent.balanceFactor += 1
            successor.spliceOut()
            currentNode.key = successor.key
            currentNode.value = successor.value
            # After the removal, check whether rebalancing is needed
            if successor.parent.balanceFactor > 1 or successor.parent.balanceFactor < -1:
                self.reblance(successor.parent)
        elif currentNode.hasAnyChildren():
            # Update the parent's balance factor first
            if currentNode.isLeftChild():
                currentNode.parent.balanceFactor -= 1
            elif currentNode.isRightChild():
                currentNode.parent.balanceFactor += 1
            if currentNode.hasLeftChild():
                if currentNode.isLeftChild():
                    currentNode.parent.leftChild = currentNode.leftChild
                    currentNode.leftChild.parent = currentNode.parent
                elif currentNode.isRightChild():
                    currentNode.parent.rightChild = currentNode.leftChild
                    currentNode.leftChild.parent = currentNode.parent
                else:
                    # currentNode has no parent (is root)
                    currentNode.replaceNodeData(currentNode.leftChild.key,
                                                currentNode.leftChild.value,
                                                currentNode.leftChild.leftChild,
                                                currentNode.leftChild.rightChild)
            elif currentNode.hasRightChild():
                if currentNode.isLeftChild():
                    currentNode.parent.leftChild = currentNode.rightChild
                    currentNode.rightChild.parent = currentNode.parent
                elif currentNode.isRightChild():
                    currentNode.parent.rightChild = currentNode.rightChild
                    currentNode.rightChild.parent = currentNode.parent
                else:
                    # currentNode has no parent (is root)
                    currentNode.replaceNodeData(currentNode.rightChild.key,
                                                currentNode.rightChild.value,
                                                currentNode.rightChild.leftChild,
                                                currentNode.rightChild.rightChild)
            # After the removal, check whether rebalancing is needed
            if currentNode.parent is not None:   # not the root
                if currentNode.parent.balanceFactor > 1 or currentNode.parent.balanceFactor < -1:
                    self.reblance(currentNode.parent)

    def updateBalance(self, node):
        if node.balanceFactor > 1 or node.balanceFactor < -1:
            self.reblance(node)
            return
        if node.parent is not None:
            if node.isLeftChild():
                node.parent.balanceFactor += 1
            elif node.isRightChild():
                node.parent.balanceFactor -= 1
            # A parent whose balance factor became 0 did not change height, so stop there
            if node.parent.balanceFactor != 0:
                self.updateBalance(node.parent)

    def reblance(self, node):
        if node.balanceFactor > 1:
            # Left-heavy: a right-heavy left child needs a left rotation first
            if node.leftChild.balanceFactor < 0:
                self.rotateLeft(node.leftChild)
            self.rotateRight(node)
        elif node.balanceFactor < -1:
            # Right-heavy: a left-heavy right child needs a right rotation first
            if node.rightChild.balanceFactor > 0:
                self.rotateRight(node.rightChild)
            self.rotateLeft(node)

    def rotateLeft(self, node):
        newroot = node.rightChild
        node.rightChild = newroot.leftChild
        if newroot.hasLeftChild():
            newroot.leftChild.parent = node
        newroot.parent = node.parent
        if node.parent is not None:
            if node.isLeftChild():
                node.parent.leftChild = newroot
            elif node.isRightChild():
                node.parent.rightChild = newroot
        else:
            self.root = newroot
        newroot.leftChild = node
        node.parent = newroot
        node.balanceFactor = node.balanceFactor + 1 - min(newroot.balanceFactor, 0)
        newroot.balanceFactor = newroot.balanceFactor + 1 + max(node.balanceFactor, 0)

    def rotateRight(self, node):
        newroot = node.leftChild
        node.leftChild = newroot.rightChild
        if newroot.rightChild is not None:
            newroot.rightChild.parent = node
        newroot.parent = node.parent
        if node.parent is not None:
            if node.isLeftChild():
                node.parent.leftChild = newroot
            elif node.isRightChild():
                node.parent.rightChild = newroot
        else:
            self.root = newroot
        newroot.rightChild = node
        node.parent = newroot
        node.balanceFactor = node.balanceFactor - 1 - max(newroot.balanceFactor, 0)
        newroot.balanceFactor = newroot.balanceFactor - 1 + min(node.balanceFactor, 0)


if __name__ == '__main__':
    mytree = AVLTree()
    mytree[8] = "red"
    mytree[4] = "blue"
    mytree[6] = "yellow"
    mytree[5] = "at"
    mytree[9] = "cat"
    mytree[11] = "mat"
    print(mytree[6])
    print(mytree[5])
    print('-' * 12)
    print('key', 'value', 'balanceFactor')
    for x in mytree:
        print(x)
    print('root:', mytree.root.key)
    del mytree[6]
    print('-' * 12)
    print('key', 'value', 'balanceFactor')
    for x in mytree:
        print(x)
    print('root:', mytree.root.key)
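AVLTree inherits from the binary search tree and overrides its insertion and deletion methods; TreeNode also gains a balanceFactor attribute. During left and right rotations the balance factors have to be recomputed. In the left rotation shown in the figure, D becomes the new root and only the balance factors of B and D change, so only those two need to be updated. (Right rotation is analogous.)
The balance factor of B is derived as follows (newBal(B) is the balance factor of B after the left rotation, oldBal(B) is the balance factor of B before it, and h is the height of a node):
The balance factor of D is derived as follows: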
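Both derivations can be written out as below. The figure is not reproduced here, so the layout is an assumption based on the standard presentation of this rotation: before the rotation B is the subtree root with left subtree A and right child D, and D has left subtree C and right subtree E; after the rotation D is the root, B is its left child with subtrees A and C, and E remains D's right subtree. The subtrees A, C and E themselves do not change height. The results agree with the two update lines in rotateLeft.

```latex
\begin{aligned}
newBal(B) &= h_A - h_C \\
oldBal(B) &= h_A - h_D, \qquad h_D = 1 + \max(h_C, h_E), \qquad oldBal(D) = h_C - h_E \\
newBal(B) - oldBal(B) &= h_D - h_C = 1 + \max(h_C, h_E) - h_C \\
  &= 1 + \max(0,\; h_E - h_C) = 1 + \max(0,\; -oldBal(D)) = 1 - \min(0,\; oldBal(D)) \\
newBal(B) &= oldBal(B) + 1 - \min(oldBal(D),\, 0) \\[6pt]
newBal(D) &= h_B^{new} - h_E, \qquad h_B^{new} = 1 + \max(h_A, h_C) \\
newBal(D) - oldBal(D) &= h_B^{new} - h_C = 1 + \max(h_A - h_C,\; 0) = 1 + \max(newBal(B),\, 0) \\
newBal(D) &= oldBal(D) + 1 + \max(newBal(B),\, 0)
\end{aligned}
```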
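Because an AVL tree always stays balanced, its put and get operations remain O(log n).
7. Summary
So far the map (dictionary) data structure has been implemented with a binary search tree and an AVL tree, and earlier with a sorted list and a hash table. The complexity of the corresponding operations is as follows (worst case for the sorted list and the two trees, average case for the hash table):
operation   sorted list   hash table   binary search tree   AVL tree
put         O(n)          O(1)         O(n)                 O(log n)
get         O(log n)      O(1)         O(n)                 O(log n)
in          O(log n)      O(1)         O(n)                 O(log n)
del         O(n)          O(1)         O(n)                 O(log n)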
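8. Other tree structures
8.1 Huffman tree and Huffman coding
Reference: http://www.cnblogs.com/mcgrady/p/3329825.html
Huffman tree: a Huffman tree is a binary tree with the minimum weighted path length, also called an optimal binary tree. (Weight: the weight of a leaf node. Path: the edges traversed from the root to a leaf. The weighted path length, WPL, is the sum over all leaves of weight times path length.)
The weighted path lengths of the trees in the figure below are:
Figure a: WPL=5*2+7*2+2*2+13*2=54
Figure b: WPL=5*3+2*3+7*2+13*1=48
Figure b has the smaller weighted path length, and it can be proved that figure b is a Huffman tree (also called an optimal binary tree).
Steps for building a Huffman tree:
1. Treat every weight as the root of a tree whose left and right subtrees are both empty; together these trees form a forest.
2. Pick the two trees in the forest whose roots have the smallest weights and make them the left and right subtrees of a new tree. The weight of the new root is the sum of the weights of the roots of its left and right subtrees. Note that the weight of the left subtree should be smaller than the weight of the right subtree.
3. Remove these two trees from the forest and add the new tree to the forest.
4. Repeat steps 2 and 3 until only one tree remains in the forest; that tree is the Huffman tree.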
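The WPL figures quoted above and the construction steps can be checked with a short sketch. The (weight, depth) pairs are read off the two WPL sums; the function names are assumptions. `huffman_wpl` relies on the fact that each merge creates one internal node and that the sum of all internal node weights equals the weighted path length of the resulting tree.

```python
import heapq

def weighted_path_length(leaves):
    """leaves: iterable of (weight, depth) pairs, depth counted in edges from the root."""
    return sum(weight * depth for weight, depth in leaves)

def huffman_wpl(weights):
    """WPL of the tree built by repeatedly merging the two lightest roots."""
    heap = list(weights)
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        # Steps 2 and 3: remove the two lightest trees, add the merged tree back
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        total += merged
        heapq.heappush(heap, merged)
    return total

tree_a = [(5, 2), (7, 2), (2, 2), (13, 2)]
tree_b = [(5, 3), (2, 3), (7, 2), (13, 1)]
print(weighted_path_length(tree_a))   # 54
print(weighted_path_length(tree_b))   # 48
print(huffman_wpl([5, 7, 2, 13]))     # 48
```

The construction reaches the same value as figure b, which is consistent with figure b being a Huffman tree for these weights.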
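The figures below illustrate the construction of a Huffman tree:
Huffman coding: the binary code for communication that is derived from a Huffman tree is called a Huffman code. There is a path from the root to every leaf. By convention a branch to a left subtree stands for "0" and a branch to a right subtree stands for "1"; the sequence of "0"s and "1"s along each path is the code of the character stored in that leaf. This is the Huffman code.
The Huffman codes of A, B, C and D in the figure above are 111, 10, 110 and 0 respectively, as illustrated below:
Encoding and decoding a string with a Huffman tree: first count how often each character occurs in the string, build a Huffman tree with the character frequencies as weights, derive the Huffman code of each character, and finally encode the string. The code below encodes and decodes a string with a Huffman tree.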

import heapq

# Huffman tree node
class HaffmanNode(object):
    def __init__(self, value=None, weight=None, leftchild=None, rightchild=None):
        # value is the character, weight is the frequency of the character
        self.value = value
        self.weight = weight
        self.leftchild = leftchild
        self.rightchild = rightchild

    def is_leaf(self):
        # Whether this node is a leaf
        return not self.leftchild and not self.rightchild

    def __lt__(self, other):
        # Lets heapq compare two nodes by weight
        return self.weight < other.weight

# Derive the Huffman codes from the Huffman tree
def get_haffman_code(root, code, code_dict1, code_dict2):
    if root.is_leaf():
        code_dict1[root.value] = code   # used for encoding
        code_dict2[code] = root.value   # used for decoding
    else:
        get_haffman_code(root.leftchild, code + '0', code_dict1, code_dict2)
        get_haffman_code(root.rightchild, code + '1', code_dict1, code_dict2)

# Build the Huffman tree from the character frequencies
def build_haffman_tree(weight_dict):
    hp = []
    for value, weight in weight_dict.items():
        heapq.heappush(hp, HaffmanNode(value, weight))
    while len(hp) > 1:
        # Merge the two lightest trees; the lighter one becomes the left subtree
        left = heapq.heappop(hp)
        right = heapq.heappop(hp)
        parent = HaffmanNode(weight=left.weight + right.weight, leftchild=left, rightchild=right)
        heapq.heappush(hp, parent)
    return hp[0]   # the last remaining element is the Huffman tree

weight_dict = {}
code_dict1 = {}
code_dict2 = {}

# Huffman-encode the string astr
def haff_encode(astr):
    for i in astr:
        if i not in weight_dict:
            weight_dict[i] = 1
        else:
            weight_dict[i] += 1
    haffman_tree = build_haffman_tree(weight_dict)
    get_haffman_code(haffman_tree, '', code_dict1, code_dict2)
    encoded_astr = ''
    for i in astr:
        encoded_astr += code_dict1[i]
    return encoded_astr

# Decode a Huffman-encoded string
def haff_decode(encoded_astr, code_dict2):
    code = ''
    astr = ''
    for i in encoded_astr:
        code = code + i
        # No code is a prefix of another, so the first match is the right character
        if code in code_dict2:
            astr += code_dict2[code]
            code = ''
    return astr

astr = "This is my big fancy house"
encoded_astr = haff_encode(astr)
print(encoded_astr)
decoded_astr = haff_decode(encoded_astr, code_dict2)
print(decoded_astr)
Compressing and decompressing files with a Huffman tree:
References: https://www.jianshu.com/p/4cbbfed4160b
https://github.com/gg-z/huffman_coding
https://gist.github.com/Arianxx/603dc688a4b68f207ada2c4534758637
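8.2 Trie (dictionary tree)
Trie: also called a dictionary tree or prefix tree. It stores the characters of words and is convenient for word-frequency counting and prefix matching. A trie is shown in the figure below:
Characteristics of a trie:
Every node except the root contains a character.
The characters on the path from the root to a leaf form a complete word.
The nodes on the path shared by several words are their common prefix.
Uses of a trie:
It saves storage space, because common prefixes are stored only once.
Prefix matching is faster: the time complexity is O(n), where n is the length of the word.
The code below implements a simple trie in Python.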

# Trie (dictionary tree)
class TrieNode(object):
    def __init__(self, char):
        self.char = char
        self.child = []
        self.is_leaf = False   # whether this node is the last letter of a complete word
        self.counter = 1       # how many words share the prefix that ends at this node

class TrieTree(object):
    def __init__(self):
        self.root = TrieNode(None)

    # Add a word to the trie
    def add_trie_word(self, word):
        root = self.root
        for char in word:
            found = False
            for node in root.child:
                if node.char == char:
                    node.counter += 1
                    root = node
                    found = True
                    break
            if not found:
                temp = TrieNode(char)
                root.child.append(temp)
                root = temp
        root.is_leaf = True

    # Check whether a prefix is in the trie and return how many words share it
    def search_trie_prefix(self, prefix):
        root = self.root
        if not root.child:
            return False, 0
        for char in prefix:
            found = False
            for node in root.child:
                if node.char == char:
                    root = node
                    found = True
                    break
            if not found:
                return False, 0
        return True, root.counter

trie_tree = TrieTree()
trie_tree.add_trie_word("hammer")
trie_tree.add_trie_word("ham")
trie_tree.add_trie_word("had")
print(trie_tree.search_trie_prefix("ha"))
print(trie_tree.search_trie_prefix("ham"))
print(trie_tree.search_trie_prefix("had"))
print(trie_tree.search_trie_prefix("b"))
Trie references: https://www.cnblogs.com/huangxincheng/archive/2012/11/25/2788268.html
https://towardsdatascience.com/implementing-a-trie-data-structure-in-python-in-less-than-100-lines-of-code-a877ea23c1a1
Reference: http://interactivepython.org/runestone/static/pythonds/Trees/toctree.html