Using the key parameter of x.sort and sorted
Introduction
In Python, lists come with their own sorting method, sort, which sorts the list in place:
>>> l = [1, 3, 2]
>>> l.sort()
>>> l
[1, 2, 3]
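One detail that is easy to miss: list.sort() modifies the list in place and returns None, while the built-in sorted() leaves its input untouched and returns a new list. A minimal sketch of the difference (the variable names are illustrative only):

```python
# list.sort() mutates the list in place and returns None.
numbers = [1, 3, 2]
result = numbers.sort()
print(result)       # None
print(numbers)      # [1, 2, 3]

# sorted() leaves the original untouched and returns a new list.
original = [1, 3, 2]
copy_sorted = sorted(original)
print(original)     # [1, 3, 2]
print(copy_sorted)  # [1, 2, 3]

# Both accept reverse=True for descending order.
descending = sorted(original, reverse=True)
print(descending)   # [3, 2, 1]
```

Writing `l = l.sort()` is therefore a common mistake, because it rebinds `l` to None.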
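Other containers such as dicts, tuples and sets have no sort method, but they can be sorted with the built-in function sorted. Note that the result is always a list. Iterating over a dict yields its keys, so by default sorted orders a dict by key.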
>>> # tuple sort
>>> t = (1, 3, 2)
>>> sorted(t)
[1, 2, 3]
>>>
>>> # set sort
>>> s = {1, 3, 2}
>>> sorted(s)
[1, 2, 3]
>>>
>>> # dict sort
>>> d = {1:100, 3:200, 2: 0}
>>> sorted(d)
[1, 2, 3]
>>> sorted(d.values())
[0, 100, 200]
>>> sorted(d.items())
[(1, 100), (2, 0), (3, 200)]
>>>
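Since sorted(d) only looks at the keys, ordering a dict by its values needs the key parameter. A small sketch using the same dict as above (the result names are made up for illustration):

```python
d = {1: 100, 3: 200, 2: 0}

# Keys ordered by their associated value: d.get maps each key to its value.
keys_by_value = sorted(d, key=d.get)
print(keys_by_value)     # [2, 1, 3]

# (key, value) pairs ordered by value.
items_by_value = sorted(d.items(), key=lambda kv: kv[1])
print(items_by_value)    # [(2, 0), (1, 100), (3, 200)]

# Rebuild a dict that iterates in value order
# (dicts preserve insertion order in Python 3.7+).
d_by_value = dict(items_by_value)
print(list(d_by_value))  # [2, 1, 3]
```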
Using the key parameter
First, take a look at the documentation for sorted:
>>> help(sorted)
Help on built-in function sorted in module builtins:
sorted(iterable, /, *, key=None, reverse=False)
Return a new list containing all items from the iterable in ascending order.
A custom key function can be supplied to customize the sort order, and the
reverse flag can be set to request the result in descending order.
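Before the larger example, here is a small sketch of what key and reverse do, using a made-up list of words. The key function is applied to each item, and the items are ordered by the returned values rather than by the items themselves:

```python
words = ["banana", "apple", "Cherry"]

# Default ordering compares strings by code point, so uppercase sorts first.
default_order = sorted(words)
print(default_order)     # ['Cherry', 'apple', 'banana']

# key=str.lower gives a case-insensitive ordering.
case_insensitive = sorted(words, key=str.lower)
print(case_insensitive)  # ['apple', 'banana', 'Cherry']

# key=len orders by length; ties keep their original relative order (stable sort).
by_length = sorted(words, key=len)
print(by_length)         # ['apple', 'banana', 'Cherry']

# reverse=True flips the order but still keeps ties in their original order.
by_length_desc = sorted(words, key=len, reverse=True)
print(by_length_desc)    # ['banana', 'Cherry', 'apple']
```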
The key parameter is a function that supports custom sort orders. Let's start with a scenario that uses it. Suppose we have a payroll list:
Name | Salary | Age |
---|---|---|
Tom | 4000 | 20 |
Jerry | 4000 | 24 |
Bob | 5000 | 28 |
We want to sort by salary from highest to lowest and, when salaries are equal, by age from youngest to oldest:
>>> salary_list = [
... ("Tom", 4000, 20),
... ("Jerry", 4000, 24),
... ("Bob", 5000, 28),
... ]
>>> # Custom key function (not a comparison function); returns the tuple (-salary, age)
>>> def mycmp(salary_item: tuple) -> tuple:
...     return (-salary_item[1], salary_item[2])
...
>>> sorted(salary_list, key=mycmp)
[('Bob', 5000, 28), ('Tom', 4000, 20), ('Jerry', 4000, 24)]
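Negating a field only works for numeric values. As an alternative sketch, the same ordering can be obtained with two stable sorts: first by the secondary criterion, then by the primary one with reverse=True. This relies on Python's sort being stable; operator.itemgetter is a standard-library helper, and the intermediate names by_age and ranked are made up for illustration.

```python
from operator import itemgetter

salary_list = [
    ("Tom", 4000, 20),
    ("Jerry", 4000, 24),
    ("Bob", 5000, 28),
]

# Pass 1: secondary criterion, age ascending.
by_age = sorted(salary_list, key=itemgetter(2))

# Pass 2: primary criterion, salary descending. Because the sort is stable,
# rows with equal salary keep the age order established in pass 1.
ranked = sorted(by_age, key=itemgetter(1), reverse=True)
print(ranked)
# [('Bob', 5000, 28), ('Tom', 4000, 20), ('Jerry', 4000, 24)]

# Same result as the single-pass tuple key with a negated salary.
single_pass = sorted(salary_list, key=lambda row: (-row[1], row[2]))
print(ranked == single_pass)  # True
```

The two-pass form is handy when the descending field is a string or another type that cannot be negated.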
So why does this work?
Let's implement a class, CmpObject, and add debug output to some of its magic methods to see how sorted actually performs its comparisons:
class CmpObject:
    """Wraps a value and prints a trace whenever it is negated or compared."""

    def __init__(self, name, val):
        self.name = name
        self.val = val

    def __neg__(self):
        print("called __neg__", self.name, self.val)
        return CmpObject(self.name, -self.val)

    def __eq__(self, other):
        print("called __eq__")
        return self.get_val() == other.get_val()

    def __lt__(self, other):
        print("called __lt__")
        return self.get_val() < other.get_val()

    def __gt__(self, other):
        print("called __gt__")
        return self.get_val() > other.get_val()

    def get_val(self):
        # Traced accessor, so the log shows which values are being compared.
        print("called get_val", self.name, self.val)
        return self.val

    def __repr__(self):
        return f"{self.val}"
>>> # Initialize the payroll list
>>> salary_list = [
... ("Tom", CmpObject("Tom", 4000), CmpObject("Tom", 20)),
... ("Jerry", CmpObject("Jerry", 4000), CmpObject("Jerry", 24)),
... ("Bob", CmpObject("Bob", 5000), CmpObject("Bob", 28)),
... ]
>>> salary_list
[('Tom', 4000, 20), ('Jerry', 4000, 24), ('Bob', 5000, 28)]
>>> # Reuse the same mycmp key function
>>> def mycmp(salary_item: tuple) -> tuple:
...     return (-salary_item[1], salary_item[2])
...
>>> # Run the sort
>>> sorted(salary_list, key=mycmp)
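# Let's walk through the order of the comparisons
# sorted first computes the key for every element and stores it for later comparisons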
called __neg__ Tom 4000
called __neg__ Jerry 4000
called __neg__ Bob 5000
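# First, -4000 is compared with -4000, i.e. Jerry's and Tom's salaries. Because we sort by salary in descending order (highest first), the negated salaries are what get compared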
called __eq__
called get_val Jerry -4000
called get_val Tom -4000
# Jerry's and Tom's salaries are equal, so their ages are compared next
called __eq__
called get_val Jerry 24
called get_val Tom 20
# The ages differ, and Jerry's age (24) is not less than Tom's (20), so Jerry stays after Tom
called __lt__
called get_val Jerry 24
called get_val Tom 20
# Next, Bob's and Jerry's salaries are compared. Bob's salary is higher, so Jerry goes after Bob
called __eq__
called get_val Bob -5000
called get_val Jerry -4000
called __lt__
called get_val Bob -5000
called get_val Jerry -4000
# Bob and Jerry are compared once more (a detail of how the sorting algorithm proceeds)
called __eq__
called get_val Bob -5000
called get_val Jerry -4000
called __lt__
called get_val Bob -5000
called get_val Jerry -4000
# Finally, Bob and Tom are compared. Bob's salary is higher than Tom's, so Tom goes after Bob
called __eq__
called get_val Bob -5000
called get_val Tom -4000
called __lt__
called get_val Bob -5000
called get_val Tom -4000
# Final order: Bob, Tom, Jerry
[('Bob', 5000, 28), ('Tom', 4000, 20), ('Jerry', 4000, 24)]
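The trace shows __neg__ being called exactly three times before any comparison happens, which suggests that the key function runs once per element rather than once per comparison. A quick way to check this is a key function that records its calls. This is only a sketch; logged_key and calls are hypothetical names introduced here.

```python
calls = []

def logged_key(item):
    # Record each invocation so the calls can be counted afterwards.
    calls.append(item[0])
    return (-item[1], item[2])

salary_list = [
    ("Tom", 4000, 20),
    ("Jerry", 4000, 24),
    ("Bob", 5000, 28),
]

result = sorted(salary_list, key=logged_key)
print(len(calls))  # 3: one call per element, however many comparisons follow
print(result)
# [('Bob', 5000, 28), ('Tom', 4000, 20), ('Jerry', 4000, 24)]
```

Because the keys are computed up front, an expensive key function costs one call per element, not one per comparison.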
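Conclusion
1. When the key is a tuple, sorted compares the keys element by element from left to right, so earlier elements carry more weight. With (-salary, age), age is only consulted when -salary is equal.
2. Tuple keys are compared using __eq__ and __lt__. __eq__ is called on each pair of elements from left to right until it finds a pair that differs, and __lt__ is then called on that pair. If __lt__ returns true, the left operand is placed before the right one; otherwise their relative order is kept.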
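The comparison rule in point 2 can be sketched in pure Python. This is only an approximation for illustration: tuple_less_than is a hypothetical helper, not CPython's actual code, and the real implementation also treats identical objects as equal without calling __eq__.

```python
def tuple_less_than(a, b):
    """Approximate how `a < b` is evaluated for two tuples."""
    for x, y in zip(a, b):
        if not (x == y):   # __eq__ on each pair, left to right
            return x < y   # __lt__ only on the first pair that differs
    # Every shared position is equal: the shorter tuple counts as smaller.
    return len(a) < len(b)

tom_key = (-4000, 20)
jerry_key = (-4000, 24)
bob_key = (-5000, 28)

# Salaries tie, so the ages decide: 24 < 20 is False, and Jerry stays after Tom.
print(tuple_less_than(jerry_key, tom_key))  # False

# The salaries differ, so the ages are never looked at: -5000 < -4000 is True.
print(tuple_less_than(bob_key, jerry_key))  # True
```

For plain values like these, the helper agrees with the built-in `<` on tuples.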