Removing Duplicates from a Python Sequence


## For a list, if the original order does not need to be preserved, convert it to a set to remove duplicates

    nums = [1, 2, 32, 2, 2, 4, 3, 2, 3, 42]
    nums = list(set(nums))
    print(nums)
    # e.g. [32, 1, 2, 3, 4, 42]  # duplicates are removed, but the original order is lost (set order is arbitrary)
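
The set-based approach has two limits that are easier to see in code: the resulting order is arbitrary, and every element must be hashable. The sketch below is illustrative only; the names `unique_sorted` and `rows` are made up for this example. When sorted output is acceptable, `sorted(set(...))` gives a deterministic result.

```python
nums = [1, 2, 32, 2, 2, 4, 3, 2, 3, 42]

# sorted() accepts any iterable, so the set can be passed in directly.
# The result is a list in ascending order, independent of set iteration order.
unique_sorted = sorted(set(nums))
print(unique_sorted)  # [1, 2, 3, 4, 32, 42]

# A set can only hold hashable elements, so a list of dicts cannot be converted.
rows = [{"name": "Bob"}, {"name": "Bob"}]  # hypothetical sample data
try:
    set(rows)
except TypeError as exc:
    print("cannot build a set:", exc)
```

Unhashable elements such as dicts therefore need a different strategy, for example comparing on a hashable value derived from each element.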

 

## Removing duplicates while preserving the original order

    def dedupe(items, key=None):
        """
        items: a sequence whose elements may be hashable or unhashable
        key: a function that maps each item to a hashable value; required
             when the items themselves are unhashable (e.g. dicts)
        """
        seen = set()
        for item in items:
            val = item if key is None else key(item)
            if val not in seen:
                yield item
                seen.add(val)

    nums = [1, 2, 32, 2, 2, 4, 3, 2, 3, 42]
    print(list(dedupe(nums)))
    # [1, 2, 32, 4, 3, 42]

    students = [
        {"name": "Stanley", "score": 88},
        {"name": "Lily", "score": 92},
        {"name": "Bob", "score": 91},
        {"name": "Well", "score": 80},
        {"name": "Bob", "score": 90},
        {"name": "Peter", "score": 80}
    ]
    # Remove entries that share the same name (the first occurrence is kept)
    deduped_students = list(dedupe(students, key=lambda s: s['name']))
    print(deduped_students)
    # [{'name': 'Stanley', 'score': 88},
    #  {'name': 'Lily', 'score': 92},
    #  {'name': 'Bob', 'score': 91},
    #  {'name': 'Well', 'score': 80},
    #  {'name': 'Peter', 'score': 80}]

    # Remove entries whose name and score are both the same
    # (no two entries here match on both fields, so all six are kept)
    deduped_students = list(dedupe(students, key=lambda s: (s['name'], s['score'])))
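
Two properties of `dedupe` may be worth illustrating. For hashable items with no `key`, `dict.fromkeys` produces the same order-preserving result on Python 3.7+, where dicts are guaranteed to keep insertion order. And because `dedupe` is a generator, it can consume its input lazily. The following is a minimal sketch: `dedupe` is repeated so the block runs on its own, and the `itertools.cycle` stream is a made-up stand-in for real input.

```python
from itertools import cycle, islice


def dedupe(items, key=None):
    """Yield each item the first time its value (or key(item)) is seen."""
    seen = set()
    for item in items:
        val = item if key is None else key(item)
        if val not in seen:
            yield item
            seen.add(val)


nums = [1, 2, 32, 2, 2, 4, 3, 2, 3, 42]

# dict keys are unique and keep insertion order (guaranteed since Python 3.7).
via_dict = list(dict.fromkeys(nums))
print(via_dict)                         # [1, 2, 32, 4, 3, 42]
print(via_dict == list(dedupe(nums)))   # True

# Laziness: take the first 3 distinct values from an endless stream.
endless = cycle([5, 5, 7, 5, 9, 7])     # hypothetical never-ending input
first_three = list(islice(dedupe(endless), 3))
print(first_three)                      # [5, 7, 9]
```

`dict.fromkeys` is shorter but only works when the items are hashable and no `key` function is needed. Note that asking for a fourth distinct value from this particular stream would loop forever, since it only ever produces three.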

 

References:
  Python Cookbook, 3rd edition, by David Beazley and Brian K. Jones (O’Reilly).

