How can Python efficiently count the frequency of values in data?


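This article is reposted from Zhihu.
Author: 聞波
Link: https://www.zhihu.com/question/27800240/answer/122682289
Source: Zhihu
Copyright belongs to the author. For commercial reposting, contact the author for authorization; for non-commercial reposting, credit the source.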

import collections
import time

import numpy as np


def list_to_dict(lst):
    # Calls lst.count() once per element, so the total cost is O(n^2).
    dic = {}
    for i in lst:
        dic[i] = lst.count(i)
    return dic


def collect(lst):
    # Counter tallies every value in a single pass over the list.
    return dict(collections.Counter(lst))


def unique(lst):
    # np.unique returns the sorted distinct values and their counts.
    return dict(zip(*np.unique(lst, return_counts=True)))


def generate_data(num=1000000):
    # Integer division, because randint expects an integer upper bound.
    return np.random.randint(num // 10, size=num)


if __name__ == "__main__":
    # time.time() returns seconds, so the differences below are in seconds.
    t1 = time.time()
    lst = list(generate_data())
    t2 = time.time()
    print("generate_data took : %ss" % (t2 - t1))  # author's local measurement: 0.12 s

    t1 = t2
    d1 = unique(lst)
    t2 = time.time()
    print("unique took : %ss" % (t2 - t1))  # author's local measurement: 0.42 s

    t1 = t2
    d2 = collect(lst)
    t2 = time.time()
    print("collect took : %ss" % (t2 - t1))  # author's local measurement: 1.25 s

    t1 = t2
    d3 = list_to_dict(lst)
    t2 = time.time()
    print("list_to_dict took : %ss" % (t2 - t1))  # too slow; the author gave up waiting

    # All three approaches should produce the same value-to-count mapping.
    assert d1 == d2
    assert d1 == d3
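
The list.count approach stalls on a million items because each call to count() scans the whole list, and it is called once per element, so the work grows quadratically. The sketch below is not part of the original answer. It is a minimal illustration of that growth on small inputs, using only the standard library. The names count_quadratic, count_counter and make_sample are hypothetical, and the sample sizes are arbitrary assumptions chosen so the slow version still finishes quickly.

```python
import random
import timeit
from collections import Counter


def count_quadratic(values):
    # Mirrors the list.count approach: one full scan of the list per element.
    result = {}
    for v in values:
        result[v] = values.count(v)
    return result


def count_counter(values):
    # Single pass over the data using collections.Counter.
    return dict(Counter(values))


def make_sample(size, seed=0):
    # Hypothetical helper: reproducible integers in [0, size // 10).
    rng = random.Random(seed)
    upper = max(size // 10, 1)
    return [rng.randrange(upper) for _ in range(size)]


if __name__ == "__main__":
    for size in (1_000, 2_000, 4_000):
        data = make_sample(size)
        # Both approaches must agree before their timings are compared.
        assert count_quadratic(data) == count_counter(data)
        slow = timeit.timeit(lambda: count_quadratic(data), number=3)
        fast = timeit.timeit(lambda: count_counter(data), number=3)
        print(f"n={size}: list.count {slow:.4f}s, Counter {fast:.4f}s")
```

Doubling the input size should roughly quadruple the list.count time while the Counter time only about doubles, though exact figures depend on the machine.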

 

