Counting how many times each element appears in a Python list

This article is a repost. 2020-10-06 16:25

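Calling the list count() method

object.count(argument)

count() usage example

Given the list ['a','iplaypython.com','c','b','a'], the number of times the string 'a' appears in it can be counted like this:

>>> ['a','iplaypython.com','c','b','a'].count('a')
2

The return value is the number of times the argument occurs. In practice it is better to assign the list to a variable first and then call count() on that variable.

When the object is a nested list, count() can also count an inner list passed as the argument. Only top-level elements are compared, so the 1 inside [1,2] is not counted by x.count(1):

>>> x = [1,2,'a',[1,2],[1,2]]

>>> x.count([1,2])
2

>>> x.count(1)
1

>>> x.count('a')
1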
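As a small sketch of the advice above (assign the list to a variable, then call count()), the following script repeats the REPL examples. The variable names `items` and `nested` are illustrative assumptions, not names from the original.

```python
# Illustrative names; the data is the same as in the REPL examples above.
items = ['a', 'iplaypython.com', 'c', 'b', 'a']
a_count = items.count('a')            # compares each element with 'a'

nested = [1, 2, 'a', [1, 2], [1, 2]]
pair_count = nested.count([1, 2])     # inner lists are matched by equality
one_count = nested.count(1)           # only top-level elements are examined
missing_count = nested.count('zz')    # an absent value gives 0, not an error

print(a_count, pair_count, one_count, missing_count)
```

Note that count() scans the whole list on every call, so calling it once per distinct element costs one full pass each time.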

1. Counting how often each letter and digit appears

text = 'abc123abc456aa'
d = {}
for x in text:
    if x not in d:
        d[x] = 1
    else:
        d[x] = d[x] + 1

print(d)

{'a': 4, 'b': 2, 'c': 2, '1': 1, '2': 1, '3': 1, '4': 1, '5': 1, '6': 1}

#!/usr/bin/python3

text = "ABCdefabcdefabc"
text = text.lower()  # count case-insensitively
char_dict = {}

for char1 in text:
    if char1 in char_dict:
        count = char_dict[char1]
    else:
        count = 0
    count = count + 1
    char_dict[char1] = count
print(char_dict)

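a = "aAsmr3idd4bgs7Dlsf9eAF"

1. Extract the digits from string a and output them as a new string.

2. Count how many times each letter appears in string a (ignoring case, so a and A are the same letter) and output the result as a dictionary, e.g. {'a':3,'b':1}.

3. Remove letters that appear more than once in string a, keeping only the first occurrence, case-insensitively. For 'aAsmr3idd4bgs7Dlsf9eAF' the output after removal is 'asmr3id4bg7lf9e'.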

a = "aAsmr3idd4bgs7Dlsf9eAF"

def fun1_2(x):  # tasks 1 and 2
    x = x.lower()  # ignore case
    num = []
    dic = {}
    for i in x:
        if i.isdigit():  # task 1: collect the digits for the new string
            num.append(i)
        else:  # task 2: count each letter once, e.g. {'a':3,'b':1}
            if i in dic:
                continue
            else:
                dic[i] = x.count(i)
    new = ''.join(num)
    print("the new numbers string is: " + new)
    print("the dictionary is: %s" % dic)

fun1_2(a)

def fun3(x):  # task 3
    x = x.lower()
    new3 = []
    for i in x:
        if i in new3:  # already seen, skip
            continue
        else:
            new3.append(i)
    print(''.join(new3))

fun3(a)
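The same three tasks can also be written as small functions that return their results instead of printing them, which makes them easier to reuse. This is only a sketch: the function names `extract_digits`, `count_letters` and `dedupe_chars` are made up for illustration, and the de-duplication relies on dict preserving insertion order (Python 3.7+).

```python
from collections import Counter

def extract_digits(text):
    """Task 1: join all digit characters into a new string."""
    return ''.join(ch for ch in text if ch.isdigit())

def count_letters(text):
    """Task 2: case-insensitive letter counts as a plain dict."""
    return dict(Counter(ch for ch in text.lower() if ch.isalpha()))

def dedupe_chars(text):
    """Task 3: keep only the first occurrence of each character, ignoring case."""
    # dict.fromkeys keeps the first occurrence of each key in order
    return ''.join(dict.fromkeys(text.lower()))

a = "aAsmr3idd4bgs7Dlsf9eAF"
print(extract_digits(a))
print(count_letters(a))
print(dedupe_chars(a))
```

Like fun3, `dedupe_chars` treats digits the same way as letters, so a repeated digit would also be dropped.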

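Three approaches:

① Use a plain dict

② Use defaultdict

③ Use Counter

Note: calling `int()` with no arguments returns 0, which is why defaultdict(int) starts every missing key at 0.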

① dict

text = "I'm a hand some boy!"

frequency = {}

for word in text.split():
    if word not in frequency:
        frequency[word] = 1
    else:
        frequency[word] += 1

② defaultdict

import collections

frequency = collections.defaultdict(int)

text = "I'm a hand some boy!"

for word in text.split():
    frequency[word] += 1

③ Counter

import collections

text = "I'm a hand some boy!"
frequency = collections.Counter(text.split())
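To check that the three approaches really agree, they can be run side by side on the same sentence. In this sketch the names `plain`, `dd` and `counter` are illustrative, and the plain-dict version uses `dict.get` as a compact stand-in for the if/else test.

```python
from collections import Counter, defaultdict

text = "I'm a hand some boy!"
words = text.split()

# 1) plain dict: get() supplies 0 for a word not seen yet
plain = {}
for word in words:
    plain[word] = plain.get(word, 0) + 1

# 2) defaultdict(int): a missing key is created with int() == 0
dd = defaultdict(int)
for word in words:
    dd[word] += 1

# 3) Counter: counts the whole iterable in one call
counter = Counter(words)

print(plain == dict(dd) == dict(counter))
```

All three end up mapping each word to its count; the difference is only in how much bookkeeping the caller writes.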

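Suppose we have the following list:
[6, 7, 5, 9, 4, 1, 8, 6, 2, 9]
We want to count how many times each element appears, which can be treated as a word-frequency problem.
The result we want looks like {6:2, 7:1...}, i.e. {element: number of occurrences...}
First, use the elements as dictionary keys and build a dictionary whose counts all start at 0:

>>> from random import randint

>>> l = [randint(1,10) for x in range(10)]

>>> l
[6, 7, 5, 9, 4, 1, 8, 6, 2, 9]

>>> d = dict.fromkeys(l, 0)
>>> d
{6: 0, 7: 0, 5: 0, 9: 0, 4: 0, 1: 0, 8: 0, 2: 0}

# Now fill in the count for each key in d
>>> for x in l:
...     d[x] += 1
...
>>> d
{6: 2, 7: 1, 5: 1, 9: 2, 4: 1, 1: 1, 8: 1, 2: 1}

# Every element's occurrences have now been counted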

Another approach uses the Counter class from the collections module

>>> from collections import Counter

# Counter accepts a list directly and turns it into the finished tally

>>> d = Counter(l)
>>> d
Counter({6: 2, 9: 2, 7: 1, 5: 1, 4: 1, 1: 1, 8: 1, 2: 1})
# Counter is a subclass of dict, so values can also be looked up by key
>>> d[6]
2
# Counter conveniently has a built-in most_common(n) method that returns the n most frequent elements
>>> d.most_common(2)
[(6, 2), (9, 2)]
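Because randint produces a different list on every run, a reproducible version of the session uses the literal list shown above. The sketch below also pulls out the elements that occur more than once; the names `top_two`, `repeated` and `total` are illustrative.

```python
from collections import Counter

l = [6, 7, 5, 9, 4, 1, 8, 6, 2, 9]
d = Counter(l)

top_two = d.most_common(2)                             # highest counts first
repeated = [value for value, n in d.items() if n > 1]  # elements seen more than once
total = sum(d.values())                                # equals len(l)

print(top_two, repeated, total)
```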

Word-frequency counting with Python

import string
import time

path = 'C:\\Users\\ZHANGSHUAILING\\Desktop\\Walden.txt'

with open(path, 'r') as text:
    # split on whitespace, strip surrounding punctuation, lowercase
    words = [raw_word.strip(string.punctuation).lower() for raw_word in text.read().split()]
    words_index = set(words)
    counts_dict = {index: words.count(index) for index in words_index}
    # print from most to least frequent, pausing 2 seconds between lines
    for word in sorted(counts_dict, key=lambda x: counts_dict[x], reverse=True):
        time.sleep(2)
        print('{}--{} times'.format(word, counts_dict[word]))

{'the': 2154, 'and': 1394, 'to': 1080, 'of': 871, 'a': 861, 'his': 639, 'The': 637, 'in': 515, 'he': 461, 'with': 310, 'that': 308, 'you': 295, 'for': 280, 'A': 269, 'was': 258, 'him': 246, 'I': 234, 'had': 220, 'as': 217, 'not': 215, 'by': 196, 'on': 189, 'it': 178, 'be': 164, 'at': 153, 'from': 149, 'they': 149, 'but': 149, 'is': 144, 'her': 144, 'their': 143, 'who': 131, 'all': 121, 'one': 119, 'which': 119}  # partial results shown
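Calling words.count() once per distinct word rescans the whole word list each time. A single pass with Counter gives the same tally, sketched here on a short inline string rather than a file. The `sample` text is invented for illustration and is not taken from Walden.txt.

```python
import string
from collections import Counter

# Hypothetical sample text, used only to keep the example self-contained.
sample = "The pond. The woods, and the pond again!"

# Same cleaning as above: split on whitespace, strip punctuation, lowercase.
words = [w.strip(string.punctuation).lower() for w in sample.split()]
words = [w for w in words if w]  # drop tokens that were only punctuation

counts = Counter(words)          # one pass over the word list
for word, n in counts.most_common(3):
    print('{}--{} times'.format(word, n))
```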

import re, collections

def get_words(file):
    with open(file) as f:
        words_box = []
        for line in f:
            # keep only lines that start with an ASCII letter or digit, so Chinese text is skipped
            if re.match(r'[a-zA-Z0-9]', line):
                words_box.extend(line.strip().split())
    return collections.Counter(words_box)

# Counter objects can be added together to merge the tallies of two files
print(get_words('emma.txt') + get_words('伊索寓言.txt'))

# The same merged tally, showing only the ten most common words
a = get_words('emma.txt') + get_words('伊索寓言.txt')
print(a.most_common(10))
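The `+` between two Counter objects adds the counts key by key. A minimal sketch without any files (the two word lists are invented for illustration):

```python
from collections import Counter

# Hypothetical word lists standing in for the contents of two files.
first = Counter("a b a c".split())
second = Counter("b c c d".split())

combined = first + second        # counts for the same key are summed
print(combined)
print(combined.most_common(1))
```

Addition keeps only keys whose resulting count is positive, which makes no difference here because every count is at least 1.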

Summary of counting methods in Python

Method 1: iteration

def get_counts(sequence):
    counts = {}
    for x in sequence:
        if x in counts:
            counts[x] += 1
        else:
            counts[x] = 1
    return counts

This is the most conventional approach: count the items one by one.

Method 2: defaultdict

This uses the collections library

from collections import defaultdict

def get_counts2(sequence):
    counts = defaultdict(int)  # every missing key starts at 0
    for x in sequence:
        counts[x] += 1
    return counts

The result is a dictionary mapping each element to its count.

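Method 3: value_counts()

This method comes from pandas, so pandas must be imported first. It counts the elements and sorts them in descending order of count.

tz_counts = frame['tz'].value_counts()
tz_counts[:10]

>>>
America/New_York       1251
                        521
America/Chicago         400
America/Los_Angeles     382
America/Denver          191
Europe/London            74
Asia/Tokyo               37
Pacific/Honolulu         36
Europe/Madrid            35
America/Sao_Paulo        33
Name: tz, dtype: int64

Here is the description from the official documentation:

Series.value_counts(normalize=False, sort=True, ascending=False, bins=None, dropna=True)
Returns object containing counts of unique values.

Note that the returned data is a Series.

Overall, Method 1 is the most ordinary and can be slow when the data set is large (although, like Method 2, it makes only one pass over the data), Method 3 places requirements on the data format, so Method 2 is recommended.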
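When pandas is not available, a similar "sorted from most to least frequent" listing can be had from the standard library with Counter.most_common(). This is a stand-in and not the pandas method itself; the `time_zones` list is made-up sample data.

```python
from collections import Counter

# Hypothetical sample data; an empty string stands for a missing time zone.
time_zones = [
    'America/New_York', '', 'America/Chicago',
    'America/New_York', '', 'America/New_York',
]

# most_common() with no argument returns every (value, count) pair,
# ordered from the highest count to the lowest.
tz_counts = Counter(time_zones).most_common()

for tz, n in tz_counts:
    print('{:<20}{:>5}'.format(tz, n))
```

The empty string is counted like any other value, so it shows up as a row with a blank label.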

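python - counting how many times each character appears in a string (convert the string to a list first, then count)

#coding=utf-8

# Count how many times each character appears in a string

# Define a string
s = 'abbcccdddd'

# Insert a space between every pair of characters to form a new string
s = ' '.join(s)

# Print the new string
print('str = ', s)

# Split the new string on spaces to get a list
li = s.split(' ')

# Print the new list
print('li = ', li)

# Count how many times each character appears:

# Approach 1
for i in set(li):
    if li.count(i) >= 1:
        print('%s appears %d time(s)!' % (i, li.count(i)))

print('*'*50)

# Approach 2
from collections import Counter
res = Counter(li)
print(res)

Output (a set has no fixed order, so the lines from approach 1 may appear in a different order):

str =  a b b c c c d d d d
li =  ['a', 'b', 'b', 'c', 'c', 'c', 'd', 'd', 'd', 'd']
a appears 1 time(s)!
c appears 3 time(s)!
b appears 2 time(s)!
d appears 4 time(s)!
**************************************************
Counter({'d': 4, 'c': 3, 'b': 2, 'a': 1})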
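The join-then-split round trip is not strictly needed, because a string is already an iterable of characters. As a sketch, Counter (or list()) can take the string directly:

```python
from collections import Counter

s = 'abbcccdddd'

chars = list(s)      # same list that ' '.join(s).split(' ') produces
res = Counter(s)     # Counter iterates over the characters itself

for ch in sorted(res):
    print('%s appears %d time(s)!' % (ch, res[ch]))
print(res)
```

Sorting the keys gives a stable print order, which iterating over a set does not guarantee.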

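The Python count() method counts how many times a substring occurs in a string. The optional arguments give the start and end positions of the search within the string.

count() syntax:

str.count(sub, start=0, end=len(string))

Parameters
sub -- the substring to search for
start -- the index at which the search starts. Defaults to the first character, whose index is 0.
end -- the index at which the search stops (exclusive). The first character of the string has index 0. Defaults to the length of the string, so the search runs to the end.

Return value

The method returns the number of non-overlapping occurrences of the substring in the string.

#!/usr/bin/python3

text = "this is string example....wow!!!"

sub = "i"

print("text.count(sub, 4, 40) : ", text.count(sub, 4, 40))

sub = "wow"

print("text.count(sub) : ", text.count(sub))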
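A short sketch of how start and end narrow the search, using the same example string. The variable names are illustrative.

```python
text = "this is string example....wow!!!"

whole = text.count("i")             # searches the entire string
from_4 = text.count("i", 4, 40)     # skips the 'i' at index 2; an end past the string is allowed
first_4 = text.count("i", 0, 4)     # only indexes 0..3 are searched
non_overlapping = "aaaa".count("aa")  # matches do not overlap

print(whole, from_4, first_4, non_overlapping)
```

start and end behave like slice bounds, so `text.count(sub, 4, 40)` gives the same answer as `text[4:40].count(sub)`.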

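# Define a function that counts occurrences of a substring

def calcu_sub_str_num(mom_str, sun_str):
    print('Parent string:', mom_str)  # print the parent string
    print('Substring:', sun_str)  # print the substring
    print('Parent string length:', len(mom_str))  # print the parent string length
    print('Substring length:', len(sun_str))  # print the substring length
    count = 0  # counter starts at 0
    # Slide a window as long as the substring across the parent string:
    # the first pass compares the slice starting at index 0, the second the
    # slice starting at index 1, and so on; each match adds 1 to count.
    # The last useful start index is len(mom_str) - len(sun_str).
    for i in range(len(mom_str) - len(sun_str) + 1):
        if mom_str[i:i+len(sun_str)] == sun_str:
            count += 1
    return count

mom_str = input('please input mother string:')  # read the parent string
sun_str = input('please input child string:')  # read the substring
print('Occurrences of the substring in the parent string: %d' % calcu_sub_str_num(mom_str, sun_str))  # %d is the integer placeholder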
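A window-by-window comparison counts overlapping matches, whereas str.count() only counts non-overlapping ones. The sketch below shows the difference with a helper built on str.find(); the name `count_overlapping` is an assumption made for this example, and treating an empty substring as 0 matches is a choice made here, not something the original specifies.

```python
def count_overlapping(haystack, needle):
    """Count occurrences of needle in haystack, allowing overlaps."""
    if not needle:
        return 0  # assumption: treat an empty substring as "no matches"
    count = 0
    start = haystack.find(needle)
    while start != -1:
        count += 1
        # resume one character after the last match start, so overlaps are seen
        start = haystack.find(needle, start + 1)
    return count

print(count_overlapping("aaaa", "aa"), "aaaa".count("aa"))
print(count_overlapping("abcabc", "abc"), "abcabc".count("abc"))
```

For inputs where matches cannot overlap, such as "abc" in "abcabc", both ways of counting agree.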

Example 8: using zip to build a list of tuples (in Python 3, zip returns an iterator, so it is wrapped in list() to display the result)

In [91]: list(zip('xyz','123'))

Out[91]: [('x', '1'), ('y', '2'), ('z', '3')]

In [92]: list(zip('xyz','1234'))

Out[92]: [('x', '1'), ('y', '2'), ('z', '3')]

In [93]: list(zip('xyzm','564'))

Out[93]: [('x', '5'), ('y', '6'), ('z', '4')]

In [94]: list(zip('xyz','123','abc'))

Out[94]: [('x', '1', 'a'), ('y', '2', 'b'), ('z', '3', 'c')]

Example 9: using dict(zip()) to build a dictionary quickly

In [95]: dict(zip('xyz','123'))

Out[95]: {'x': '1', 'y': '2', 'z': '3'}
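To tie zip back to counting, dict(zip()) can pair each distinct element with its count. This is only an illustrative sketch; the names `letters`, `unique` and `count_dict` are assumptions.

```python
letters = ['x', 'y', 'z', 'x']

unique = sorted(set(letters))                   # distinct elements in a stable order
counts = [letters.count(ch) for ch in unique]   # one count per distinct element

pairs = list(zip(unique, counts))               # list of (element, count) tuples
count_dict = dict(pairs)                        # the same pairs as a dictionary

truncated = list(zip('xyz', '1234'))            # zip stops at the shortest input

print(pairs, count_dict, truncated)
```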

1) Using a dictionary, dict()

Loop over the elements of an iterable. If the dictionary does not contain the element yet, add it as a key with the value 1; if it is already there, add 1 to its value.

lists = ['a','a','b',5,6,7,5]
count_dict = dict()
for item in lists:
    if item in count_dict:
        count_dict[item] += 1
    else:
        count_dict[item] = 1

2) Using defaultdict()

defaultdict(parameter) accepts a type such as str or int. The type passed in does not constrain the type of the values, let alone the type of the keys; it only supplies the initial value when a key does not exist yet.

defaultdict(int): initialised to 0
defaultdict(float): initialised to 0.0
defaultdict(str): initialised to ''

from collections import defaultdict
lists = ['a', 'a', 'b', 5, 6, 7, 5]
count_dict = defaultdict(int)
for item in lists:
    count_dict[item] += 1

3) Using a set and a list

First remove duplicates with set, then loop over the distinct elements and append each one, together with its count lists.count(item), to a list as a tuple.

lists = ['a', 'a', 'b', 5, 6, 7, 5]
count_set = set(lists)
count_list = list()
for item in count_set:
    count_list.append((item, lists.count(item)))

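4) Using Counter

Counter is a container object whose main purpose is to count hashable objects. It can be initialised in three ways:

Passing an iterable: Counter("success")

Passing keyword arguments: Counter(s=3, c=2, e=1, u=1)

Passing a dictionary: Counter({"s": 3, "c": 2, "e": 1, "u": 1})

Counter objects also have several methods, each explained in the code below

from collections import Counter
lists = ['a', 'a', 'b', 5, 6, 7, 5]
a = Counter(lists)
print(a) # Counter({'a': 2, 5: 2, 'b': 1, 6: 1, 7: 1})

a.elements() # iterator over the elements, each repeated as many times as its count; wrap it in list() to see them

a.most_common(2) # the two most frequent elements and their counts, returned as a list of tuples

a['zz'] # a missing key returns 0 by default

a.update("aa5bzz") # add to the tally: new counts are added to the existing ones instead of replacing them

a.subtract("aaa5z") # subtract from the existing counts; results may be 0 or negative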
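Running those calls in order and capturing the results makes the behaviour concrete. One detail worth noticing in this sketch: the character '5' that comes from a string is a different key from the integer 5 in the original list. The variable names used to hold the results are illustrative.

```python
from collections import Counter

lists = ['a', 'a', 'b', 5, 6, 7, 5]
a = Counter(lists)

elements = list(a.elements())   # each key repeated count times
top_two = a.most_common(2)      # list of (element, count) tuples
missing = a['zz']               # 0, and the key is not added

a.update("aa5bzz")              # adds counts for 'a', 'a', '5', 'b', 'z', 'z'
after_update = dict(a)

a.subtract("aaa5z")             # subtracts counts; zero results stay in the Counter
print(elements, top_two, missing)
print(after_update)
print(a)
```

After the subtract call the string key '5' is still present with a count of 0, which is the "0 or negative" behaviour described above.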

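Counting with a Python dictionary
Counting with the Counter class from Python's collections module
Counting with the value_counts method from the pandas package

Dictionary counting

a = [1, 2, 3, 1, 1, 2]
counts = {}  # avoid naming this dict, which would shadow the built-in
for key in a:
    counts[key] = counts.get(key, 0) + 1
print(counts)

Counting with the Counter class from the collections module

from collections import Counter
a = [1, 2, 3, 1, 1, 2]
result = Counter(a)
print(result)

Counting with the pandas value_counts method

import pandas as pd
a = pd.DataFrame([[1,2,3],
                  [3,1,3],
                  [1,2,1]])
result = a.apply(pd.Series.value_counts)  # count the values column by column
print(result)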
