7-4 jmu-Java&Python - Count the words in a text and sort them by number of occurrences (25 points)


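Count the words in several lines of English text, and also count how many times each word occurs.

Note 1: Words are separated by spaces (one or more).
Note 2: Ignore empty lines and lines that contain only spaces.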
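
Both notes can be handled by one call. The following is a minimal sketch; the helper name `split_words` is hypothetical and not part of the problem statement. It relies on `str.split()` with no argument, which splits on any run of whitespace and returns an empty list for an empty or all-space line.

```python
def split_words(line):
    # No separator argument: runs of whitespace count as one delimiter,
    # and an empty or all-space line yields no words at all.
    return line.split()


print(split_words("it  is   like"))  # ['it', 'is', 'like']
print(split_words("     "))          # []
```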

Basic version:
Counting is case-sensitive, and the specified punctuation characters are not removed.

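Advanced version:

Before counting, remove the specified punctuation characters ! . , : * ? from the text. Note: "remove" here means replacing each such character with a single space.
When counting, ignore the case of the words.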
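
As an illustration of the advanced rules, here is a small sketch that assumes a translation table is acceptable in place of repeated replacements. The names `PUNCTUATION`, `TO_SPACE` and `normalize` are hypothetical. Replacing with a space rather than deleting matters: `a.b` becomes two words, not one.

```python
PUNCTUATION = "!.,:*?"
# Map each listed punctuation character to a single space.
TO_SPACE = str.maketrans(PUNCTUATION, " " * len(PUNCTUATION))


def normalize(line):
    # Substitute spaces for the punctuation, then fold to lower case.
    return line.translate(TO_SPACE).lower()


print(normalize("Because of: yOu?, then").split())
# ['because', 'of', 'you', 'then']
```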
Input description
Several lines of English text, terminated by a line containing !!!!!.
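
One possible way to read up to the terminator is the two-argument form of `iter`, which keeps calling a function until it returns the sentinel value. This is only a sketch: `read_until` and its parameters are hypothetical names, and it assumes the terminator line matches `!!!!!` exactly.

```python
def read_until(read_line, sentinel="!!!!!"):
    # iter(callable, sentinel) calls read_line() repeatedly and stops,
    # without yielding it, as soon as the sentinel is returned.
    return list(iter(read_line, sentinel))


# Simulated input; a real program could pass the built-in input instead.
feed = iter(["godsend failed", "", "!!!!!", "never read"])
print(read_until(lambda: next(feed)))  # ['godsend failed', '']
```

If the input ends before the terminator appears, `input` raises `EOFError`, so a real submission may want to handle that case.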

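Output description

The number of words (as the samples show, this is the number of distinct words).
The 10 most frequent words and their counts, one word=count pair per line (sorted by count in descending order; if counts are equal, by the word in ascending alphabetical order).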
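
The ordering rule can also be expressed as a sort key instead of a comparison function. The sketch below is one way to do it; the name `rank` is hypothetical. Negating the count makes an ascending sort put larger counts first, and the word itself breaks ties.

```python
def rank(counts, limit=10):
    # Key (-count, word): descending by count, then ascending by word.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ordered[:limit]


print(rank({"you": 3, "the": 4, "it": 3, "is": 2}))
# [('the', 4), ('it', 3), ('you', 3), ('is', 2)]
```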

Sample input 1

failure is probably the fortification in your pole

it is like a peek your wallet as the thief when you
are thinking how to spend several hard-won lepta

when you are wondering whether new money it has laid
background because of you then at the heart of the

most lax alert and most low awareness and left it

godsend failed
!!!!!

Sample output 1

46
the=4
it=3
you=3
and=2
are=2
is=2
most=2
of=2
when=2
your=2

Sample input 2

Failure is probably The fortification in your pole!

It is like a peek your wallet as the thief when You
are thinking how to. spend several hard-won lepta.

when yoU are? wondering whether new money it has laid
background Because of: yOu?, then at the heart of the
Tom say: Who is the best? No one dare to say yes.
most lax alert and! most low awareness and* left it

godsend failed
!!!!!

Sample output 2

54
the=5
is=3
it=3
you=3
and=2
are=2
most=2
of=2
say=2
to=2

Solution:

Python 3 removed the cmp parameter from sort, so the list is sorted with functools.cmp_to_key and a custom comparison function.

Python code:

from functools import cmp_to_key

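def cmpkey2(x, y):
    # x and y are (word, count) pairs. The list is sorted with reverse=True,
    # so counts compare normally (giving descending counts after reversal)
    # and words compare inverted (giving ascending words after reversal).
    if x[1] > y[1]:
        return 1
    elif x[1] < y[1]:
        return -1
    elif x[0] > y[0]:
        return -1
    elif x[0] < y[0]:
        return 1
    return 0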

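# Read lines until the '!!!!!' terminator, joining them with spaces.
text = ""
while True:
    s = input()
    if s == '!!!!!':
        break
    text += ' ' + s

# Advanced version: ignore case and replace the listed punctuation with spaces.
text = text.lower()
for ch in '!.,:*?':
    text = text.replace(ch, ' ')

# Count occurrences; split() with no argument skips repeated spaces and blank lines.
cnt = {}
for word in text.split():
    cnt[word] = cnt.get(word, 0) + 1

# With reverse=True the comparator yields descending counts, then ascending words.
items = list(cnt.items())
items.sort(key=cmp_to_key(cmpkey2), reverse=True)

print(len(items))  # number of distinct words
for key, val in items[:10]:
    print("{}={}".format(key, val))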
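
For comparison, here is an alternative sketch, not the approach used above, that wraps the same steps in a function so both versions of the task can be tried without standard input. It assumes `collections.Counter` for counting and `heapq.nsmallest` for selecting the top entries. The names `word_report`, `advanced` and `limit` are hypothetical.

```python
import heapq
from collections import Counter

PUNCTUATION = "!.,:*?"


def word_report(lines, advanced=True, limit=10):
    """Return (number of distinct words, top entries) for an iterable of lines."""
    counts = Counter()
    for line in lines:
        if advanced:
            # Advanced version: punctuation becomes a space, case is ignored.
            for ch in PUNCTUATION:
                line = line.replace(ch, " ")
            line = line.lower()
        # split() drops repeated spaces and contributes nothing for blank lines.
        counts.update(line.split())
    # The smallest (-count, word) keys are the most frequent words,
    # with ties in alphabetical order.
    top = heapq.nsmallest(limit, counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return len(counts), top


total, top = word_report(["Tom say: Who is the best?", "", "No one dare to say yes. The end"])
print(total)
for word, n in top:
    print("{}={}".format(word, n))
```

Passing `advanced=False` gives the basic behaviour, where `The`, `the` and `the.` are counted as three different words.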

