Perl: Counting How Many Times Each Word Appears in a Text (NVIDIA 2019 Written Test)

This article is a repost. 2020-02-26 20:41. Tags: Job hunting (hand-written code series) / Perl / Hash / Substitution

1. Original problem

2. Perl script

use strict;
use warnings;

print "================ Method 1=====================\n";
my %counts;
open(my $in, '<', 'anna-karenina.txt') or die "Cannot open anna-karenina.txt: $!";
while (my $line = <$in>) {
        chomp $line;
        $line =~ s/[ \. , ? ! ; : ' " ( ) { }  \[ \]]/ /g;   # replace periods, commas, and similar punctuation with spaces
        foreach my $word (split ' ', $line) {   # split ' ' skips leading whitespace, so no empty "word" is counted
                $counts{lc($word)}++;   # store each word, lowercased, in the hash
        }
}
close $in;

foreach my $word (sort keys %counts) {
        print "$word,$counts{$word}\n";   # print each word and its count
}


print "================ Method 2=====================\n";
my %words;
open($in, '<', 'anna-karenina.txt') or die "Cannot open anna-karenina.txt: $!";
while (my $line = <$in>) {
        # map { $words{lc($_)}++ } $line =~ /(\w+)/g;   # equivalent to the loop below
        foreach ($line =~ /(\w+)/g) {   # match every word on the line
                $words{lc($_)}++;
        }
}
close $in;

for (sort keys %words) {
    print "$_: $words{$_}\n";
}

3. Results

1) Test text

All happy families resemble one another; every unhappy family is unhappy in its own way.
All was confusion in the house of Oblonskys. happy? happy: [happy] {happy} "happy" 'happy'

2) Output

================ Method 1=====================
all,2
another,1
confusion,1
every,1
families,1
family,1
happy,7
house,1
in,2
is,1
its,1
oblonskys,1
of,1
one,1
own,1
resemble,1
the,1
unhappy,2
was,1
way,1
================ Method 2=====================
all: 2
another: 1
confusion: 1
every: 1
families: 1
family: 1
happy: 7
house: 1
in: 2
is: 1
its: 1
oblonskys: 1
of: 1
one: 1
own: 1
resemble: 1
the: 1
unhappy: 2
was: 1
way: 1

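4. Key points

1) To replace any one of several characters in a single substitution, list them in a bracketed character class. The spaces inside the brackets are literal members of the class, so a space is simply replaced by a space:

　　$line =~ s/[ \. , ? ! ; : ' " ( ) { } \[ \]]/ /g; # replace periods, commas, and similar punctuation with spaces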
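
As a small, self-contained illustration of the character class, the sketch below applies a compact form of the same substitution to a sample string. The helper name strip_punct and the sample text are made up for this example and are not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: replace each listed punctuation character with a space.
sub strip_punct {
    my ($text) = @_;
    # Inside [...] these characters are literal; the square brackets
    # themselves are backslash-escaped.
    $text =~ s/[.,?!;:'"(){}\[\]]/ /g;
    return $text;
}

my $sample = q{happy? [happy] "happy"};
print strip_punct($sample), "\n";
```

Each punctuation character becomes exactly one space, so runs of spaces can appear in the result. That is harmless here because the words are split on whitespace afterwards.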

2) Lowercase each word with lc and count it in a hash:

　　$counts{lc($word)}++; # store each word, lowercased, in the hash
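
The counting idiom can be tried in isolation. This is a minimal sketch; the helper name count_words and the word list are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: count case-insensitive occurrences of each word.
sub count_words {
    my @words = @_;
    my %counts;
    # A key that does not exist yet starts as undef; ++ turns it into 1.
    $counts{lc $_}++ for @words;
    return \%counts;
}

my $counts = count_words(qw(All all ALL happy Happy));
print "$_=$counts->{$_}\n" for sort keys %$counts;
```

Because every key is lowercased before it is incremented, "All", "all", and "ALL" share a single entry.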

3) Refer to the whole hash with the % sigil, get its keys with keys, and order them with sort:

　　sort keys %counts
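
The sketch below shows the alphabetical sort used in the script next to one possible variation that orders words by count. The sample hash and the by-count ordering are assumptions added for illustration; the original script only sorts alphabetically.

```perl
use strict;
use warnings;

my %counts = (happy => 7, all => 2, unhappy => 2, way => 1);

# Alphabetical order of the keys, as in the script.
my @by_name = sort keys %counts;

# Assumed variation: highest count first, ties broken alphabetically.
my @by_count = sort { $counts{$b} <=> $counts{$a} or $a cmp $b } keys %counts;

print "@by_name\n";    # all happy unhappy way
print "@by_count\n";   # happy all unhappy way
```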

4) Method 2 evaluates $line =~ /(\w+)/g in list context, which returns all the words on the line directly as a list.
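
A minimal sketch of the list-context match follows. The sample line is made up for illustration; the word "well-known" was added to show where the two methods can differ.

```perl
use strict;
use warnings;

my $line = q{All was confusion; happy? [happy] well-known};

# In list context, a /g match returns every captured substring at once.
my @words = $line =~ /(\w+)/g;

print join('|', @words), "\n";   # All|was|confusion|happy|happy|well|known
```

Since \w matches only letters, digits, and underscores, this approach splits "well-known" into two words. Method 1 would keep it as one word, because the hyphen is not in its character class.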
