Word Frequency Count: Update


The code has two branches: (1) enter the path of a text file, or (2) type the article in directly.

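public static void main(String[] args) {
        HashMap<String,Integer> map=new HashMap<String,Integer>();// per-word counts, sorted before printing
        // Intended to filter all punctuation out of the input.
        // Note: replaceAll treats this string as a regex (optional space, any one character,
        // then the literal sequence !:,""''; and a newline), not as a set of characters,
        // so it almost never matches and punctuation stays in the tokens.
        String regex=" ?.!:,\"\"'';\n";
        BufferedReader br;
        try {
            // FileReader (used in branch 1) creates a Reader over the contents of a file
            Scanner scan = new Scanner(System.in);
            System.out.println("Choose your input format");
            System.out.println("1. Full file path");
            System.out.println("2. Article text");
            int flag = scan.nextInt();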
            

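Depending on the choice, the program enters the corresponding branch.

Feature 1: small-file input, with commands typed at the console.

Enter the path of the text file at the console to run the word frequency count.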

                    // Branch taken when flag == 1 (the enclosing branch statement is omitted in the post)
                    System.out.println("Enter the full file path");
                    String fileUrl = scan.next();
                    br = new BufferedReader(new FileReader(fileUrl)); // full path of the file
                    String sentence;
                    int wordCount = 0;
                    try {
                        while ((sentence = br.readLine()) != null) {   // readLine returns null at end of file
                            sentence = sentence.replaceAll(regex, "");
                            StringTokenizer token = new StringTokenizer(sentence);
                            while (token.hasMoreTokens()) {            // iterate over the tokens
                                wordCount++;
                                String word = token.nextToken();
                                if (map.containsKey(word)) {           // HashMap keys are unique, so there is one entry per word
                                    int count = map.get(word);
                                    map.put(word, count + 1);          // word already present: increment its count
                                } else {
                                    map.put(word, 1);                  // first occurrence: insert with count 1
                                }
                            }
                        }
                        System.out.println("Total words: " + wordCount);
                        sort(map);
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                    break;

Run result:

Choose your input format
1. Full file path
2. Article text
1
Enter the full file path
c://english.txt
Total words: 181
as:7
the:7
not:6
it:6
to:5
are:4
a:4
your:4
in:4
they:3
live:3
and:3
of:2
do:2
may:2
by:2
be:2
clothes:2
that:2
often:2
have:2
from:2
above:2
is:2
you:2
door:1
its:1
suppose.It:1
palace.The:1
contentedly:1
snow:1
friends,Turn:1
yourself:1
means.which:1
or:1
windows:1
life,poor:1
bad:1
quiet:1
like:1
without:1
thoughts.:1
simply:1
abode;the:1
change.Sell:1
will:1
some:1
fault-finder:1
herb,like:1
before:1
most:1
I:1
old,return:1
trouble:1
life:1
change;we:1
supported:1
is.You:1
spring.:1
me:1
mind:1
town;but:1
there,and:1
paradise.Love:1
hardnames.It:1
is,meet:1
should:1
seem:1
independent:1
new:1
alms-house:1
poor-house.The:1
pleasant,thrilling,glorious:1
;do:1
garden:1
happens:1
keep:1
but:1
However:1
reflected:1
being:1
brightly:1
enough:1
Cultivate:1
any.May:1
looks:1
more:1
sage.Do:1
town's:1
when:1
faults:1
richest.The:1
disreputable.:1
think:1
get:1
so:1
much:1
lives:1
perhaps:1
early:1
things,whether:1
call:1
dishonest:1
sun:1
shun:1
melts:1
setting:1
them.Things:1
poverty:1
poorest:1
mean:1
receive:1
find:1
hourss,even:1
thoughts,as:1
rich:1
poor:1
man's:1
cheering:1
great:1
see:1
supporting:1
themselves:1
misgiving.Most:1
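
Tokens such as `suppose.It` and `pleasant,thrilling,glorious` in the output above show that punctuation was not removed before tokenizing. One possible remedy, sketched here under assumptions, is to write the pattern as a regex character class and replace each match with a space, so that words joined by punctuation are separated. The class name `PunctuationDemo`, the constant `PUNCTUATION`, and the method `countWords` are hypothetical names for illustration and do not appear in the original program. The apostrophe is deliberately left out of the class so that words like `town's` stay in one piece.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

class PunctuationDemo {
    // Character class: matches any ONE of the listed punctuation marks.
    // (Hypothetical replacement pattern, not the one used in the post.)
    static final String PUNCTUATION = "[?.!:,;\"]";

    // Counts the words of one line after turning punctuation into spaces.
    static Map<String, Integer> countWords(String line) {
        Map<String, Integer> counts = new HashMap<>();
        String cleaned = line.replaceAll(PUNCTUATION, " ");
        StringTokenizer tokens = new StringTokenizer(cleaned);
        while (tokens.hasMoreTokens()) {
            // merge inserts 1 for a new word, otherwise adds 1 to the stored count
            counts.merge(tokens.nextToken(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countWords("It is not so bad as you suppose.It looks poorest"));
    }
}
```

With this pattern, `suppose.It` is counted as the two words `suppose` and `It`. The counts remain case-sensitive, as in the original code.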

Feature 2. Accept the file name of an English work as a command-line argument

>wf english.txt

total 181 words
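
The post does not show code for this feature. A minimal sketch, assuming the file name arrives as the first command-line argument and the file is UTF-8 text, could look like the following. The class name `Wf` and the helper `countWords` are hypothetical; only the `total N words` output format is taken from the example above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.StringTokenizer;

class Wf {
    // Counts whitespace-separated tokens in the given text.
    static int countWords(String text) {
        return new StringTokenizer(text).countTokens();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: wf <file>");
            return;
        }
        // Assumption: the file is small enough to read into memory at once.
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        String text = new String(bytes, StandardCharsets.UTF_8);
        System.out.println("total " + countWords(text) + " words");
    }
}
```

Reading the whole file at once keeps the sketch short; for large files, the line-by-line `BufferedReader` loop from feature 1 would use less memory.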

Feature 3. Accept the name of a directory that holds English works as a command-line argument, and count them in a batch.
>dir folder
gone_with_the_wand
runbinson
janelove
>wf folder
gone_with_the_wand
total 1234567 words
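
No code for the batch mode is given in the post either. One way to approach it, shown here only as a sketch, is to iterate over the regular files of the directory and print each file name followed by its total. The class `WfBatch` and the helpers `countWords` and `report` are hypothetical names, and UTF-8 input is an assumption.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.StringTokenizer;

class WfBatch {
    // Counts whitespace-separated tokens in the given text.
    static int countWords(String text) {
        return new StringTokenizer(text).countTokens();
    }

    // Formats one entry: the file name on one line, the total on the next.
    static String report(String name, int total) {
        return name + "\ntotal " + total + " words";
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: wf <folder>");
            return;
        }
        Path folder = Paths.get(args[0]);
        // DirectoryStream lists the entries of the folder without recursing into subfolders.
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(folder)) {
            for (Path entry : entries) {
                if (Files.isRegularFile(entry)) {
                    String text = new String(Files.readAllBytes(entry), StandardCharsets.UTF_8);
                    System.out.println(report(entry.getFileName().toString(), countWords(text)));
                }
            }
        }
    }
}
```

The order in which `DirectoryStream` returns entries is not specified, so the files may be reported in a different order from the `dir` listing.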

Feature 4. Read a single English work from the console

                    System.out.println("Enter the article text");
                    scan.nextLine();                        // discard the rest of the menu-choice line left by nextInt()
                    String sentence2 = scan.nextLine();     // read the whole line; next() would return only the first word
                    System.out.println(sentence2);
                    int wordCount2 = 0;                     // total number of words read
                    HashMap<String,Integer> map2 = new HashMap<String,Integer>(); // per-word counts, sorted before printing
                    // Split on space and on the punctuation marks , ? . ! : ; " ' and newline
                    StringTokenizer token = new StringTokenizer(sentence2, ", ?.!:;\"'\n");
                    while (token.hasMoreTokens()) {         // iterate over the tokens
                        wordCount2++;
                        String word = token.nextToken();
                        if (map2.containsKey(word)) {       // HashMap keys are unique, so there is one entry per word
                            int count = map2.get(word);
                            map2.put(word, count + 1);      // word already present: increment its count
                        } else {
                            map2.put(word, 1);              // first occurrence: insert with count 1
                        }
                    }
                    System.out.println("Total words: " + wordCount2);
                    sort(map2);                             // sort the counts and print them
                    break;
            }

Run result:

Choose your input format
1. Full file path
2. Article text
2
Enter the article text
However mean your life is,meet it and live it ;do not shun it and call it hardnames.It is not so bad as you suppose.It looks poorest when you are richest.The fault-finder will find faults in paradise.Love your life,poor as it is.You may perhaps have some pleasant,thrilling,glorious hourss,even in a poor-house.The setting sun is reflected from the windows of the alms-house as brightly as from the rich man's abode;the snow melts before its door as early in the spring. I do not see but a quiet mind may live as contentedly there,and have as cheering thoughts,as in a palace.The town's poor seem to me often to live the most independent lives of any.May be they are simply great enough to receive without misgiving.Most think that they are above being supported by the town;but it often happens that they are not above supporting themselves by dishonest means.which should be more disreputable.Cultivate poverty like a garden herb,like sage.Do not trouble yourself much to get new things,whether clothes or friends,Turn the old,return to them.Things do not change;we change.Sell your clothes and keep your thoughts.
Total words: 181
as:7
the:7
not:6
it:6
to:5
are:4
a:4
your:4
in:4
they:3
live:3
and:3
of:2
do:2
may:2
by:2
be:2
clothes:2
that:2
often:2
have:2
from:2
above:2
is:2
you:2
door:1
its:1
suppose.It:1
palace.The:1
contentedly:1
snow:1
friends,Turn:1
yourself:1
means.which:1
or:1
windows:1
life,poor:1
bad:1
quiet:1
like:1
without:1
thoughts.:1
simply:1
abode;the:1
change.Sell:1
will:1
some:1
fault-finder:1
herb,like:1
before:1
most:1
I:1
old,return:1
trouble:1
life:1
change;we:1
supported:1
is.You:1
spring.:1
me:1
mind:1
town;but:1
there,and:1
paradise.Love:1
hardnames.It:1
is,meet:1
should:1
seem:1
independent:1
new:1
alms-house:1
poor-house.The:1
pleasant,thrilling,glorious:1
;do:1
garden:1
happens:1
keep:1
but:1
However:1
reflected:1
being:1
brightly:1
enough:1
Cultivate:1
any.May:1
looks:1
more:1
sage.Do:1
town's:1
when:1
faults:1
richest.The:1
disreputable.:1
think:1
get:1
so:1
much:1
lives:1
perhaps:1
early:1
things,whether:1
call:1
dishonest:1
sun:1
shun:1
melts:1
setting:1
them.Things:1
poverty:1
poorest:1
mean:1
receive:1
find:1
hourss,even:1
thoughts,as:1
rich:1
poor:1
man's:1
cheering:1
great:1
see:1
supporting:1
themselves:1
misgiving.Most:1
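
Both branches call a `sort` method whose body is not included in the post. Judging only from the output, it prints `word:count` pairs in descending order of count. A sketch of one way to do that follows; the class `SortSketch` and the helper `sortByCount` are hypothetical, and the actual implementation may differ.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SortSketch {
    // Returns the map entries ordered by count, highest first.
    // List.sort is stable, so words with equal counts keep the map's iteration order.
    static List<Map.Entry<String, Integer>> sortByCount(Map<String, Integer> map) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(map.entrySet());
        entries.sort((a, b) -> b.getValue().compareTo(a.getValue()));
        return entries;
    }

    // Prints each entry as word:count, matching the format of the run results.
    static void sort(Map<String, Integer> map) {
        for (Map.Entry<String, Integer> entry : sortByCount(map)) {
            System.out.println(entry.getKey() + ":" + entry.getValue());
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("door", 1);
        counts.put("as", 7);
        counts.put("not", 6);
        sort(counts);
    }
}
```

Copying the entries into a list is needed because a `HashMap` has no defined order of its own; the list can then be sorted by value without changing the map.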

 

