Running wordcount with the example program bundled with Hadoop


1. Start the Hadoop daemons

   bin/start-all.sh

2. Create an input folder in the Hadoop root directory (hadoop-0.20.2, not bin, as the prompt below shows)

JIAS-MacBook-Pro:hadoop-0.20.2 jia$ mkdir input

3. Go into the input directory, create two text files there, and write some content into them

JIAS-MacBook-Pro:hadoop-0.20.2 jia$ cd input
JIAS-MacBook-Pro:input jia$ echo "hello excuse me fine thank you">text1.txt
JIAS-MacBook-Pro:input jia$ echo "hello how do you do thank you">text2.txt
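With input this small, the expected word counts can be worked out locally and later compared with what the job prints. The following is only a sketch using standard Unix tools, not Hadoop itself; count_words is a hypothetical helper name, and it assumes words are separated by whitespace.

```shell
#!/bin/sh
# Local cross-check of the expected word counts (no Hadoop involved).
# count_words is a hypothetical helper; it reads text on stdin.
count_words() {
    # Put one word on each line, drop empty lines, then sort and count.
    # Output format: word<TAB>count, ordered by word.
    tr -s '[:space:]' '\n' | sed '/^$/d' | LC_ALL=C sort | uniq -c |
        awk '{ printf "%s\t%s\n", $2, $1 }'
}

# The same two lines that the echo commands above write to the input files.
printf '%s\n' "hello excuse me fine thank you" \
              "hello how do you do thank you" | count_words
```

Run from inside the input directory, `cat text1.txt text2.txt | count_words` would give the same listing.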

4. Go to Hadoop's bin directory and run jps to confirm that Hadoop is up and running

JIAS-MacBook-Pro:hadoop-0.20.2 jia$ cd bin
JIAS-MacBook-Pro:bin jia$ jps
656 SecondaryNameNode
517 NameNode
709 JobTracker
777 TaskTracker
587 DataNode
797 Jps
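Reading the jps listing by eye works for a one-off check. For a scripted check, a sketch along these lines could be used; missing_daemons is a hypothetical helper, and the list of five daemon names is simply the set that appears in the listing above.

```shell
#!/bin/sh
# missing_daemons is a hypothetical helper: it reads `jps` output on stdin
# and prints the name of every expected daemon that is absent.
missing_daemons() {
    listing=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        # Compare the second column exactly, so that "NameNode"
        # is not satisfied by the "SecondaryNameNode" line.
        if ! printf '%s\n' "$listing" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            printf '%s\n' "$daemon"
        fi
    done
}

# Sample listing copied from the session above, with the DataNode line removed.
printf '%s\n' "656 SecondaryNameNode" "517 NameNode" "709 JobTracker" \
              "777 TaskTracker" "797 Jps" | missing_daemons
```

On the machine running Hadoop the helper would be fed real output with `jps | missing_daemons`; no output means all five daemons were found.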

5. From the Hadoop root directory, upload the input folder to HDFS under the name in

JIAS-MacBook-Pro:hadoop-0.20.2 jia$ bin/hadoop dfs -put input in

6. List the uploaded files on HDFS

JIAS-MacBook-Pro:hadoop-0.20.2 jia$ bin/hadoop dfs -ls ./in/*
-rw-r--r--   1 jia supergroup         31 2014-07-17 20:39 /user/jia/in/text1.txt
-rw-r--r--   1 jia supergroup         30 2014-07-17 20:39 /user/jia/in/text2.txt
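The sizes in the listing (31 and 30 bytes) are one byte larger than the visible text because echo appends a newline. A small sketch to confirm the arithmetic locally; byte_len is a hypothetical helper name.

```shell
#!/bin/sh
# byte_len is a hypothetical helper: the number of bytes that `echo "$1"`
# writes, that is, the text plus the trailing newline.
byte_len() { printf '%s\n' "$1" | wc -c | tr -d ' '; }

size1=$(byte_len "hello excuse me fine thank you")   # text1.txt
size2=$(byte_len "hello how do you do thank you")    # text2.txt

echo "text1.txt: $size1 bytes"
echo "text2.txt: $size2 bytes"
echo "total: $((size1 + size2)) bytes"
```

The total is the amount of input data the wordcount job has to read from HDFS.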

7. Run the bundled wordcount example and write the result to the output directory

JIAS-MacBook-Pro:hadoop-0.20.2 jia$ bin/hadoop jar hadoop-0.20.2-examples.jar wordcount in output
14/07/17 20:46:56 INFO input.FileInputFormat: Total input paths to process : 2
14/07/17 20:46:56 INFO mapred.JobClient: Running job: job_201407172036_0001
14/07/17 20:46:57 INFO mapred.JobClient:  map 0% reduce 0%
14/07/17 20:47:04 INFO mapred.JobClient:  map 100% reduce 0%
14/07/17 20:47:16 INFO mapred.JobClient:  map 100% reduce 100%
14/07/17 20:47:18 INFO mapred.JobClient: Job complete: job_201407172036_0001
14/07/17 20:47:18 INFO mapred.JobClient: Counters: 17
14/07/17 20:47:18 INFO mapred.JobClient:   Map-Reduce Framework
14/07/17 20:47:18 INFO mapred.JobClient:     Combine output records=11
14/07/17 20:47:18 INFO mapred.JobClient:     Spilled Records=22
14/07/17 20:47:18 INFO mapred.JobClient:     Reduce input records=11
14/07/17 20:47:18 INFO mapred.JobClient:     Reduce output records=8
14/07/17 20:47:18 INFO mapred.JobClient:     Map input records=2
14/07/17 20:47:18 INFO mapred.JobClient:     Map output records=13
14/07/17 20:47:18 INFO mapred.JobClient:     Map output bytes=113
14/07/17 20:47:18 INFO mapred.JobClient:     Reduce shuffle bytes=73
14/07/17 20:47:18 INFO mapred.JobClient:     Combine input records=13
14/07/17 20:47:18 INFO mapred.JobClient:     Reduce input groups=8
14/07/17 20:47:18 INFO mapred.JobClient:   FileSystemCounters
14/07/17 20:47:18 INFO mapred.JobClient:     HDFS_BYTES_READ=61
14/07/17 20:47:18 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=322
14/07/17 20:47:18 INFO mapred.JobClient:     FILE_BYTES_READ=126
14/07/17 20:47:18 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=54
14/07/17 20:47:18 INFO mapred.JobClient:   Job Counters 
14/07/17 20:47:18 INFO mapred.JobClient:     Launched map tasks=2
14/07/17 20:47:18 INFO mapred.JobClient:     Launched reduce tasks=1
14/07/17 20:47:18 INFO mapred.JobClient:     Data-local map tasks=2
JIAS-MacBook-Pro:hadoop-0.20.2 jia$ 
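Three of the counters in the log can be recomputed by hand from the two input lines. The sketch below assumes one map task per input file (the log reports "Launched map tasks=2") and assumes the combiner merges duplicate words once within each map task's output; the helper names words and distinct are hypothetical.

```shell
#!/bin/sh
# Recompute three job counters locally from the two input lines.
line1="hello excuse me fine thank you"   # contents of text1.txt
line2="hello how do you do thank you"    # contents of text2.txt

# words: one token per line. distinct: number of different tokens on stdin.
words()    { tr -s '[:space:]' '\n' | sed '/^$/d'; }
distinct() { words | LC_ALL=C sort -u | wc -l | tr -d ' '; }

# Each token yields one (word, 1) pair from the mapper.
map_out=$(printf '%s\n' "$line1" "$line2" | words | wc -l | tr -d ' ')
# The combiner leaves one record per distinct word in each map task.
combine_out=$(( $(printf '%s\n' "$line1" | distinct) + $(printf '%s\n' "$line2" | distinct) ))
# The reducer emits one record per distinct word overall.
reduce_out=$(printf '%s\n' "$line1" "$line2" | distinct)

echo "Map output records=$map_out"
echo "Combine output records=$combine_out"
echo "Reduce output records=$reduce_out"
```

The three printed values can be compared with the lines of the same name in the job log above.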

8. View the result

JIAS-MacBook-Pro:hadoop-0.20.2 jia$ bin/hadoop dfs -ls
Found 2 items
drwxr-xr-x   - jia supergroup          0 2014-07-17 20:39 /user/jia/in
drwxr-xr-x   - jia supergroup          0 2014-07-17 20:47 /user/jia/output
JIAS-MacBook-Pro:hadoop-0.20.2 jia$ bin/hadoop dfs -ls ./output
Found 2 items
drwxr-xr-x   - jia supergroup          0 2014-07-17 20:46 /user/jia/output/_logs
-rw-r--r--   1 jia supergroup         54 2014-07-17 20:47 /user/jia/output/part-r-00000
JIAS-MacBook-Pro:hadoop-0.20.2 jia$ bin/hadoop dfs -cat ./output/*
do    2
excuse    1
fine    1
hello    2
how    1
me    1
thank    2
you    3
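cat: Source must be a file.

Note: the final "cat: Source must be a file." message is harmless. It appears because ./output/* also matches the _logs directory, which cannot be printed with -cat. Running bin/hadoop dfs -cat ./output/part-r-00000 prints only the result file.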

