Hadoop: hdfs commands in detail

Reposted article, 2019-09-24 20:15. Tag: hadoop

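This article covers the hadoop and hdfs commands; the yarn command will be covered in a later article.

The hadoop fs command is a generic file system client: it works with HDFS, the local file system, and the other file systems Hadoop supports. The hdfs dfs command is intended for HDFS only.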
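
As a rough illustration of the difference, the sketch below lists a path through the generic client and lets the URI scheme pick the file system. The function name list_path is hypothetical, and the hdfs:/// form assumes that fs.defaultFS in the client configuration is an hdfs:// URI.

```shell
# list_path is a hypothetical helper: it lists a path through the generic
# "hadoop fs" client, so the URI scheme decides which file system is used.
list_path() {
    # $1: a URI such as file:///tmp (local FS) or hdfs:///tmp (HDFS)
    hadoop fs -ls "$1"
}

# Usage (requires a Hadoop client on the PATH):
#   list_path file:///tmp    # local file system
#   list_path hdfs:///tmp    # HDFS, assuming fs.defaultFS is an hdfs:// URI
```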

I. hadoop commands

Usage: hadoop [--config confdir] COMMAND  # --config overrides the default configuration directory

## COMMAND is one of the following subcommands
fs                   run a generic filesystem user client
version              print the version
jar <jar>            run a jar file
checknative [-a|-h]  check native hadoop and compression libraries availability
distcp <srcurl> <desturl> copy file or directories recursively
archive -archiveName NAME -p <parent path> <src>* <dest> create a hadoop archive
classpath            prints the class path needed to get the
                     Hadoop jar and the required libraries
credential           interact with credential providers
daemonlog            get/set the log level for each daemon
s3guard              manage data on S3
trace                view and modify Hadoop tracing settings

1. archive

Creates a Hadoop archive. For details see http://hadoop.apache.org/docs/r2.7.0/hadoop-archives/HadoopArchives.html

Usage: hadoop archive -archiveName NAME -p <parent path> <src>* <dest>  # -p gives the parent path; several <src> paths relative to it may be listed

Example:

[hive@mwpl003 ~]$ hadoop fs -touchz /tmp/test/a.txt
[hive@mwpl003 ~]$ hadoop fs -ls /tmp/test/
Found 1 items
-rw-r--r--   3 hive supergroup          0 2019-09-18 13:50 /tmp/test/a.txt
[hive@mwpl003 ~]$ hadoop archive -archiveName test.har -p  /tmp/test/a.txt -r 3 /tmp/test
19/09/18 13:52:58 INFO mapreduce.JobSubmitter: number of splits:1
19/09/18 13:52:58 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1565571819971_6988
19/09/18 13:52:58 INFO impl.YarnClientImpl: Submitted application application_1565571819971_6988
19/09/18 13:52:58 INFO mapreduce.Job: The url to track the job: http://ip_address:8088/proxy/application_1565571819971_6988/
19/09/18 13:52:58 INFO mapreduce.Job: Running job: job_1565571819971_6988
19/09/18 13:53:04 INFO mapreduce.Job: Job job_1565571819971_6988 running in uber mode : false
19/09/18 13:53:04 INFO mapreduce.Job:  map 0% reduce 0%
19/09/18 13:53:08 INFO mapreduce.Job:  map 100% reduce 0%
19/09/18 13:53:13 INFO mapreduce.Job:  map 100% reduce 100%
19/09/18 13:53:13 INFO mapreduce.Job: Job job_1565571819971_6988 completed successfully
19/09/18 13:53:13 INFO mapreduce.Job: Counters: 49
        File System Counters
                FILE: Number of bytes read=80
                FILE: Number of bytes written=313823
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=264
                HDFS: Number of bytes written=69
                HDFS: Number of read operations=14
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=8
        Job Counters 
                Launched map tasks=1
                Launched reduce tasks=1
                Other local map tasks=1
                Total time spent by all maps in occupied slots (ms)=7977
                Total time spent by all reduces in occupied slots (ms)=12015
                Total time spent by all map tasks (ms)=2659
                Total time spent by all reduce tasks (ms)=2403
                Total vcore-milliseconds taken by all map tasks=2659
                Total vcore-milliseconds taken by all reduce tasks=2403
                Total megabyte-milliseconds taken by all map tasks=8168448
                Total megabyte-milliseconds taken by all reduce tasks=12303360
        Map-Reduce Framework
                Map input records=1
                Map output records=1
                Map output bytes=59
                Map output materialized bytes=76
                Input split bytes=97
                Combine input records=0
                Combine output records=0
                Reduce input groups=1
                Reduce shuffle bytes=76
                Reduce input records=1
                Reduce output records=0
                Spilled Records=2
                Shuffled Maps =1
                Failed Shuffles=0
                Merged Map outputs=1
                GC time elapsed (ms)=91
                CPU time spent (ms)=2320
                Physical memory (bytes) snapshot=1189855232
                Virtual memory (bytes) snapshot=11135381504
                Total committed heap usage (bytes)=3043491840
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters 
                Bytes Read=167
        File Output Format Counters 
                Bytes Written=0
[hive@mwpl003 ~]$ hadoop fs -ls /tmp/test/
Found 2 items
-rw-r--r--   3 hive supergroup          0 2019-09-18 13:50 /tmp/test/a.txt
drwxr-xr-x   - hive supergroup          0 2019-09-18 13:53 /tmp/test/test.har

[hive@mwpl003 ~]$ hadoop fs -ls /tmp/test/test.har/
Found 4 items
-rw-r--r--   3 hive supergroup          0 2019-09-18 13:53 /tmp/test/test.har/_SUCCESS
-rw-r--r--   3 hive supergroup         55 2019-09-18 13:53 /tmp/test/test.har/_index
-rw-r--r--   3 hive supergroup         14 2019-09-18 13:53 /tmp/test/test.har/_masterindex
-rw-r--r--   3 hive supergroup          0 2019-09-18 13:53 /tmp/test/test.har/part-0

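To unarchive, copy the contents out through the har:// file system (distcp copies in parallel, cp copies sequentially):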
hadoop distcp har:///tmp/test/test.har /tmp/test1
hdfs dfs -cp har:///tmp/test/test.har /tmp/test1
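
An archive can also be inspected without unpacking it. The following is a minimal sketch that assumes the archive lives on the default file system; list_har is a hypothetical name, not part of Hadoop.

```shell
# list_har is a hypothetical helper that lists the contents of a Hadoop
# archive through the har:// file system, without unpacking it.
list_har() {
    # $1: absolute path of the .har directory, e.g. /tmp/test/test.har
    hadoop fs -ls -R "har://$1"
}

# Usage (requires a Hadoop client on the PATH):
#   list_har /tmp/test/test.har
```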

2. checknative

Checks the availability of the native Hadoop and compression libraries. Most users rarely need it.

Usage: hadoop checknative [-a] [-h]
-a  check all libraries
-h  show help

3. classpath

Prints the class path needed to get the Hadoop jar and the required libraries.

Usage: hadoop classpath [--glob |--jar <path> |-h |--help]

4. credential

Manages credentials, passwords, and secrets within credential providers.

Usage: hadoop credential <subcommand> [options]

5. distcp (commonly used)

distcp is short for "distributed copy". It copies files within a cluster or between clusters, and it runs as a MapReduce job.

Usage: hadoop distcp [options] hdfs://source hdfs://dest
For details see: http://hadoop.apache.org/docs/r2.7.0/hadoop-distcp/DistCp.html

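Commonly used options:
-m <num_maps>    # maximum number of maps used for the copy; more maps do not necessarily mean higher throughput
-i               # ignore failures
-log <logdir>    # write logs to <logdir>
-update          # copy a file only if it is missing at the destination or differs from the source
-overwrite       # overwrite files at the destination
-filters <file>  # exclude paths that match the patterns listed in <file>, one pattern per line
-delete          # delete files that exist at the destination but not at the source; used together with -update or -overwrite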
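
The options can be combined to keep a destination directory in step with a source directory. This is only a sketch under assumptions: mirror_dir is a hypothetical name, and the fallback of 20 maps is an illustrative choice.

```shell
# mirror_dir is a hypothetical helper that makes <dst> match <src>:
# -update copies only missing or changed files, and -delete removes
# files that exist under <dst> but not under <src>.
mirror_dir() {
    # $1: source URI, $2: destination URI, $3: maximum number of maps (assumed fallback: 20)
    hadoop distcp -update -delete -m "${3:-20}" "$1" "$2"
}

# Usage (requires a Hadoop client on the PATH):
#   mirror_dir hdfs://nn1.example.com/data hdfs://nn2.example.com/data 10
```

Because -delete removes data at the destination, it is worth trying such a helper on a scratch directory first.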

6. fs

Behaves the same as hdfs dfs when the target file system is HDFS.

Help: hadoop fs -help

For details see: http://hadoop.apache.org/docs/r2.7.0/hadoop-project-dist/hadoop-common/FileSystemShell.html

It includes the following subcommands:

appendToFile, cat, checksum, chgrp, chmod, chown, copyFromLocal, copyToLocal, count, cp, createSnapshot, deleteSnapshot, df, du, expunge, find, get, getfacl, getfattr, getmerge, help, ls, mkdir, moveFromLocal, moveToLocal, mv, put, renameSnapshot, rm, rmdir, setfacl, setfattr, setrep, stat, tail, test, text, touchz

If you are familiar with the basic Linux commands, these are straightforward to use.

6.1. appendToFile

appendToFile  # appends one or more local files to a file in the distributed file system
Usage: hadoop fs -appendToFile <localsrc> ... <dst>
Example:
hadoop fs -appendToFile localfile1 localfile2 /user/hadoop/hadoopfile
hadoop fs -appendToFile - hdfs://nn.example.com/hadoop/hadoopfile  # reads from standard input and appends to hadoopfile; press Ctrl+D to end the input
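
Reading from standard input also works with a pipe, which is convenient in scripts. A small sketch, with append_line as a hypothetical helper name:

```shell
# append_line is a hypothetical helper: "-" makes appendToFile read from
# standard input, so a pipe can supply the data instead of a local file.
append_line() {
    # $1: destination file in HDFS, $2: text to append as one line
    printf '%s\n' "$2" | hadoop fs -appendToFile - "$1"
}

# Usage (requires a Hadoop client on the PATH):
#   append_line /user/hadoop/hadoopfile "one more line"
```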

6.2. cat

cat   # prints the contents of the files to standard output
Usage: hadoop fs -cat URI [URI ...]
Example:
hadoop fs -cat hdfs://nn1.example.com/file1 hdfs://nn2.example.com/file2
hadoop fs -cat file:///file3 /user/hadoop/file4

6.3. checksum

checksum  # returns the checksum information of a file
Usage: hadoop fs -checksum URI
Example:
[hive@mwpl003 ~]$  hadoop fs -checksum /tmp/test/test.txt
/tmp/test/test.txt      MD5-of-0MD5-of-512CRC32C        000002000000000000000000fde199c1517b7b26b0565ff6b0f46acc

6.4. chgrp

chgrp   # changes the group of files or directories
Usage: hadoop fs -chgrp [-R] GROUP URI [URI ...]

6.5. chmod

chmod  # changes the permissions of files or directories
Usage: hadoop fs -chmod [-R] <MODE[,MODE]... | OCTALMODE> URI [URI ...]

6.6. chown

chown  # changes the owner and group of files or directories
Usage: hadoop fs -chown [-R] [OWNER][:[GROUP]] URI [URI ...]

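6.7. copyFromLocal

copyFromLocal  # copies files or directories from the local file system to HDFS; like put, but the source must be local
Usage: hadoop fs -copyFromLocal [-f] <localsrc> URI  # -f overwrites the destination if it already exists
Example: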
hadoop fs -copyFromLocal start-hadoop.sh  /tmp

6.8. copyToLocal

copyToLocal  # copies files from HDFS to the local file system; like get, but the destination must be local
Usage: hadoop fs -copyToLocal [-ignorecrc] [-crc] URI <localdst>

6.9. count

count  # counts the directories, files, and bytes under the given paths
Usage: hadoop fs -count [-q] [-h] [-v] <paths>

6.10. cp

cp     # copies files from source to destination; with several sources the destination must be a directory
Usage: hadoop fs -cp [-f] [-p | -p[topax]] URI [URI ...] <dest>
Example:
hadoop fs -cp /user/hadoop/file1 /user/hadoop/file2
hadoop fs -cp /user/hadoop/file1 /user/hadoop/file2 /user/hadoop/dir

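6.11. Snapshot commands

createSnapshot  # creates a snapshot
deleteSnapshot  # deletes a snapshot
For details see: http://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-hdfs/HdfsSnapshots.html
HDFS snapshots are read-only point-in-time copies of the file system. A snapshot can be taken of a subtree or of the entire file system. Common use cases are data backup, protection against user errors, and disaster recovery.
Before a snapshot can be created, an administrator must mark the directory as snapshottable, which allows snapshots to be created in it.
hdfs dfsadmin -allowSnapshot <path>  # allow snapshots of <path>
hdfs dfsadmin -disallowSnapshot <path>  # disallow snapshots of <path>
hdfs dfs -ls /foo/.snapshot  # list all snapshots of the snapshottable directory /foo
hdfs dfs -createSnapshot <path> [<snapshotName>]  # create a snapshot; the default name is derived from the timestamp
hdfs dfs -deleteSnapshot <path> <snapshotName>  # delete a snapshot
hdfs dfs -renameSnapshot <path> <oldName> <newName>  # rename a snapshot
hdfs lsSnapshottableDir  # list the snapshottable directories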
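
Put together, the commands above form a short workflow. The sketch below assumes the caller has administrator rights for the -allowSnapshot step; take_snapshot is a hypothetical name.

```shell
# take_snapshot is a hypothetical helper: it marks a directory as
# snapshottable (needs administrator rights), creates a named snapshot,
# and lists the snapshots of that directory.
take_snapshot() {
    # $1: HDFS directory, $2: snapshot name
    hdfs dfsadmin -allowSnapshot "$1" &&
    hdfs dfs -createSnapshot "$1" "$2" &&
    hdfs dfs -ls "$1/.snapshot"
}

# Usage (requires an HDFS client on the PATH):
#   take_snapshot /foo before-cleanup
```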

6.12. df

df  # displays the free space and usage of the file system
Usage: hadoop fs -df [-h] URI [URI ...]

6.13. du

du  # displays the sizes of the files and directories contained in the given directory
Usage: hadoop fs -du [-s] [-h] URI [URI ...]
Example:
hadoop fs -du /user/hadoop/dir1 /user/hadoop/file1 hdfs://nn.example.com/user/hadoop/dir1

6.14. expunge

expunge  # empties the trash (use with care)
Usage: hadoop fs -expunge

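6.15. find

find   # finds files that match the given expression
Usage: hadoop fs -find <path> ... <expression> ...
-name pattern   # matches the file basename against the pattern
-iname pattern  # same as -name, but case-insensitive
-print          # always true; writes the current pathname to standard output
-print0         # like -print, but appends an ASCII NUL character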
Example:
hadoop fs -find / -name test -print

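6.16. get

get  # copies files to the local file system, like copyToLocal; -ignorecrc copies files that fail the CRC check, and -crc copies the CRCs as well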
Usage: hadoop fs -get [-ignorecrc] [-crc] <src> <localdst>
Example:
hadoop fs -get /tmp/input/hadoop/*.xml /home/hadoop/testdir/

6.17. getfacl

getfacl  # displays the ACLs of files and directories
Usage: hadoop fs -getfacl [-R] <path>
[hive@mwpl003 ~]$ hadoop fs -getfacl -R  /tmp/test
# file: /tmp/test
# owner: hive
# group: supergroup
getfacl: The ACL operation has been rejected.  Support for ACLs has been disabled by setting dfs.namenode.acls.enabled to false.

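6.18. getfattr

getfattr  # displays the extended attribute names and values of a file or directory
Usage: hadoop fs -getfattr [-R] -n name | -d [-e en] <path>
-n name and -d are mutually exclusive:
-n name dumps the named attribute, and -d dumps all attributes.
-R lists the attributes of all files and directories recursively.
-e en encodes the retrieved values; valid encodings are "text", "hex", and "base64".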
Examples:
hadoop fs -getfattr -d /file
hadoop fs -getfattr -R -n user.myAttr /dir

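6.19. getmerge

getmerge  # concatenates the files under a source directory into a single local file
Usage: hadoop fs -getmerge [-nl] <src> <localdst>  # -nl adds a newline at the end of each file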
hadoop fs -getmerge   /src  /opt/output.txt
hadoop fs -getmerge  /src/file1.txt /src/file2.txt  /output.txt

6.20. ls

ls   # lists files and directories
Usage: hadoop fs -ls [-d] [-h] [-R] [-t] [-S] [-r] [-u] <args>

6.21. mkdir

mkdir  # creates directories; -p also creates missing parent directories
Usage: hadoop fs -mkdir [-p] <paths>
Example:
hadoop fs -mkdir /user/hadoop/dir1 /user/hadoop/dir2
hadoop fs -mkdir hdfs://nn1.example.com/user/hadoop/dir hdfs://nn2.example.com/user/hadoop/dir

6.22. moveFromLocal

moveFromLocal  # like put, except that the local source is deleted after it has been copied to HDFS
Usage: hadoop fs -moveFromLocal <localsrc> <dst>

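6.23. moveToLocal

moveToLocal   # meant to move files from HDFS to the local file system; in the Hadoop 2.x releases documented here it only prints a "Not implemented yet" message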
Usage: hadoop fs -moveToLocal [-crc] <src> <dst>

6.24. mv

mv   # moves files; several sources may be given, in which case the destination must be a directory
Usage: hadoop fs -mv URI [URI ...] <dest>
Example:
hadoop fs -mv /user/hadoop/file1 /user/hadoop/file2
hadoop fs -mv hdfs://nn.example.com/file1 hdfs://nn.example.com/file2 hdfs://nn.example.com/file3 hdfs://nn.example.com/dir1

6.25. put

put  # copies one or more local files to HDFS
Usage: hadoop fs -put <localsrc> ... <dst>
hadoop fs -put localfile hdfs://nn.example.com/hadoop/hadoopfile
hadoop fs -put - hdfs://nn.example.com/hadoop/hadoopfile  #Reads the input from stdin.

6.26. rm

rm  # deletes files; -r deletes directories recursively, and -skipTrash bypasses the trash
Usage: hadoop fs -rm [-f] [-r |-R] [-skipTrash] URI [URI ...]

6.27. rmdir

rmdir  # deletes an empty directory
Usage: hadoop fs -rmdir [--ignore-fail-on-non-empty] URI [URI ...]

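6.28. setfacl

setfacl  # sets the ACLs of files and directories
Usage: hadoop fs -setfacl [-R] [-b |-k -m |-x <acl_spec> <path>] |[--set <acl_spec> <path>]
-b  removes all but the base ACL entries; the entries for user, group, and others are retained
-k  removes the default ACL
-R  applies the operation recursively
-m  modifies the ACL: new entries are added and existing entries are retained
-x  removes the specified ACL entries
--set  fully replaces the ACL, discarding all existing entries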
Examples:
hadoop fs -setfacl -m user:hadoop:rw- /file
hadoop fs -setfacl -x user:hadoop /file
hadoop fs -setfacl -b /file
hadoop fs -setfacl -k /dir
hadoop fs -setfacl --set user::rw-,user:hadoop:rw-,group::r--,other::r-- /file
hadoop fs -setfacl -R -m user:hadoop:r-x /dir
hadoop fs -setfacl -m default:user:hadoop:r-x /dir

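6.29. setfattr

setfattr  # sets an extended attribute name and value for a file or directory
Usage: hadoop fs -setfattr -n name [-v value] | -x name <path>
-n name  the extended attribute name
-v value  the extended attribute value
-x name  removes the extended attribute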
Examples:
hadoop fs -setfattr -n user.myAttr -v myValue /file
hadoop fs -setfattr -n user.noValue /file
hadoop fs -setfattr -x user.myAttr /file

6.30. setrep

setrep  # changes the replication factor of a file; -w waits for the replication to complete
Usage: hadoop fs -setrep [-R] [-w] <numReplicas> <path>
Example:
hadoop fs -setrep -w 3 /user/hadoop/dir1

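6.31. stat

stat  # prints statistics about a file or directory in the given format (for example type, owner, size, and modification time)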
Usage: hadoop fs -stat [format] <path> ...
Example:
hadoop fs -stat "%F %u:%g %b %y %n" /file

6.32. tail

tail  # displays the last kilobyte of the file on standard output; -f keeps printing data as the file grows
Usage: hadoop fs -tail [-f] URI

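6.33. test

test  # tests a path; the result is reported through the exit status (0 if the test is true)
Usage: hadoop fs -test -[defsz] URI
-d  true if the path is a directory
-e  true if the path exists
-f  true if the path is a file
-s  true if the path is not empty
-z  true if the file has zero length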
Example:
hadoop fs -test -e filename
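
Since -test reports its result only through the exit status, it is typically used in shell conditionals. A minimal sketch, with path_kind as a hypothetical helper name:

```shell
# path_kind is a hypothetical helper built on the exit status of
# "hadoop fs -test": status 0 means the tested condition holds.
path_kind() {
    # $1: path to inspect
    if hadoop fs -test -d "$1"; then
        echo directory
    elif hadoop fs -test -f "$1"; then
        echo file
    else
        echo missing
    fi
}

# Usage (requires a Hadoop client on the PATH):
#   path_kind /tmp/test/a.txt
```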

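6.34. text

text  # outputs a source file in text format; useful for viewing compressed files (supported formats are zip and TextRecordInputStream)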
Usage: hadoop fs -text <src>

6.35. touchz

touchz  # creates a zero-length file
Usage: hadoop fs -touchz URI [URI ...]

7. jar

jar  # runs a jar file
Usage: hadoop jar <jar> [mainClass] args...
Example:
hadoop jar ./test/wordcount/wordcount.jar org.codetree.hadoop.v1.WordCount /test/chqz/input /test/chqz/output
The parts of this command mean the following:
(1) hadoop: the shell script under ${HADOOP_HOME}/bin.
(2) jar: the COMMAND argument expected by the hadoop script.
(3) ./test/wordcount/wordcount.jar: the path of the jar to run on the local file system, passed to the RunJar class.
(4) org.codetree.hadoop.v1.WordCount: the class that contains the main method, passed to the RunJar class.
(5) /test/chqz/input: passed to the WordCount class as the DFS path of the input data.
(6) /test/chqz/output: passed to the WordCount class as the DFS path for the output data.
Hadoop recommends yarn jar over hadoop jar. For details see: http://hadoop.apache.org/docs/r2.8.0/hadoop-yarn/hadoop-yarn-site/YarnCommands.html#jar

8. key

key  # manages keys via the KeyProvider; rarely needed

9. trace

trace  # views and modifies the Hadoop tracing settings
For details see: http://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-common/Tracing.html

II. hdfs commands

The hdfs command has the following subcommands:

User Commands: classpath, dfs, fetchdt, fsck, getconf, groups, lsSnapshottableDir, jmxget, oev, oiv, oiv_legacy, snapshotDiff, version
Administration Commands: balancer, cacheadmin, crypto, datanode, dfsadmin, haadmin, journalnode, mover, namenode, nfs3, portmap, secondarynamenode, storagepolicies, zkfc
Debug Commands: verifyMeta, computeMeta, recoverLease

Not all of them are covered in detail here. For the full reference see: http://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html

1. classpath

classpath  # prints the class path needed to get the Hadoop jar and the required libraries
Usage: hdfs classpath [--glob |--jar <path> |-h |--help]

2. dfs

dfs  # same as the hadoop fs command described in part I

3. fetchdt

fetchdt  # gets a delegation token from a NameNode
Usage: hdfs fetchdt <opts> <token_file_path>
For details see: http://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html#fetchdt

4. fsck (important)

hdfs fsck <path>
          [-list-corruptfileblocks |
          [-move | -delete | -openforwrite]
          [-files [-blocks [-locations | -racks | -replicaDetails]]]
          [-includeSnapshots]
          [-storagepolicies] [-blockId <blk_Id>]

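-delete    Delete corrupted files.
-files    Print out the files being checked.
-files -blocks    Print out the block report.
-files -blocks -locations    Print out locations for every block.
-files -blocks -racks    Print out the network topology for DataNode locations.
-files -blocks -replicaDetails    Print out the details of each replica.
-includeSnapshots    Include snapshot data if the given path indicates a snapshottable directory or there are snapshottable directories under it.
-list-corruptfileblocks    Print out the list of missing blocks and the files they belong to.
-move    Move corrupted files to /lost+found.
-openforwrite    Print out files opened for write.
-storagepolicies    Print out a storage policy summary for the blocks.
-blockId    Print out information about the block.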
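
In monitoring scripts fsck is often reduced to a yes/no answer. The sketch below is one way to do that; it assumes that the fsck report contains the words "is HEALTHY" when no problems were found, and is_healthy is a hypothetical name.

```shell
# is_healthy is a hypothetical helper. It assumes that the fsck report
# contains "is HEALTHY" when no problems were found under the path.
is_healthy() {
    # $1: HDFS path to check
    hdfs fsck "$1" | grep -q 'is HEALTHY'
}

# Usage (requires an HDFS client on the PATH):
#   is_healthy / && echo "no corrupt or missing blocks reported"
```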

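5. getconf (important)

hdfs getconf -namenodes  # lists the NameNodes in the cluster
hdfs getconf -secondaryNameNodes  # lists the secondary NameNodes in the cluster
hdfs getconf -backupNodes  # lists the backup nodes in the cluster
hdfs getconf -includeFile  # prints the path of the include file that defines the DataNodes allowed to join the cluster
hdfs getconf -excludeFile  # prints the path of the exclude file that defines the DataNodes to be decommissioned
hdfs getconf -nnRpcAddresses  # prints the NameNode RPC addresses
hdfs getconf -confKey [key]  # prints the value of a specific configuration key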
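
getconf is handy for driving scripts from the cluster configuration instead of hard-coded host names. A sketch, assuming that -namenodes prints the hosts separated by whitespace; list_namenodes is a hypothetical name.

```shell
# list_namenodes is a hypothetical helper that prints one NameNode host
# per line. It assumes "hdfs getconf -namenodes" separates hosts by whitespace.
list_namenodes() {
    for nn in $(hdfs getconf -namenodes); do
        printf '%s\n' "$nn"
    done
}

# Usage (requires an HDFS client on the PATH):
#   list_namenodes | while read -r host; do echo "checking $host"; done
```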

6. groups

groups  # returns the groups of the given users
Usage: hdfs groups [username ...]

7. lsSnapshottableDir

lsSnapshottableDir  # lists the snapshottable directories
Usage: hdfs lsSnapshottableDir [-help]

8. jmxget

jmxget  # dumps JMX information from a service
Usage: hdfs jmxget [-localVM ConnectorURL | -port port | -server mbeanserver | -service service]

9. oev

oev  # Offline Edits Viewer
Usage: hdfs oev [OPTIONS] -i INPUT_FILE -o OUTPUT_FILE

10. oiv

oiv  # Offline Image Viewer
Usage: hdfs oiv [OPTIONS] -i INPUT_FILE

11. snapshotDiff

snapshotDiff  # reports the differences between two snapshots
Usage: hdfs snapshotDiff <path> <fromSnapshot> <toSnapshot>
For details see: http://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-hdfs/HdfsSnapshots.html#Get_Snapshots_Difference_Report

12. balancer (important)

balancer  # runs the cluster balancing utility
hdfs balancer
          [-threshold <threshold>]
          [-policy <policy>]
          [-exclude [-f <hosts-file> | <comma-separated list of hosts>]]
          [-include [-f <hosts-file> | <comma-separated list of hosts>]]
          [-source [-f <hosts-file> | <comma-separated list of hosts>]]
          [-blockpools <comma-separated list of blockpool ids>]
          [-idleiterations <idleiterations>]
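-policy <policy>    datanode (default): the cluster is balanced if each DataNode is balanced.
                    blockpool: the cluster is balanced if each block pool in each DataNode is balanced.
-threshold <threshold>    Percentage of disk capacity. This overrides the default threshold.
-exclude -f <hosts-file> | <comma-separated list of hosts>    Excludes the specified DataNodes from being balanced by the balancer.
-include -f <hosts-file> | <comma-separated list of hosts>    Includes only the specified DataNodes to be balanced by the balancer.
-source -f <hosts-file> | <comma-separated list of hosts>    Picks only the specified DataNodes as source nodes.
-blockpools <comma-separated list of blockpool ids>    The balancer runs only on the block pools in this list.
-idleiterations <iterations>    Maximum number of idle iterations before exit. This overrides the default (5).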
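
A balancer run is often preceded by raising the bandwidth each DataNode may use for balancing, through hdfs dfsadmin -setBalancerBandwidth. The following is a sketch only; run_balancer is a hypothetical name and the values in the usage line are illustrative.

```shell
# run_balancer is a hypothetical helper: it sets the per-DataNode balancing
# bandwidth (the new value is not persistent on the DataNodes) and then
# starts the balancer with the given threshold.
run_balancer() {
    # $1: bandwidth in bytes per second, $2: threshold in percent of disk capacity
    hdfs dfsadmin -setBalancerBandwidth "$1" &&
    hdfs balancer -threshold "$2"
}

# Usage (requires an HDFS client on the PATH and administrator rights):
#   run_balancer 10485760 5    # 10 MB/s per DataNode, 5 percent threshold
```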

13. cacheadmin and crypto

cacheadmin  # manages centralized cache directives and pools
Usage: hdfs cacheadmin -addDirective -path <path> -pool <pool-name> [-force] [-replication <replication>] [-ttl <time-to-live>]
For details see: http://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-hdfs/CentralizedCacheManagement.html

crypto  # manages encryption zones
Usage:
  hdfs crypto -createZone -keyName <keyName> -path <path>
  hdfs crypto -listZones
  hdfs crypto -provisionTrash -path <path>
  hdfs crypto -help <command-name>

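14. datanode

datanode  # runs an HDFS DataNode
Usage: hdfs datanode [-regular | -rollback | -rollingupgrade rollback]
-regular    Normal DataNode startup (default).
-rollback    Rolls the DataNode back to the previous version. Use this after stopping the DataNode and distributing the old Hadoop version.
-rollingupgrade rollback    Rolls back a rolling upgrade operation.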

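15. dfsadmin (important)

hdfs dfsadmin [GENERIC_OPTIONS]
          [-report [-live] [-dead] [-decommissioning]]   # reports basic file system information and statistics, including the raw space used by replication, checksums, snapshots, etc. on all DataNodes
          [-safemode enter | leave | get | wait | forceExit]  # safe mode maintenance command
           # The NameNode enters safe mode automatically at startup and leaves it automatically once the configured minimum percentage of blocks satisfies the minimum replication condition.
           # If the NameNode detects an anomaly, it stays in safe mode until the issue is resolved. If the anomaly is the consequence of a deliberate action, the administrator can leave safe mode with -safemode forceExit.
          [-saveNamespace]  # saves the current namespace into the storage directories and resets the edits log; requires safe mode
          [-rollEdits]  # rolls the edit log on the active NameNode
          [-restoreFailedStorage true |false |check]  # turns the automatic attempt to restore failed storage replicas on or off. If a failed storage becomes available again,
          # the system attempts to restore the edits and the fsimage during a checkpoint. The check option returns the current setting.
          [-refreshNodes]  # re-reads the hosts and exclude files to update the set of DataNodes that are allowed to connect to the NameNode and the set that should be decommissioned or recommissioned
          [-setQuota <quota> <dirname>...<dirname>]
          [-clrQuota <dirname>...<dirname>]
          [-setSpaceQuota <quota> [-storageType <storagetype>] <dirname>...<dirname>]
          [-clrSpaceQuota [-storageType <storagetype>] <dirname>...<dirname>]
          [-finalizeUpgrade]  # finalizes the HDFS upgrade: the DataNodes delete their previous-version working directories, and then the NameNode does the same. This completes the upgrade process
          [-rollingUpgrade [<query> |<prepare> |<finalize>]]
          [-metasave filename]  # saves the NameNode's primary data structures to filename in the directory specified by the hadoop.log.dir property. An existing file is overwritten.
          # The file lists the DataNodes heartbeating with the NameNode, the blocks waiting to be replicated, the blocks currently being replicated, and the blocks waiting to be deleted
          [-refreshServiceAcl]  # reloads the service-level authorization policy file
          [-refreshUserToGroupsMappings]  # refreshes the user-to-groups mappings
          [-refreshSuperUserGroupsConfiguration]  # refreshes the superuser proxy groups mappings
          [-refreshCallQueue]  # reloads the call queue from the configuration
          [-refresh <host:ipc_port> <key> [arg1..argn]]  # triggers a runtime refresh of the resource specified by <key> on <host:ipc_port>. All other arguments after it are sent to the host
          [-reconfig <datanode |...> <host:ipc_port> <start |status>]  # starts a reconfiguration or gets the status of an ongoing one. The second parameter specifies the node type. Currently only reloading a DataNode's configuration is supported
          [-printTopology]  # prints a tree of the racks and their nodes as reported by the NameNode
          [-refreshNamenodes datanodehost:port]  # for the given DataNode, reloads the configuration files, stops serving the removed block pools, and starts serving new block pools
          [-deleteBlockPool datanode-host:port blockpoolId [force]]  # with force, deletes the block pool directory for the given block pool id on the given DataNode along with its contents; otherwise the directory is deleted only if it is empty.
          # The command fails if the DataNode is still serving the block pool
          [-setBalancerBandwidth <bandwidth in bytes per second>]  # changes the network bandwidth used by each DataNode during HDFS block balancing. <bandwidth> is the maximum number of bytes per second that each DataNode will use.
          # This value overrides the dfs.balance.bandwidthPerSec parameter. Note: the new value is not persistent on the DataNode
          [-getBalancerBandwidth <datanode_host:ipc_port>]  # gets the network bandwidth (in bytes per second) for the given DataNode. This is the maximum network bandwidth the DataNode uses during HDFS block balancing
          [-allowSnapshot <snapshotDir>]  # allows snapshots of the directory to be created
          [-disallowSnapshot <snapshotDir>]  # disallows snapshots of the directory
          [-fetchImage <local directory>]  # downloads the most recent fsimage from the NameNode and saves it in the specified local directory
          [-shutdownDatanode <datanode_host:ipc_port> [upgrade]]  # submits a shutdown request for the given DataNode
          [-getDatanodeInfo <datanode_host:ipc_port>]  # gets information about the given DataNode
          [-evictWriters <datanode_host:ipc_port>]  # makes the DataNode evict all clients that are writing a block. This is useful if decommissioning hangs because of slow writers
          [-triggerBlockReport [-incremental] <datanode_host:ipc_port>]  # triggers a block report for the given DataNode. With -incremental it is an incremental report, otherwise a full block report
          [-help [cmd]]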
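
Because -saveNamespace requires safe mode, the two commands are usually run as one sequence. A minimal sketch of that sequence, assuming administrator rights; checkpoint_namespace is a hypothetical name.

```shell
# checkpoint_namespace is a hypothetical helper. -saveNamespace requires
# safe mode, so the function enters safe mode, saves, and leaves again.
checkpoint_namespace() {
    hdfs dfsadmin -safemode enter || return 1
    status=0
    hdfs dfsadmin -saveNamespace || status=$?
    # Leave safe mode even if the save failed, then report the save status.
    hdfs dfsadmin -safemode leave
    return "$status"
}

# Usage (requires an HDFS client on the PATH and administrator rights):
#   checkpoint_namespace || echo "saveNamespace failed"
```

While the NameNode is in safe mode the file system is read-only, so clients that write to HDFS will see errors for the duration of the save.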

16. haadmin (important)

hdfs haadmin -checkHealth <serviceId>  # checks the health of the given NameNode
hdfs haadmin -failover [--forcefence] [--forceactive] <serviceId> <serviceId>  # initiates a failover between two NameNodes
hdfs haadmin -getServiceState <serviceId>  # determines whether the given NameNode is active or standby
hdfs haadmin -help <command>
hdfs haadmin -transitionToActive <serviceId> [--forceactive]  # transitions the given NameNode to the active state
hdfs haadmin -transitionToStandby <serviceId>  # transitions the given NameNode to the standby state
For details see: http://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithNFS.html
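
A common scripting task is finding out which NameNode is currently active. The sketch below assumes that -getServiceState prints exactly "active" or "standby"; active_namenode and the service IDs nn1 and nn2 are hypothetical.

```shell
# active_namenode is a hypothetical helper that prints the first service ID
# whose state is reported as "active" by "hdfs haadmin -getServiceState".
active_namenode() {
    # $@: NameNode service IDs, e.g. nn1 nn2
    for id in "$@"; do
        if [ "$(hdfs haadmin -getServiceState "$id" 2>/dev/null)" = "active" ]; then
            printf '%s\n' "$id"
            return 0
        fi
    done
    return 1
}

# Usage (requires an HDFS client on the PATH):
#   active_namenode nn1 nn2
```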

17. journalnode

journalnode  # starts a JournalNode for HDFS high availability with QJM
Usage: hdfs journalnode

18. mover

Usage: hdfs mover [-p <files/dirs> | -f <local file name>]
-f  a local file that contains the list of HDFS files/directories to migrate
-p  a space-separated list of HDFS files/directories to migrate
For details see: http://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html

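19. namenode

namenode  # runs the NameNode
hdfs namenode [-backup] |  # starts a backup node
         [-checkpoint] |  # starts a checkpoint node
         [-format [-clusterid cid ] [-force] [-nonInteractive] ] |  # formats the specified NameNode: it starts the NameNode,
         # formats it, and then shuts it down. -force formats even if the name directory exists. -nonInteractive aborts if the name directory exists, unless -force is specified
         [-upgrade [-clusterid cid] [-renameReserved<k-v pairs>] ] |  # the NameNode should be started with the upgrade option after a new Hadoop version has been distributed
         [-upgradeOnly [-clusterid cid] [-renameReserved<k-v pairs>] ] |  # upgrades the specified NameNode and then shuts it down
         [-rollback] |  # rolls the NameNode back to the previous version. Use this after stopping the cluster and distributing the old Hadoop version
         [-rollingUpgrade <rollback |started> ] |  # rolling upgrade; for details see: http://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html
         [-finalize] |  # no longer supported; use dfsadmin -finalizeUpgrade instead
         [-importCheckpoint] |  # loads the image from a checkpoint directory and saves it into the current one. The checkpoint directory is read from the property dfs.namenode.checkpoint.dir
         [-initializeSharedEdits] |  # formats a new shared edits directory and copies in enough edit log segments so that the standby NameNode can start up
         [-bootstrapStandby [-force] [-nonInteractive] [-skipSharedEditsCheck] ] |  # bootstraps the standby NameNode's storage directories by copying the latest namespace snapshot from the active NameNode
         [-recover [-force] ] |  # recovers lost metadata on a corrupt file system
         [-metadataVersion ]  # verifies that the configured directories exist, then prints the metadata versions of the software and the image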

20. secondarynamenode

Usage: hdfs secondarynamenode [-checkpoint [force]] | [-format] | [-geteditsize]
-checkpoint [force]    Checkpoints the SecondaryNameNode if the EditLog size >= fs.checkpoint.size. With force, checkpoints irrespective of the EditLog size.
-format    Formats the local storage during startup.
-geteditsize    Prints the number of uncheckpointed transactions on the NameNode.

21. storagepolicies

storagepolicies  # lists all storage policies
Usage: hdfs storagepolicies
For details see: http://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html

22. zkfc

Usage: hdfs zkfc [-formatZK [-force] [-nonInteractive]]
-formatZK    Formats the ZooKeeper instance.
-force    Formats the znode if the znode exists.
-nonInteractive    Aborts formatting the znode if the znode exists, unless -force is specified.
-h    Display help

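23. verifyMeta

verifyMeta  # verifies HDFS metadata and block files. If a block file is specified, the checksums in the metadata file are checked against the block file
Usage: hdfs debug verifyMeta -meta <metadata-file> [-block <block-file>]
-block block-file    Optional absolute path of the block file on the local file system of the DataNode.
-meta metadata-file    Absolute path of the metadata file on the local file system of the DataNode.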

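24. computeMeta

computeMeta  # computes HDFS metadata from a block file: the checksums are computed from the block file and saved to the specified output metadata file
Usage: hdfs debug computeMeta -block <block-file> -out <output-metadata-file>
-block block-file    Absolute path of the block file on the local file system of the DataNode.
-out output-metadata-file    Absolute path of the output metadata file that stores the checksums computed from the block file.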

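25. recoverLease

recoverLease  # recovers the lease on the specified path. The path must reside on an HDFS file system. The default number of retries is 1
Usage: hdfs debug recoverLease -path <path> [-retries <num-retries>]
[-path path]    HDFS path for which to recover the lease.
[-retries num-retries]    Number of times the client retries calling recoverLease. The default number of retries is 1.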

For more articles on the Hadoop ecosystem, see the Hadoop ecosystem series.
