Performing HDFS Operations with the dfsadmin Utility
Author: 尹正傑 (Yin Zhengjie)
Copyright notice: This is an original work; reproduction without permission is prohibited and will be pursued as a legal matter.
1. Overview of hdfs dfsadmin
The hdfs dfsadmin command lets you monitor and manage HDFS from the command line. While the hdfs dfs command also manages HDFS files and directories, dfsadmin is dedicated to HDFS-specific administrative tasks.

[root@hadoop101.yinzhengjie.com ~]# hdfs dfsadmin
Usage: hdfs dfsadmin
Note: Administrative commands can only be run as the HDFS superuser.
	[-report [-live] [-dead] [-decommissioning] [-enteringmaintenance] [-inmaintenance]]
	[-safemode <enter | leave | get | wait>]
	[-saveNamespace]
	[-rollEdits]
	[-restoreFailedStorage true|false|check]
	[-refreshNodes]
	[-setQuota <quota> <dirname>...<dirname>]
	[-clrQuota <dirname>...<dirname>]
	[-setSpaceQuota <quota> [-storageType <storagetype>] <dirname>...<dirname>]
	[-clrSpaceQuota [-storageType <storagetype>] <dirname>...<dirname>]
	[-finalizeUpgrade]
	[-rollingUpgrade [<query|prepare|finalize>]]
	[-refreshServiceAcl]
	[-refreshUserToGroupsMappings]
	[-refreshSuperUserGroupsConfiguration]
	[-refreshCallQueue]
	[-refresh <host:ipc_port> <key> [arg1..argn]
	[-reconfig <namenode|datanode> <host:ipc_port> <start|status|properties>]
	[-printTopology]
	[-refreshNamenodes datanode_host:ipc_port]
	[-getVolumeReport datanode_host:ipc_port]
	[-deleteBlockPool datanode_host:ipc_port blockpoolId [force]]
	[-setBalancerBandwidth <bandwidth in bytes per second>]
	[-getBalancerBandwidth <datanode_host:ipc_port>]
	[-fetchImage <local directory>]
	[-allowSnapshot <snapshotDir>]
	[-disallowSnapshot <snapshotDir>]
	[-shutdownDatanode <datanode_host:ipc_port> [upgrade]]
	[-evictWriters <datanode_host:ipc_port>]
	[-getDatanodeInfo <datanode_host:ipc_port>]
	[-metasave filename]
	[-triggerBlockReport [-incremental] <datanode_host:ipc_port>]
	[-listOpenFiles]
	[-help [cmd]]

Generic options supported are:
-conf <configuration file>           specify an application configuration file
-D <property=value>                  define a value for a given property
-fs <file:///|hdfs://namenode:port>  specify default filesystem URL to use, overrides 'fs.defaultFS' property from configurations.
-jt <local|resourcemanager:port>     specify a ResourceManager
-files <file1,...>                   specify a comma-separated list of files to be copied to the map reduce cluster
-libjars <jar1,...>                  specify a comma-separated list of jar files to be included in the classpath
-archives <archive1,...>             specify a comma-separated list of archives to be unarchived on the compute machines

The general command line syntax is:
command [genericOptions] [commandOptions]

[root@hadoop101.yinzhengjie.com ~]#
2. The dfsadmin -report command
The dfsadmin tool can be used to check the health of an HDFS cluster. The dfsadmin -report command prints basic cluster statistics, including the status of the NameNode and DataNodes, configured disk capacity, and the health of the data blocks.

At the cluster level and for each individual DataNode, dfsadmin -report shows the following (an example run appears below):
(1) a summary of HDFS storage allocation, including configured, used, and remaining space;
(2) if centralized HDFS caching is configured, the used and remaining cache percentages;
(3) missing, corrupt, and under-replicated blocks.

[root@hadoop101.yinzhengjie.com ~]# hdfs dfsadmin -report
Configured Capacity: 16493959577600 (15.00 TB)    # Configured HDFS capacity of this cluster
Present Capacity: 16493959577600 (15.00 TB)       # Capacity currently present in this cluster
DFS Remaining: 16493167906816 (15.00 TB)          # Remaining capacity in this cluster
DFS Used: 791670784 (755.00 MB)                   # Storage used by HDFS
DFS Used%: 0.00%                                  # Same as above, expressed as a percentage
Under replicated blocks: 16                       # Shows whether there are any under-replicated blocks
Blocks with corrupt replicas: 0                   # Blocks with corrupt replicas
Missing blocks: 0                                 # Missing blocks
Missing blocks (with replication factor 1): 0     # Missing blocks with a replication factor of 1
Pending deletion blocks: 0                        # Blocks pending deletion

-------------------------------------------------
Live datanodes (2):    # How many DataNodes in the cluster are alive and available. This cluster has 3 DataNodes, but only 2 are working normally; the NameNode Web UI shows the same thing.

Name: 172.200.6.102:50010 (hadoop102.yinzhengjie.com)    # IP address and port of the DataNode
Hostname: hadoop102.yinzhengjie.com                      # Hostname of the DataNode
Rack: /rack001                                           # Rack this DataNode belongs to
Decommission Status : Normal                             # Decommission status of the DataNode
Configured Capacity: 8246979788800 (7.50 TB)             # Configured capacity of the DataNode
DFS Used: 395841536 (377.50 MB)                          # Capacity used on the DataNode
Non DFS Used: 0 (0 B)                                    # Capacity consumed by non-HDFS data
DFS Remaining: 8246583947264 (7.50 TB)                   # Remaining capacity
DFS Used%: 0.00%                                         # Used capacity as a percentage
DFS Remaining%: 100.00%                                  # Remaining capacity as a percentage
Configured Cache Capacity: 32000000 (30.52 MB)           # Cache usage
Cache Used: 319488 (312 KB)
Cache Remaining: 31680512 (30.21 MB)
Cache Used%: 1.00%
Cache Remaining%: 99.00%
Xceivers: 2
Last contact: Mon Aug 17 05:08:10 CST 2020
Last Block Report: Mon Aug 17 04:18:40 CST 2020


Name: 172.200.6.103:50010 (hadoop103.yinzhengjie.com)
Hostname: hadoop103.yinzhengjie.com
Rack: /rack002
Decommission Status : Normal
Configured Capacity: 8246979788800 (7.50 TB)
DFS Used: 395829248 (377.49 MB)
Non DFS Used: 0 (0 B)
DFS Remaining: 8246583959552 (7.50 TB)
DFS Used%: 0.00%
DFS Remaining%: 100.00%
Configured Cache Capacity: 32000000 (30.52 MB)
Cache Used: 0 (0 B)
Cache Remaining: 32000000 (30.52 MB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 2
Last contact: Mon Aug 17 05:08:10 CST 2020
Last Block Report: Mon Aug 17 01:43:05 CST 2020


Dead datanodes (1):

Name: 172.200.6.104:50010 (hadoop104.yinzhengjie.com)
Hostname: hadoop104.yinzhengjie.com
Rack: /rack002
Decommission Status : Normal
Configured Capacity: 8246979788800 (7.50 TB)
DFS Used: 395776000 (377.44 MB)
Non DFS Used: 0 (0 B)
DFS Remaining: 8246584012800 (7.50 TB)
DFS Used%: 0.00%
DFS Remaining%: 100.00%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 0
Last contact: Mon Aug 17 04:02:57 CST 2020
Last Block Report: Mon Aug 17 01:43:05 CST 2020

[root@hadoop101.yinzhengjie.com ~]#
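Because -report emits plain text, it is easy to post-process with standard tools. A minimal sketch, using an abridged copy of the report above saved to a file (on a real cluster you would capture it with `hdfs dfsadmin -report > report.txt`):

```shell
# Abridged sample of the dfsadmin -report output shown above.
cat > report.txt <<'EOF'
Configured Capacity: 16493959577600 (15.00 TB)
DFS Used: 791670784 (755.00 MB)
Under replicated blocks: 16
Live datanodes (2):
Dead datanodes (1):
EOF

# Pull the live/dead DataNode counts out of the report with sed.
live=$(sed -n 's/^Live datanodes (\([0-9]*\)):.*/\1/p' report.txt)
dead=$(sed -n 's/^Dead datanodes (\([0-9]*\)):.*/\1/p' report.txt)
echo "live=$live dead=$dead"
```

A check like this is handy in monitoring scripts: alert whenever `dead` is nonzero.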
3. The dfsadmin -refreshNodes command
[root@hadoop101.yinzhengjie.com ~]# hdfs dfsadmin -refreshNodes    # Re-reads the list of DataNodes allowed to connect to the NameNode.
Refresh nodes successful
[root@hadoop101.yinzhengjie.com ~]#

Tips:
dfs.hosts: names a file containing the list of hosts permitted to connect to the NameNode. The full pathname of the file must be specified. If the value is empty, all hosts are permitted.
dfs.hosts.exclude: names a file containing the list of hosts that are not permitted to connect to the NameNode. The full pathname of the file must be specified. If the value is empty, no hosts are excluded.
The NameNode reads host names from the file pointed to by dfs.hosts and from the file named by the "dfs.hosts.exclude" parameter in hdfs-site.xml.
The dfs.hosts file lists every host allowed to register with the NameNode. The dfs.hosts.exclude file lists every DataNode that should be decommissioned; a node is decommissioned only after all of its replicas have been copied to other DataNodes.
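The decommissioning workflow above can be sketched as follows. This is a sketch under assumptions: the exclude-file name and the host being removed are illustrative, and the file must be whatever path your NameNode's dfs.hosts.exclude property actually points to.

```shell
# Hypothetical exclude file; on a real cluster use the path configured in
# the dfs.hosts.exclude property of hdfs-site.xml on the NameNode.
EXCLUDE_FILE=dfs.exclude

# 1. Add the DataNode to be decommissioned to the exclude file.
echo "hadoop104.yinzhengjie.com" >> "$EXCLUDE_FILE"

# 2. Tell the NameNode to re-read dfs.hosts / dfs.hosts.exclude.
#    (Command shown only; it needs a running cluster.)
echo "hdfs dfsadmin -refreshNodes"

# The node should then appear as "Decommission in progress" in -report,
# and "Decommissioned" once its replicas have been copied elsewhere.
grep -c 'hadoop104' "$EXCLUDE_FILE"
```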
4. The dfsadmin -metasave command
The dfsadmin -metasave command provides more detail than dfsadmin -report. Use it to dump various block-related information, for example:
(1) the total number of blocks;
(2) DataNode heartbeat information (e.g. live and dead DataNodes);
(3) blocks waiting to be replicated;
(4) blocks currently being replicated;
(5) blocks waiting to be deleted.
The examples below show how to use it.

[root@hadoop101.yinzhengjie.com ~]# hdfs dfsadmin -help metasave
-metasave <filename>: 	Save Namenode's primary data structures
		to <filename> in the directory specified by hadoop.log.dir property.
		<filename> is overwritten if it exists.
		<filename> will contain one line for each of the following
			1. Datanodes heart beating with Namenode
			2. Blocks waiting to be replicated
			3. Blocks currrently being replicated
			4. Blocks waiting to be deleted
[root@hadoop101.yinzhengjie.com ~]#

[root@hadoop101.yinzhengjie.com ~]# ll /yinzhengjie/softwares/hadoop/logs/
total 2316
-rw-r--r-- 1 root root 2271719 Aug 17 05:44 hadoop-root-namenode-hadoop101.yinzhengjie.com.log
-rw-r--r-- 1 root root     733 Aug 17 01:42 hadoop-root-namenode-hadoop101.yinzhengjie.com.out
-rw-r--r-- 1 root root     733 Aug 16 12:09 hadoop-root-namenode-hadoop101.yinzhengjie.com.out.1
-rw-r--r-- 1 root root     733 Aug 16 11:44 hadoop-root-namenode-hadoop101.yinzhengjie.com.out.2
-rw-r--r-- 1 root root     733 Aug 14 19:01 hadoop-root-namenode-hadoop101.yinzhengjie.com.out.3
-rw-r--r-- 1 root root     733 Aug 14 02:54 hadoop-root-namenode-hadoop101.yinzhengjie.com.out.4
-rw-r--r-- 1 root root     733 Aug 13 18:40 hadoop-root-namenode-hadoop101.yinzhengjie.com.out.5
-rw-r--r-- 1 root root   64372 Aug 12 15:50 hadoop-root-secondarynamenode-hadoop101.yinzhengjie.com.log
-rw-r--r-- 1 root root     733 Aug 12 15:49 hadoop-root-secondarynamenode-hadoop101.yinzhengjie.com.out
-rw-r--r-- 1 root root     733 Aug 12 14:57 hadoop-root-secondarynamenode-hadoop101.yinzhengjie.com.out.1
-rw-r--r-- 1 root root       0 Aug 12 14:57 SecurityAuth-root.audit
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfsadmin -metasave meta.log    # Dumps block-related information; by default the named file is written to the logs directory of the Hadoop installation.
Created metasave file meta.log in the log directory of namenode hdfs://hadoop101.yinzhengjie.com:9000
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# ll /yinzhengjie/softwares/hadoop/logs/
total 2320
-rw-r--r-- 1 root root 2271719 Aug 17 05:44 hadoop-root-namenode-hadoop101.yinzhengjie.com.log
-rw-r--r-- 1 root root     733 Aug 17 01:42 hadoop-root-namenode-hadoop101.yinzhengjie.com.out
-rw-r--r-- 1 root root     733 Aug 16 12:09 hadoop-root-namenode-hadoop101.yinzhengjie.com.out.1
-rw-r--r-- 1 root root     733 Aug 16 11:44 hadoop-root-namenode-hadoop101.yinzhengjie.com.out.2
-rw-r--r-- 1 root root     733 Aug 14 19:01 hadoop-root-namenode-hadoop101.yinzhengjie.com.out.3
-rw-r--r-- 1 root root     733 Aug 14 02:54 hadoop-root-namenode-hadoop101.yinzhengjie.com.out.4
-rw-r--r-- 1 root root     733 Aug 13 18:40 hadoop-root-namenode-hadoop101.yinzhengjie.com.out.5
-rw-r--r-- 1 root root   64372 Aug 12 15:50 hadoop-root-secondarynamenode-hadoop101.yinzhengjie.com.log
-rw-r--r-- 1 root root     733 Aug 12 15:49 hadoop-root-secondarynamenode-hadoop101.yinzhengjie.com.out
-rw-r--r-- 1 root root     733 Aug 12 14:57 hadoop-root-secondarynamenode-hadoop101.yinzhengjie.com.out.1
-rw-r--r-- 1 root root    3500 Aug 17 05:57 meta.log    # The filename we specified above; it can be opened with any text editor.
-rw-r--r-- 1 root root       0 Aug 12 14:57 SecurityAuth-root.audit
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]#

[root@hadoop101.yinzhengjie.com ~]# cat /yinzhengjie/softwares/hadoop/logs/meta.log    # View the metadata we just saved.
49 files and directories, 27 blocks = 76 total
Live Datanodes: 2
Dead Datanodes: 1
Metasave: Blocks waiting for reconstruction: 16
/user/root/.Trash/200815080000/yinzhengjie2020/yum.repos.d/CentOS-Debuginfo.repo: blk_1073741855_1031 (replicas: l: 2 d: 0 c: 0 e: 0)  172.200.6.103:50010 :  172.200.6.102:50010 :
/user/root/.Trash/200815080000/hadoop-2.10.0.tar.gz: blk_1073741850_1026 (replicas: l: 2 d: 0 c: 0 e: 0)  172.200.6.103:50010 :  172.200.6.102:50010 :
/user/root/.Trash/200814193733/fstab: blk_1073741835_1011 (replicas: l: 2 d: 0 c: 0 e: 0)  172.200.6.103:50010 :  172.200.6.102:50010 :
/user/root/.Trash/200815080000/yinzhengjie2020/yum.repos.d/epel-testing.repo: blk_1073741860_1036 (replicas: l: 2 d: 0 c: 0 e: 0)  172.200.6.103:50010 :  172.200.6.102:50010 :
/user/root/.Trash/200815080000/hostname: blk_1073741851_1027 (replicas: l: 2 d: 0 c: 0 e: 0)  172.200.6.103:50010 :  172.200.6.102:50010 :
/user/root/.Trash/200814193733/sysctl.conf: blk_1073741836_1012 (replicas: l: 2 d: 0 c: 0 e: 0)  172.200.6.103:50010 :  172.200.6.102:50010 :
/user/root/.Trash/200815080000/yinzhengjie2020/yum.repos.d/epel.repo: blk_1073741861_1037 (replicas: l: 2 d: 0 c: 0 e: 0)  172.200.6.103:50010 :  172.200.6.102:50010 :
/user/root/.Trash/200815080000/yinzhengjie2020/yum.repos.d/CentOS-Media.repo: blk_1073741856_1032 (replicas: l: 2 d: 0 c: 0 e: 0)  172.200.6.103:50010 :  172.200.6.102:50010 :
/user/root/.Trash/200815080000/yinzhengjie2020/yum.repos.d/CentOS-CR.repo: blk_1073741854_1030 (replicas: l: 2 d: 0 c: 0 e: 0)  172.200.6.103:50010 :  172.200.6.102:50010 :
/user/root/.Trash/200815080000/yinzhengjie2020/yum.repos.d/CentOS-fasttrack.repo: blk_1073741859_1035 (replicas: l: 2 d: 0 c: 0 e: 0)  172.200.6.103:50010 :  172.200.6.102:50010 :
/user/root/.Trash/200815080000/hosts2020: blk_1073741862_1038 (replicas: l: 2 d: 0 c: 0 e: 0)  172.200.6.102:50010 :  172.200.6.103:50010 :
/user/root/.Trash/200815080000/wc.txt.gz: blk_1073741848_1024 (replicas: l: 2 d: 0 c: 0 e: 0)  172.200.6.103:50010 :  172.200.6.102:50010 :
/user/root/.Trash/200815080000/yinzhengjie2020/wc.txt.gz: blk_1073741852_1028 (replicas: l: 2 d: 0 c: 0 e: 0)  172.200.6.102:50010 :  172.200.6.103:50010 :
/user/root/.Trash/200815080000/yinzhengjie2020/yum.repos.d/CentOS-Sources.repo: blk_1073741857_1033 (replicas: l: 2 d: 0 c: 0 e: 0)  172.200.6.103:50010 :  172.200.6.102:50010 :
/user/root/.Trash/200815080000/yinzhengjie2020/yum.repos.d/CentOS-Base.repo: blk_1073741853_1029 (replicas: l: 2 d: 0 c: 0 e: 0)  172.200.6.102:50010 :  172.200.6.103:50010 :
/user/root/.Trash/200815080000/yinzhengjie2020/yum.repos.d/CentOS-Vault.repo: blk_1073741858_1034 (replicas: l: 2 d: 0 c: 0 e: 0)  172.200.6.102:50010 :  172.200.6.103:50010 :
Metasave: Blocks currently missing: 0
Mis-replicated blocks that have been postponed:
Metasave: Blocks being replicated: 0
Metasave: Blocks 0 waiting deletion from 0 datanodes.
Corrupt Blocks:
Metasave: Number of datanodes: 3
172.200.6.104:50010 /rack002 IN 8246979788800(7.50 TB) 395776000(377.44 MB) 0.00% 8246584012800(7.50 TB) 0(0 B) 0(0 B) 100.00% 0(0 B) Mon Aug 17 04:02:57 CST 2020
172.200.6.102:50010 /rack001 IN 8246979788800(7.50 TB) 395841536(377.50 MB) 0.00% 8246583947264(7.50 TB) 32000000(30.52 MB) 319488(312 KB) 1.00% 31680512(30.21 MB) Mon Aug 17 05:57:29 CST 2020
172.200.6.103:50010 /rack002 IN 8246979788800(7.50 TB) 395829248(377.49 MB) 0.00% 8246583959552(7.50 TB) 32000000(30.52 MB) 0(0 B) 0.00% 32000000(30.52 MB) Mon Aug 17 05:57:29 CST 2020
[root@hadoop101.yinzhengjie.com ~]#
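Since metasave writes plain text, its summary counters are easy to extract with sed. A minimal sketch, using an abridged copy of the meta.log shown above (on a real cluster you would read the file that -metasave wrote under Hadoop's log directory):

```shell
# Abridged sample of a metasave dump.
cat > meta.log <<'EOF'
49 files and directories, 27 blocks = 76 total
Live Datanodes: 2
Dead Datanodes: 1
Metasave: Blocks waiting for reconstruction: 16
Metasave: Blocks currently missing: 0
Metasave: Blocks being replicated: 0
EOF

# Extract the summary counters.
waiting=$(sed -n 's/^Metasave: Blocks waiting for reconstruction: //p' meta.log)
missing=$(sed -n 's/^Metasave: Blocks currently missing: //p' meta.log)
echo "waiting=$waiting missing=$missing"
```

A nonzero "missing" count is worth alerting on; "waiting for reconstruction" normally drains on its own as the NameNode schedules re-replication.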
5. Managing HDFS space quotas
Recommended reading: https://www.cnblogs.com/yinzhengjie2020/p/13334148.html
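One point worth keeping in mind when using the -setSpaceQuota subcommand listed in the usage output above: HDFS space quotas are charged against raw, replicated bytes, so a directory meant to hold a given amount of logical data needs a quota of that size times the replication factor. A minimal sketch of the arithmetic (the directory /user/demo is hypothetical):

```shell
# Space quotas count replicated bytes: quota = logical size * replication factor.
logical_gb=10
replication=3
quota_gb=$((logical_gb * replication))

# Command shown only; it needs a running cluster.
echo "hdfs dfsadmin -setSpaceQuota ${quota_gb}g /user/demo"
```

So 10 GB of logical data under the default replication factor of 3 needs a 30 GB space quota, or writes will start failing with quota-exceeded errors well before the directory "looks" full.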