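Accessing HDFS with Hadoop WebHDFS
Author: Yin Zhengjie (尹正傑)
Copyright notice: this is an original work. Reproduction is not permitted and will be pursued legally.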
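WebHDFS and HttpFS are both HTTP/HTTPS REST interfaces to Hadoop. They let us read and write HDFS data and run several HDFS-related administrative commands. You can call them from programs and scripts, or from command-line tools such as curl or wget.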
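Plain WebHDFS REST calls are not aware of a high-availability (HA) NameNode pair: each request goes to one specific NameNode, so the caller has to address the active one itself. HttpFS handles HA transparently.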
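I. WebHDFS overview
Applications running inside a Hadoop cluster access HDFS data through Hadoop's native client. You may also need to reach HDFS from outside the cluster in order to process, store and retrieve HDFS data. An application that uses the native HDFS protocol needs Hadoop installed on the server it runs on, together with the Java dependencies the application requires.

Hadoop's WebHDFS provides a set of HTTP REST APIs that let applications access and use HDFS remotely (REST is an architectural style for building large-scale web services). Besides making access from outside easier, WebHDFS is also useful when you work with two Hadoop clusters that run different Hadoop versions. Because it goes through a REST API, WebHDFS does not depend on the MapReduce or HDFS version, so it can be used against both clusters. For example, you can use it when copying data between two clusters with the DistCp utility.

When you access HDFS data remotely through WebHDFS, Hadoop does not need to be installed on the client. Well-known tools such as curl and wget are enough. WebHDFS connects directly to the Hadoop cluster and supports all HDFS file system operations. Because reads and writes are redirected to a DataNode, the client must be able to reach the DataNodes as well as the NameNode.

WebHDFS uses the basic HTTP methods GET, PUT, POST and DELETE to operate on the HDFS file system remotely.

Recommended reading:
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/WebHDFS.html

Tip:
If your HDFS cluster has Kerberos authentication enabled, you should also look at the following parameters (set in hdfs-site.xml):
dfs.web.authentication.kerberos.principal
dfs.web.authentication.kerberos.keytab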
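As a rough illustration of how the HTTP methods map onto operations, the sketch below builds a WebHDFS REST URL and looks up the HTTP method for a handful of operations. The names webhdfs_url and OP_METHODS are made up for this example, and the table is deliberately incomplete. The URL layout http://<host>:<port>/webhdfs/v1/<path>?op=... follows the documentation linked above; the host and port are the ones used in the examples of this post.

```python
from urllib.parse import quote, urlencode

# HTTP method used by a few common WebHDFS operations (not the full list).
OP_METHODS = {
    "OPEN": "GET",
    "LISTSTATUS": "GET",
    "GETFILESTATUS": "GET",
    "GETCONTENTSUMMARY": "GET",
    "MKDIRS": "PUT",
    "CREATE": "PUT",
    "APPEND": "POST",
    "DELETE": "DELETE",
}


def webhdfs_url(host, port, path, op, params=None):
    """Return (http_method, url) for one operation on an absolute HDFS path.

    `params` is an optional dict of extra query parameters such as
    {"user.name": "root"}. This helper is hypothetical, not a Hadoop API.
    """
    if not path.startswith("/"):
        raise ValueError("HDFS path must be absolute")
    op = op.upper()
    method = OP_METHODS[op]  # KeyError for operations missing from the table
    query = urlencode({"op": op, **(params or {})})
    return method, f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


if __name__ == "__main__":
    print(webhdfs_url("hadoop101.yinzhengjie.com", 50070,
                      "/yinzhengjie", "LISTSTATUS"))
```

Keeping the method next to the operation name avoids the easy mistake of sending, say, MKDIRS as a GET.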
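II. Accessing HDFS through WebHDFS with the hdfs command-line tool: worked examples
Using WebHDFS is simple: replace the hdfs:// file system URI with a webhdfs:// URI that points at the NameNode's HTTP port (50070 in this cluster). Let's look at a few examples.
1>.List all files and directories under the HDFS directory "/yinzhengjie"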

[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /        #No file system is named on the command line, so the default one defined by the "fs.defaultFS" property in "core-site.xml" is used.
Found 4 items
drwxr-xr-x - root admingroup 0 2020-08-21 16:40 /bigdata
drwxr-xr-x - root admingroup 0 2020-08-20 19:26 /system
drwx------ - root admingroup 0 2020-08-14 19:19 /user
drwxr-xr-x - root admingroup 0 2020-08-21 18:42 /yinzhengjie
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls hdfs://hadoop101.yinzhengjie.com:9000/yinzhengjie
Found 3 items
-rw-r--r-- 3 root admingroup 371 2020-08-21 16:45 hdfs://hadoop101.yinzhengjie.com:9000/yinzhengjie/hosts
-rw-r--r-- 3 root admingroup 69 2020-08-14 23:22 hdfs://hadoop101.yinzhengjie.com:9000/yinzhengjie/wc.txt.gz
drwxr-xr-x - root admingroup 0 2020-08-14 23:13 hdfs://hadoop101.yinzhengjie.com:9000/yinzhengjie/yum.repos.d
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 3 items
-rw-r--r-- 3 root admingroup 371 2020-08-21 16:45 /yinzhengjie/hosts
-rw-r--r-- 3 root admingroup 69 2020-08-14 23:22 /yinzhengjie/wc.txt.gz
drwxr-xr-x - root admingroup 0 2020-08-14 23:13 /yinzhengjie/yum.repos.d
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls webhdfs://hadoop101.yinzhengjie.com:50070/yinzhengjie        #Access HDFS over the webhdfs scheme
Found 3 items
-rw-r--r-- 3 root admingroup 371 2020-08-21 16:45 webhdfs://hadoop101.yinzhengjie.com:50070/yinzhengjie/hosts
-rw-r--r-- 3 root admingroup 69 2020-08-14 23:22 webhdfs://hadoop101.yinzhengjie.com:50070/yinzhengjie/wc.txt.gz
drwxr-xr-x - root admingroup 0 2020-08-14 23:13 webhdfs://hadoop101.yinzhengjie.com:50070/yinzhengjie/yum.repos.d
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]#
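The only difference between the hdfs:// and webhdfs:// listings above is the scheme and the port. A minimal sketch of that rewrite follows; to_webhdfs_uri is a hypothetical helper, and the default port 50070 is simply the NameNode HTTP port seen in this cluster (Hadoop 3.x defaults to 9870, so treat it as an assumption to adjust).

```python
from urllib.parse import urlsplit


def to_webhdfs_uri(hdfs_uri, http_port=50070):
    """Rewrite hdfs://<namenode>:<rpc_port>/<path> as a webhdfs:// URI.

    The RPC port in the input is dropped and replaced by the NameNode's
    HTTP port, which is assumed to be 50070 as in this post.
    """
    parts = urlsplit(hdfs_uri)
    if parts.scheme != "hdfs" or not parts.hostname:
        raise ValueError(f"not a full hdfs:// URI: {hdfs_uri!r}")
    return f"webhdfs://{parts.hostname}:{http_port}{parts.path}"


if __name__ == "__main__":
    print(to_webhdfs_uri("hdfs://hadoop101.yinzhengjie.com:9000/yinzhengjie"))
```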
2>.Upload a local file to the HDFS cluster

[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 3 items
-rw-r--r-- 3 root admingroup 371 2020-08-21 16:45 /yinzhengjie/hosts
-rw-r--r-- 3 root admingroup 69 2020-08-14 23:22 /yinzhengjie/wc.txt.gz
drwxr-xr-x - root admingroup 0 2020-08-14 23:13 /yinzhengjie/yum.repos.d
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -put /etc/fstab webhdfs://hadoop101.yinzhengjie.com:50070/yinzhengjie/fstab        #Upload the local file "/etc/fstab" to the HDFS directory "/yinzhengjie/"
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 4 items
-rw-r--r-- 3 root admingroup 490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r-- 3 root admingroup 371 2020-08-21 16:45 /yinzhengjie/hosts
-rw-r--r-- 3 root admingroup 69 2020-08-14 23:22 /yinzhengjie/wc.txt.gz
drwxr-xr-x - root admingroup 0 2020-08-14 23:13 /yinzhengjie/yum.repos.d
[root@hadoop105.yinzhengjie.com ~]#
3>.Download a file or directory from HDFS

[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 4 items
-rw-r--r-- 3 root admingroup 490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r-- 3 root admingroup 371 2020-08-21 16:45 /yinzhengjie/hosts
-rw-r--r-- 3 root admingroup 69 2020-08-14 23:22 /yinzhengjie/wc.txt.gz
drwxr-xr-x - root admingroup 0 2020-08-14 23:13 /yinzhengjie/yum.repos.d
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# ll
total 0
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -get webhdfs://hadoop101.yinzhengjie.com:50070/yinzhengjie/yum.repos.d        #Download a directory
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# ll
total 0
drwxr-xr-x 2 root root 229 Aug 31 14:32 yum.repos.d
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# ll yum.repos.d/
total 40
-rw-r--r-- 1 root root 1664 Aug 31 14:32 CentOS-Base.repo
-rw-r--r-- 1 root root 1309 Aug 31 14:32 CentOS-CR.repo
-rw-r--r-- 1 root root 649 Aug 31 14:32 CentOS-Debuginfo.repo
-rw-r--r-- 1 root root 314 Aug 31 14:32 CentOS-fasttrack.repo
-rw-r--r-- 1 root root 630 Aug 31 14:32 CentOS-Media.repo
-rw-r--r-- 1 root root 1331 Aug 31 14:32 CentOS-Sources.repo
-rw-r--r-- 1 root root 5701 Aug 31 14:32 CentOS-Vault.repo
-rw-r--r-- 1 root root 951 Aug 31 14:32 epel.repo
-rw-r--r-- 1 root root 1050 Aug 31 14:32 epel-testing.repo
[root@hadoop105.yinzhengjie.com ~]#

[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 4 items
-rw-r--r-- 3 root admingroup 490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r-- 3 root admingroup 371 2020-08-21 16:45 /yinzhengjie/hosts
-rw-r--r-- 3 root admingroup 69 2020-08-14 23:22 /yinzhengjie/wc.txt.gz
drwxr-xr-x - root admingroup 0 2020-08-14 23:13 /yinzhengjie/yum.repos.d
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# ll
total 0
drwxr-xr-x 2 root root 229 Aug 31 14:32 yum.repos.d
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -get webhdfs://hadoop101.yinzhengjie.com:50070/yinzhengjie/wc.txt.gz        #Download a file
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# ll
total 4
-rw-r--r-- 1 root root 69 Aug 31 14:33 wc.txt.gz
drwxr-xr-x 2 root root 229 Aug 31 14:32 yum.repos.d
[root@hadoop105.yinzhengjie.com ~]#
4>.Delete a file or directory in HDFS

[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 4 items
-rw-r--r-- 3 root admingroup 490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r-- 3 root admingroup 371 2020-08-21 16:45 /yinzhengjie/hosts
-rw-r--r-- 3 root admingroup 69 2020-08-14 23:22 /yinzhengjie/wc.txt.gz
drwxr-xr-x - root admingroup 0 2020-08-14 23:13 /yinzhengjie/yum.repos.d
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -rm -r webhdfs://hadoop101.yinzhengjie.com:50070/yinzhengjie/yum.repos.d        #Delete a directory
20/08/31 14:38:12 INFO fs.TrashPolicyDefault: Moved: 'webhdfs://hadoop101.yinzhengjie.com:50070/yinzhengjie/yum.repos.d' to trash at: webhdfs://hadoop101.yinzhengjie.com:50070/user/root/.Trash/Current/yinzhengjie/yum.repos.d
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 3 items
-rw-r--r-- 3 root admingroup 490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r-- 3 root admingroup 371 2020-08-21 16:45 /yinzhengjie/hosts
-rw-r--r-- 3 root admingroup 69 2020-08-14 23:22 /yinzhengjie/wc.txt.gz
[root@hadoop105.yinzhengjie.com ~]#

[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 3 items
-rw-r--r-- 3 root admingroup 490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r-- 3 root admingroup 371 2020-08-21 16:45 /yinzhengjie/hosts
-rw-r--r-- 3 root admingroup 69 2020-08-14 23:22 /yinzhengjie/wc.txt.gz
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -rm webhdfs://hadoop101.yinzhengjie.com:50070/yinzhengjie/wc.txt.gz        #Delete a file
20/08/31 14:38:28 INFO fs.TrashPolicyDefault: Moved: 'webhdfs://hadoop101.yinzhengjie.com:50070/yinzhengjie/wc.txt.gz' to trash at: webhdfs://hadoop101.yinzhengjie.com:50070/user/root/.Trash/Current/yinzhengjie/wc.txt.gz
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 2 items
-rw-r--r-- 3 root admingroup 490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r-- 3 root admingroup 371 2020-08-21 16:45 /yinzhengjie/hosts
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]#
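5>.Other operations
With the four examples above as a foundation, exploring the remaining operations on your own should be straightforward. Usage is essentially the same as for the hdfs dfs tool I covered earlier, except that the hdfs scheme is replaced by webhdfs.

Recommended reading:
https://www.cnblogs.com/yinzhengjie2020/p/13296680.html
III. Accessing HDFS through the WebHDFS REST API with curl: worked examples
WebHDFS is a fairly comprehensive tool, with many operations for accessing and using HDFS data. Next we look at how to call the WebHDFS REST API with curl.

I will not go into curl itself here; the notes linked below cover its basic usage. Commonly used curl options:
-A/--user-agent <string>: set the User-Agent sent to the server
-e/--referer <URL>: referrer URL
--cacert <file>: CA certificate (SSL)
-k/--insecure: allow SSL connections without verifying the certificate
--compressed: request a compressed response
-H/--header <line>: pass a custom header to the server
-i: show the response headers together with the body
-I/--head: show only the response headers
-D/--dump-header <file>: write the response headers to the given file
--basic: use HTTP Basic authentication
-u/--user <user[:password]>: user name and password for the server
-L: on a 3xx response, repeat the request at the new location
-O: save to a local file named after the file in the URL
-o <file>: save the response to the given file
--limit-rate <rate>: limit the transfer rate
-0/--http1.0: (the digit zero) use HTTP 1.0
-v/--verbose: more detailed output
-C/--continue-at <offset>: resume an interrupted transfer
-c/--cookie-jar <file name>: write the cookies received to the given file
-x/--proxy <proxyhost[:port]>: proxy server address
-X/--request <command>: request method to send to the server
-U/--proxy-user <user:password>: proxy user name and password
-T/--upload-file <file>: upload the given local file (to an FTP server, or with HTTP PUT)
-d/--data <data>: send data in a POST request
-b/--cookie <name=data>: send cookies (for example values received through Set-Cookie) back to the server

Recommended reading:
https://www.cnblogs.com/yinzhengjie/p/7719804.html
1>.Read a file in HDFS (this example reads "/yinzhengjie/hosts")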

[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 2 items
-rw-r--r-- 3 root admingroup 490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r-- 3 root admingroup 371 2020-08-21 16:45 /yinzhengjie/hosts
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# curl -i -L "http://hadoop101.yinzhengjie.com:50070/webhdfs/v1/yinzhengjie/hosts?op=OPEN&user.name=yinzhengjie"        #op names the operation; user.name names the user making the request
HTTP/1.1 307 TEMPORARY_REDIRECT
Cache-Control: no-cache
Expires: Mon, 31 Aug 2020 07:39:16 GMT
Date: Mon, 31 Aug 2020 07:39:16 GMT
Pragma: no-cache
Expires: Mon, 31 Aug 2020 07:39:16 GMT
Date: Mon, 31 Aug 2020 07:39:16 GMT
Pragma: no-cache
Content-Type: application/octet-stream
X-FRAME-OPTIONS: SAMEORIGIN
Set-Cookie: hadoop.auth="u=yinzhengjie&p=yinzhengjie&t=simple&e=1598895556829&s=ak8QrD/3I7HowelGDzH9uvnDeAGBihJhCbCm0wVqS2M="; Path=/; HttpOnly
Location: http://hadoop104.yinzhengjie.com:50075/webhdfs/v1/yinzhengjie/hosts?op=OPEN&user.name=yinzhengjie&namenoderpcaddress=hadoop101.yinzhengjie.com:9000&offset=0
Content-Length: 0

HTTP/1.1 200 OK
Access-Control-Allow-Methods: GET
Access-Control-Allow-Origin: *
Content-Type: application/octet-stream
Connection: close
Content-Length: 371

127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
#Hadoop 2.x
172.200.6.101 hadoop101.yinzhengjie.com
172.200.6.102 hadoop102.yinzhengjie.com
172.200.6.103 hadoop103.yinzhengjie.com
172.200.6.104 hadoop104.yinzhengjie.com
172.200.6.105 hadoop105.yinzhengjie.com
[root@hadoop105.yinzhengjie.com ~]#
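The 307 response above is the NameNode handing the read over to a DataNode: the Location header names the DataNode, and curl -L then fetches the bytes from it. The sketch below only takes such a Location value apart to make that visible. describe_redirect is a hypothetical helper, and the query parameter names it reads (namenoderpcaddress, offset) are the ones visible in the transcript.

```python
from urllib.parse import parse_qs, urlsplit

PREFIX = "/webhdfs/v1"


def describe_redirect(location):
    """Break a WebHDFS OPEN redirect URL into its interesting parts."""
    parts = urlsplit(location)
    if not parts.path.startswith(PREFIX):
        raise ValueError("not a WebHDFS URL")
    query = parse_qs(parts.query)
    return {
        # Host and HTTP port of the DataNode that will serve the bytes.
        "datanode": f"{parts.hostname}:{parts.port}",
        # The HDFS path is whatever follows the /webhdfs/v1 prefix.
        "hdfs_path": parts.path[len(PREFIX):],
        "namenode_rpc": query["namenoderpcaddress"][0],
        "offset": int(query["offset"][0]),
    }


if __name__ == "__main__":
    location = ("http://hadoop104.yinzhengjie.com:50075/webhdfs/v1/yinzhengjie/hosts"
                "?op=OPEN&user.name=yinzhengjie"
                "&namenoderpcaddress=hadoop101.yinzhengjie.com:9000&offset=0")
    print(describe_redirect(location))
```

This is also why a client outside the cluster needs to resolve and reach the DataNode host names, not just the NameNode.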
2>.Check the status of an HDFS directory

[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /
Found 4 items
drwxr-xr-x - root admingroup 0 2020-08-21 16:40 /bigdata
drwxr-xr-x - root admingroup 0 2020-08-20 19:26 /system
drwx------ - root admingroup 0 2020-08-14 19:19 /user
drwxr-xr-x - root admingroup 0 2020-08-31 14:38 /yinzhengjie
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 2 items
-rw-r--r-- 3 root admingroup 490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r-- 3 root admingroup 371 2020-08-21 16:45 /yinzhengjie/hosts
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# curl -i -L "http://hadoop101.yinzhengjie.com:50070/webhdfs/v1/yinzhengjie?op=LISTSTATUS"        #Show the status of the "/yinzhengjie" directory
HTTP/1.1 200 OK
Cache-Control: no-cache
Expires: Mon, 31 Aug 2020 07:51:31 GMT
Date: Mon, 31 Aug 2020 07:51:31 GMT
Pragma: no-cache
Expires: Mon, 31 Aug 2020 07:51:31 GMT
Date: Mon, 31 Aug 2020 07:51:31 GMT
Pragma: no-cache
Content-Type: application/json
X-FRAME-OPTIONS: SAMEORIGIN
Transfer-Encoding: chunked

{"FileStatuses":{"FileStatus":[
{"accessTime":1598855175268,"blockSize":536870912,"childrenNum":0,"fileId":16489,"group":"admingroup","length":490,"modificationTime":1598855175823,"owner":"root","pathSuffix":"fstab","permission":"644","replication":3,"storagePolicy":0,"type":"FILE"},
{"accessTime":1598859477240,"blockSize":536870912,"childrenNum":0,"fileId":16484,"group":"admingroup","length":371,"modificationTime":1597999554986,"owner":"root","pathSuffix":"hosts","permission":"644","replication":3,"storagePolicy":0,"type":"FILE"}
]}}
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]#
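The LISTSTATUS body is ordinary JSON, so a script can turn it into a listing with the standard library alone. A minimal sketch follows; list_entries is a hypothetical helper, and the sample body is the response above trimmed to the three fields it uses.

```python
import json


def list_entries(body, parent):
    """Return (type, length, full_path) tuples from a LISTSTATUS response.

    `parent` is the directory that was listed; each entry only carries its
    own name in "pathSuffix", so the full path is rebuilt here.
    """
    statuses = json.loads(body)["FileStatuses"]["FileStatus"]
    base = parent.rstrip("/")
    return [(s["type"], s["length"], f"{base}/{s['pathSuffix']}") for s in statuses]


if __name__ == "__main__":
    sample = (
        '{"FileStatuses":{"FileStatus":['
        '{"length":490,"pathSuffix":"fstab","type":"FILE"},'
        '{"length":371,"pathSuffix":"hosts","type":"FILE"}'
        ']}}'
    )
    for entry in list_entries(sample, "/yinzhengjie"):
        print(entry)
```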
3>.Check the status of an HDFS file

[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 2 items
-rw-r--r-- 3 root admingroup 490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r-- 3 root admingroup 371 2020-08-21 16:45 /yinzhengjie/hosts
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# curl -i -L "http://hadoop101.yinzhengjie.com:50070/webhdfs/v1/yinzhengjie/hosts?op=GETFILESTATUS" ;echo        #Show the status of the file "/yinzhengjie/hosts"
HTTP/1.1 200 OK
Cache-Control: no-cache
Expires: Mon, 31 Aug 2020 07:58:53 GMT
Date: Mon, 31 Aug 2020 07:58:53 GMT
Pragma: no-cache
Expires: Mon, 31 Aug 2020 07:58:53 GMT
Date: Mon, 31 Aug 2020 07:58:53 GMT
Pragma: no-cache
Content-Type: application/json
X-FRAME-OPTIONS: SAMEORIGIN
Transfer-Encoding: chunked

{"FileStatus":{"accessTime":1598859477240,"blockSize":536870912,"childrenNum":0,"fileId":16484,"group":"admingroup","length":371,"modificationTime":1597999554986,"owner":"root","pathSuffix":"","permission":"644","replication":3,"storagePolicy":0,"type":"FILE"}}
[root@hadoop105.yinzhengjie.com ~]#
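Two fields in the FileStatus JSON are less readable than their hdfs dfs -ls counterparts: "permission" is an octal string and the timestamps are milliseconds since the Unix epoch. The sketch below converts both. The helper names are made up for this example, and the conversion is to UTC, whereas the -ls output above appears to be in the cluster's local time (UTC+8).

```python
import stat
from datetime import datetime, timedelta, timezone


def permission_string(octal, file_type):
    """Render e.g. ("644", "FILE") as "-rw-r--r--", like `hdfs dfs -ls`."""
    mode = int(octal, 8)
    # Add the file-type bits so stat.filemode() prints the leading d or -.
    mode |= stat.S_IFDIR if file_type == "DIRECTORY" else stat.S_IFREG
    return stat.filemode(mode)


def from_millis(ms):
    """Convert epoch milliseconds (accessTime, modificationTime) to UTC."""
    seconds, millis = divmod(ms, 1000)
    # Integer arithmetic avoids float rounding on the millisecond part.
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(milliseconds=millis)


if __name__ == "__main__":
    print(permission_string("644", "FILE"), from_millis(1597999554986).isoformat())
```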
4>.Create a directory

[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /
Found 4 items
drwxr-xr-x - root admingroup 0 2020-08-21 16:40 /bigdata
drwxr-xr-x - root admingroup 0 2020-08-20 19:26 /system
drwx------ - root admingroup 0 2020-08-14 19:19 /user
drwxr-xr-x - root admingroup 0 2020-08-31 16:17 /yinzhengjie
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 2 items
-rw-r--r-- 3 root admingroup 490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r-- 3 root admingroup 371 2020-08-21 16:45 /yinzhengjie/hosts
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# curl -i -X PUT "http://hadoop101.yinzhengjie.com:50070/webhdfs/v1/yinzhengjie/webHDFS?user.name=root&op=MKDIRS&permissions=751" ;echo        #Create the directory "/yinzhengjie/webHDFS"
HTTP/1.1 200 OK
Cache-Control: no-cache
Expires: Mon, 31 Aug 2020 08:14:10 GMT
Date: Mon, 31 Aug 2020 08:14:10 GMT
Pragma: no-cache
Expires: Mon, 31 Aug 2020 08:14:10 GMT
Date: Mon, 31 Aug 2020 08:14:10 GMT
Pragma: no-cache
Content-Type: application/json
X-FRAME-OPTIONS: SAMEORIGIN
Set-Cookie: hadoop.auth="u=root&p=root&t=simple&e=1598897650918&s=rp1JdtIpaV59fm8TFisjCUMH3ARerDWzI4oL+jCezrs="; Path=/; HttpOnly
Transfer-Encoding: chunked

{"boolean":true}
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 3 items
-rw-r--r-- 3 root admingroup 490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r-- 3 root admingroup 371 2020-08-21 16:45 /yinzhengjie/hosts
drwxr-xr-x - root admingroup 0 2020-08-31 16:14 /yinzhengjie/webHDFS
[root@hadoop105.yinzhengjie.com ~]#

Note: the WebHDFS documentation names this parameter "permission" (singular). The listing shows the new directory as drwxr-xr-x (755) rather than 751, which suggests that the misspelled "permissions=751" was ignored and the default was applied.
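A misspelled query parameter is easy to miss because the request still succeeds. One way to guard against it in a script is to build the MKDIRS URL in one place and validate the mode there. In this sketch mkdirs_url is a hypothetical helper; the parameter name permission and its octal range 0 to 1777 are taken from the WebHDFS documentation.

```python
from urllib.parse import quote, urlencode


def mkdirs_url(namenode, path, user, permission=None):
    """Build the PUT URL for op=MKDIRS.

    `namenode` is "host:port" of the NameNode HTTP endpoint.
    `permission` is an octal string such as "751"; None leaves the
    server default in place.
    """
    params = {"op": "MKDIRS", "user.name": user}
    if permission is not None:
        value = int(permission, 8)  # raises ValueError for non-octal input
        if not 0 <= value <= 0o1777:
            raise ValueError(f"permission out of range: {permission}")
        params["permission"] = permission
    return f"http://{namenode}/webhdfs/v1{quote(path)}?{urlencode(params)}"


if __name__ == "__main__":
    print(mkdirs_url("hadoop101.yinzhengjie.com:50070",
                     "/yinzhengjie/webHDFS", "root", "751"))
```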
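5>.Create a file and write data to it
I am using Hadoop 2.10.0. My attempts to create a file, or to append to an existing one, with the methods in the official WebHDFS documentation all failed. Both documented methods take two HTTP requests, but in repeated tests I was never able to create a file. If you have got this working, I would be glad to hear how.

References:
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/WebHDFS.html#Create_and_Write_to_a_File
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/WebHDFS.html#Append_to_a_File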
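For reference, here is how the documented two-request flow for CREATE might look in Python. This is only a sketch under assumptions: it has not been run against the cluster in this post, it is not a diagnosis of the failure described above, and create_file and split_location are hypothetical names. It relies on the documented behaviour that the NameNode answers the first PUT (sent without file data) with a 307 whose Location header points at a DataNode, and that the second PUT to that location returns 201. http.client is used because it never follows redirects on its own.

```python
import http.client
from urllib.parse import quote, urlencode, urlsplit


def split_location(location):
    """Split a redirect URL into (host, port, path_and_query)."""
    parts = urlsplit(location)
    target = parts.path + ("?" + parts.query if parts.query else "")
    return parts.hostname, parts.port, target


def create_file(namenode_host, namenode_port, hdfs_path, data, user, overwrite=False):
    """Create `hdfs_path` with the bytes in `data` using two PUT requests."""
    query = urlencode({
        "op": "CREATE",
        "user.name": user,
        "overwrite": "true" if overwrite else "false",
    })

    # Step 1: ask the NameNode where to write. No file data is sent here.
    conn = http.client.HTTPConnection(namenode_host, namenode_port, timeout=30)
    try:
        conn.request("PUT", f"/webhdfs/v1{quote(hdfs_path)}?{query}")
        response = conn.getresponse()
        response.read()
        if response.status != 307:
            raise RuntimeError(f"expected 307 from NameNode, got {response.status}")
        location = response.getheader("Location")
    finally:
        conn.close()

    # Step 2: send the bytes to the DataNode named in the Location header.
    # The client must be able to resolve and reach that host.
    host, port, target = split_location(location)
    conn = http.client.HTTPConnection(host, port, timeout=30)
    try:
        conn.request("PUT", target, body=data,
                     headers={"Content-Type": "application/octet-stream"})
        response = conn.getresponse()
        response.read()
        if response.status != 201:
            raise RuntimeError(f"expected 201 from DataNode, got {response.status}")
    finally:
        conn.close()
```

A call would look like create_file("hadoop101.yinzhengjie.com", 50070, "/yinzhengjie/demo.txt", b"hello\n", "root"), where the file name is only an example. If the first step succeeds and the second does not, one thing worth checking is whether the DataNode host in the Location header is resolvable and reachable from the client.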
6>.Delete a directory or file

[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 3 items
-rw-r--r-- 3 root admingroup 490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r-- 3 root admingroup 371 2020-08-31 18:07 /yinzhengjie/hosts
drwxr-xr-x - root admingroup 0 2020-08-31 18:07 /yinzhengjie/webHDFS
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# curl -i -X DELETE "http://hadoop101.yinzhengjie.com:50070/webhdfs/v1/yinzhengjie/webHDFS?op=DELETE&user.name=root";echo        #Delete a directory
HTTP/1.1 200 OK
Cache-Control: no-cache
Expires: Mon, 31 Aug 2020 10:07:56 GMT
Date: Mon, 31 Aug 2020 10:07:56 GMT
Pragma: no-cache
Expires: Mon, 31 Aug 2020 10:07:56 GMT
Date: Mon, 31 Aug 2020 10:07:56 GMT
Pragma: no-cache
Content-Type: application/json
X-FRAME-OPTIONS: SAMEORIGIN
Set-Cookie: hadoop.auth="u=root&p=root&t=simple&e=1598904476157&s=4aHgz6EwyJfdmjlwOtkXs+8Je94BybNxDUYoon7FIWE="; Path=/; HttpOnly
Transfer-Encoding: chunked

{"boolean":true}
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 2 items
-rw-r--r-- 3 root admingroup 490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r-- 3 root admingroup 371 2020-08-31 18:07 /yinzhengjie/hosts
[root@hadoop105.yinzhengjie.com ~]#

[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 2 items
-rw-r--r-- 3 root admingroup 490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r-- 3 root admingroup 371 2020-08-31 18:07 /yinzhengjie/hosts
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# curl -i -X DELETE "http://hadoop101.yinzhengjie.com:50070/webhdfs/v1/yinzhengjie/fstab?op=DELETE&user.name=root";echo        #Delete a file
HTTP/1.1 200 OK
Cache-Control: no-cache
Expires: Mon, 31 Aug 2020 10:08:52 GMT
Date: Mon, 31 Aug 2020 10:08:52 GMT
Pragma: no-cache
Expires: Mon, 31 Aug 2020 10:08:52 GMT
Date: Mon, 31 Aug 2020 10:08:52 GMT
Pragma: no-cache
Content-Type: application/json
X-FRAME-OPTIONS: SAMEORIGIN
Set-Cookie: hadoop.auth="u=root&p=root&t=simple&e=1598904532486&s=MCjvGp705lVZcZx7hc5UCeERNoRDGC5rsW5E/USXi6c="; Path=/; HttpOnly
Transfer-Encoding: chunked

{"boolean":true}
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 1 items
-rw-r--r-- 3 root admingroup 371 2020-08-31 18:07 /yinzhengjie/hosts
[root@hadoop105.yinzhengjie.com ~]#
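Two details of the REST DELETE are worth handling in a script. The documentation lists an optional recursive parameter, which a non-empty directory needs (the directory deleted above was empty), and the outcome comes back as a small JSON body rather than an exit code. The sketch below covers both; delete_url and succeeded are hypothetical helpers. Note also that, unlike the hdfs dfs -rm runs earlier in this post, the REST responses show no "Moved ... to trash" message, so it is safest to treat a REST DELETE as permanent unless you have verified otherwise.

```python
import json
from urllib.parse import quote, urlencode


def delete_url(namenode, path, user, recursive=False):
    """Build the URL for an HTTP DELETE with op=DELETE.

    `namenode` is "host:port" of the NameNode HTTP endpoint.
    """
    params = {"op": "DELETE", "user.name": user}
    if recursive:
        # Needed for directories that still contain entries.
        params["recursive"] = "true"
    return f"http://{namenode}/webhdfs/v1{quote(path)}?{urlencode(params)}"


def succeeded(body):
    """Return True only if the response body is {"boolean": true}."""
    return json.loads(body).get("boolean") is True


if __name__ == "__main__":
    print(delete_url("hadoop101.yinzhengjie.com:50070",
                     "/yinzhengjie/webHDFS", "root", recursive=True))
    print(succeeded('{"boolean":true}'))
```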
7>.Check directory quotas

[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -count -h -v -q /yinzhengjie
QUOTA REM_QUOTA SPACE_QUOTA REM_SPACE_QUOTA DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
none inf none inf 1 2 742 /yinzhengjie
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# curl -i "http://hadoop101.yinzhengjie.com:50070/webhdfs/v1/yinzhengjie?op=GETCONTENTSUMMARY" ;echo
HTTP/1.1 200 OK
Cache-Control: no-cache
Expires: Mon, 31 Aug 2020 10:30:13 GMT
Date: Mon, 31 Aug 2020 10:30:13 GMT
Pragma: no-cache
Expires: Mon, 31 Aug 2020 10:30:13 GMT
Date: Mon, 31 Aug 2020 10:30:13 GMT
Pragma: no-cache
Content-Type: application/json
X-FRAME-OPTIONS: SAMEORIGIN
Transfer-Encoding: chunked

{"ContentSummary":{"directoryCount":1,"fileCount":2,"length":742,"quota":-1,"spaceConsumed":29631,"spaceQuota":-1,"typeQuota":{}}}
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfsadmin -setSpaceQuota 10g /yinzhengjie/
[root@hadoop105.yinzhengjie.com ~]# hdfs dfsadmin -setQuota 50 /yinzhengjie/
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -count -h -v -q /yinzhengjie
QUOTA REM_QUOTA SPACE_QUOTA REM_SPACE_QUOTA DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
50 47 10 G 10.0 G 1 2 742 /yinzhengjie
[root@hadoop105.yinzhengjie.com ~]#
[root@hadoop105.yinzhengjie.com ~]# curl -i "http://hadoop101.yinzhengjie.com:50070/webhdfs/v1/yinzhengjie?op=GETCONTENTSUMMARY" ;echo
HTTP/1.1 200 OK
Cache-Control: no-cache
Expires: Mon, 31 Aug 2020 10:30:52 GMT
Date: Mon, 31 Aug 2020 10:30:52 GMT
Pragma: no-cache
Expires: Mon, 31 Aug 2020 10:30:52 GMT
Date: Mon, 31 Aug 2020 10:30:52 GMT
Pragma: no-cache
Content-Type: application/json
X-FRAME-OPTIONS: SAMEORIGIN
Transfer-Encoding: chunked

{"ContentSummary":{"directoryCount":1,"fileCount":2,"length":742,"quota":50,"spaceConsumed":29631,"spaceQuota":10737418240,"typeQuota":{}}}
[root@hadoop105.yinzhengjie.com ~]#
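The ContentSummary JSON carries the same numbers as hdfs dfs -count -q, with -1 standing for "no quota set" (shown as none by the CLI). The remaining quotas are not in the response, but they can be derived, as in the sketch below. remaining_quota is a hypothetical helper, and the formula for the name quota (quota minus directories plus files) is inferred from the transcript, where 50 - (1 + 2) matches the REM_QUOTA of 47.

```python
import json


def remaining_quota(body):
    """Return (names_left, bytes_left) from a GETCONTENTSUMMARY response.

    Either value is None when the corresponding quota is not set (-1).
    """
    summary = json.loads(body)["ContentSummary"]
    used_names = summary["directoryCount"] + summary["fileCount"]
    names_left = None if summary["quota"] < 0 else summary["quota"] - used_names
    bytes_left = (None if summary["spaceQuota"] < 0
                  else summary["spaceQuota"] - summary["spaceConsumed"])
    return names_left, bytes_left


if __name__ == "__main__":
    body = ('{"ContentSummary":{"directoryCount":1,"fileCount":2,"length":742,'
            '"quota":50,"spaceConsumed":29631,"spaceQuota":10737418240,"typeQuota":{}}}')
    print(remaining_quota(body))
```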
8>.Other operations
Recommended reading:
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/WebHDFS.html