HDFS, short for Hadoop Distributed File System, is one implementation of Hadoop's abstract file system. The abstract file system can also be backed by the local file system, Amazon S3, and others, and can even be driven over HTTP via WebHDFS. HDFS spreads a file's blocks across the machines of a cluster and keeps replicas of each block for fault tolerance and reliability. Client reads and writes go directly to the machines holding the blocks, so no single node becomes a performance bottleneck.
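As an aside, the WebHDFS interface mentioned above exposes the same file system operations over plain HTTP. A minimal sketch, assuming a NameNode web UI reachable at the hypothetical host `hadoop000` on the Hadoop 2.x default port 50070:

```shell
# list the root directory over WebHDFS (equivalent to: hadoop fs -ls /)
curl -i "http://hadoop000:50070/webhdfs/v1/?op=LISTSTATUS"
```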
Setting up HDFS itself is covered in an earlier post; today we focus on operating HDFS through the Java API and the command line.
To use the HDFS API from Java, first configure the Maven repository and pull in the client dependency:
```xml
<repositories>
  <repository>
    <id>cloudera</id>
    <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
  </repository>
</repositories>

<!-- the client dependency -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version>${hadoop.version}</version>
</dependency>
```
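`${hadoop.version}` is a Maven property the POM must define somewhere. A minimal sketch, assuming the CDH 5.15.1 release that matches the tarball used later in this post:

```xml
<properties>
  <!-- assumed version; match whatever your cluster runs -->
  <hadoop.version>2.6.0-cdh5.15.1</hadoop.version>
</properties>
```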
Example: create a directory through the API
```java
this.configuration = new Configuration();
// HDFS_PATH is the NameNode URI, e.g. "hdfs://host:8020"; "hadoop" is the user to act as
this.fileSystem = FileSystem.get(new URI(this.HDFS_PATH), configuration, "hadoop");

Path path = new Path("/hdfsapi/test");
boolean result = fileSystem.mkdirs(path);
```
Read a file through the API and write it back to a local file
```java
Path path = new Path("/gwyy.txt");
FSDataInputStream fsDataInputStream = fileSystem.open(path);
FileOutputStream fileOutputStream = new FileOutputStream(new File("a.txt"));

byte[] buffer = new byte[1024];
int length;
StringBuilder sb = new StringBuilder();
// use the number of bytes actually read, not buffer.length,
// otherwise the last chunk carries stale bytes
while ((length = fsDataInputStream.read(buffer)) != -1) {
    sb.append(new String(buffer, 0, length));
    fileOutputStream.write(buffer, 0, length);
}
System.out.println(sb.toString());
```
Create a file on HDFS and write content to it
```java
FSDataOutputStream out = fileSystem.create(new Path("/fuck.txt"));
out.writeUTF("aaabbb");
out.flush();
out.close();
```
Rename an HDFS file
```java
boolean a = fileSystem.rename(new Path("/fuck.txt"), new Path("/fuck.aaa"));
System.out.println(a);
```
Copy a local file to HDFS
```java
fileSystem.copyFromLocalFile(new Path("a.txt"), new Path("/copy_a.txt"));
```
Upload a large file to HDFS with a progress indicator
```java
InputStream in = new BufferedInputStream(
        new FileInputStream(new File("hive-1.1.0-cdh5.15.1.tar.gz")));
Path dst = new Path("/hive.tar.gz");

// print a dot on each progress callback, as a simple progress indicator
FSDataOutputStream out = fileSystem.create(dst, new Progressable() {
    @Override
    public void progress() {
        System.out.print('.');
        System.out.flush();
    }
});

byte[] buffer = new byte[4096];
int length;
// copy to HDFS, writing only the bytes actually read
while ((length = in.read(buffer, 0, buffer.length)) != -1) {
    out.write(buffer, 0, length);
}
// close the streams, or the last block may never be flushed
out.close();
in.close();
```
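Hadoop also ships a helper that replaces the manual copy loop. A minimal sketch, assuming the same `in` stream and `out` stream opened above (the trailing `true` closes both streams when the copy finishes):

```java
// copy in 4096-byte chunks and close both streams when done
org.apache.hadoop.io.IOUtils.copyBytes(in, out, 4096, true);
```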
Download a file from HDFS
```java
fileSystem.copyToLocalFile(new Path("/fuck.aaa"), new Path("./"));
```
List the files in a directory
```java
FileStatus[] fileStatuses = fileSystem.listStatus(new Path("/"));
for (FileStatus f : fileStatuses) {
    System.out.println(f.getPath());
}
```
List files recursively
```java
RemoteIterator<LocatedFileStatus> remoteIterator = fileSystem.listFiles(new Path("/"), true);
while (remoteIterator.hasNext()) {
    LocatedFileStatus file = remoteIterator.next();
    System.out.println(file.getPath());
}
```
Inspect a file's block locations
```java
FileStatus fileStatus = fileSystem.getFileStatus(new Path("/jdk-8u221-linux-x64.tar.gz"));
BlockLocation[] blockLocations =
        fileSystem.getFileBlockLocations(fileStatus, 0, fileStatus.getLen());
// for each block, print the DataNode holding each replica plus the block's offset and length
for (BlockLocation b : blockLocations) {
    for (String name : b.getNames()) {
        System.out.println(name + " " + b.getOffset() + " " + b.getLength());
    }
}
```
Delete a file or directory
If the path is a directory, the `recursive` flag must be true or the call throws an exception; for a file, `recursive` can be either true or false.

```java
boolean a = fileSystem.delete(new Path("/gwyy.txt"), true);
System.out.println(a);
```
Next, the HDFS command-line operations.
List the HDFS root directory:

```shell
hadoop fs -ls /
```
Upload a file to the HDFS root:

```shell
hadoop fs -put gwyy.txt /
```
Copy a local file to HDFS (the original post abbreviated this as `hf`, presumably a shell alias for `hadoop fs`; it is spelled out here):

```shell
hadoop fs -copyFromLocal xhc.txt /
```
Move a local file to HDFS (the local copy is deleted):

```shell
hadoop fs -moveFromLocal a.txt /
```
View a file's contents:

```shell
hadoop fs -cat /gwyy.txt
hadoop fs -text /gwyy.txt
```
Fetch a file from HDFS to the local working directory:

```shell
hadoop fs -get /a.txt ./
```
Create a directory on HDFS:

```shell
hadoop fs -mkdir /hdfs-test
```
Move a file from one directory to another:

```shell
hadoop fs -mv /a.txt /hdfs-test/a.txt
```
Copy a file:

```shell
hadoop fs -cp /hdfs-test/a.txt /hdfs-test/a.txt.back
```
Merge all files under a directory and download the result as one local file:

```shell
hadoop fs -getmerge /hdfs-test ./t.txt
```
Delete a file:

```shell
hadoop fs -rm /hdfs-test/a.txt.back
```
Delete a directory:

```shell
# only removes an empty directory
hadoop fs -rmdir /hdfs-test
# removes the directory and everything in it
hadoop fs -rm -r /hdfs-test
```