HDFS Programming Practice (Hadoop 3.1.3)

Reposted article, 2021-10-05 11:20.


Before starting the HDFS programming exercises, start Hadoop (version 3.1.3) with the following commands:

cd /usr/local/hadoop      # switch to the Hadoop installation directory
./sbin/start-dfs.sh       # start the HDFS daemons

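I. Interacting with HDFS using shell commands

Hadoop supports many shell commands. Of these, fs is the one most commonly used with HDFS: it lets you view the directory structure of the HDFS file system, upload and download data, create files, and so on.

① List all the commands that fs supports: ./bin/hadoop fs

② Show how a particular command is used (for example, put): ./bin/hadoop fs -help put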

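1. Directory operations:

(These commands assume the current directory is the Hadoop installation directory.)

hadoop fs -ls <path>: show detailed information about the file or directory given by <path> (list a directory)

hadoop fs -mkdir <path>: create the directory given by <path> (create a directory)

hadoop fs -cat <path>: write the contents of the file given by <path> to standard output (stdout) (view a file's contents)

hadoop fs -copyFromLocal <localsrc> <dst>: copy the local source file <localsrc> to the file or directory given by <dst> (copy a file)

● Once the Hadoop cluster is configured, you can open "http://localhost:9870" in a browser to access the HDFS file system.

● In the web interface, choose "Browse the file system" under the "Utilities" menu to view files.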

① Create a user directory in HDFS for the hadoop user:

cd /usr/local/hadoop

./bin/hdfs dfs -mkdir -p /user/hadoop

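■ This command creates the directory "/user/hadoop" in HDFS. "-mkdir" is the create-directory operation, and "-p" means that for a multi-level path the parent and child directories are created together.

Here "/user/hadoop" is a multi-level directory, so "-p" is needed whenever the parent "/user" does not exist yet; without it the command fails.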
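The effect of "-p" can be illustrated outside HDFS. The following is only a sketch on the local file system using java.nio, not the HDFS API; the class name MkdirDemo and its two helper methods are made up for this illustration, and Path here is java.nio.file.Path rather than Hadoop's Path.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

/** Local-filesystem analogue of "mkdir" versus "mkdir -p" (hypothetical helper, not HDFS). */
final class MkdirDemo {
    /** Like plain mkdir: returns false when the parent directory is missing. */
    static boolean mkdir(Path dir) throws IOException {
        try {
            Files.createDirectory(dir);
            return true;
        } catch (NoSuchFileException missingParent) {
            return false;
        }
    }

    /** Like mkdir -p: creates every missing parent; does nothing if dir already exists. */
    static void mkdirP(Path dir) throws IOException {
        Files.createDirectories(dir);
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("mkdir-demo");
        Path nested = root.resolve("user").resolve("hadoop");
        System.out.println("mkdir without -p succeeded: " + mkdir(nested));
        mkdirP(nested);
        System.out.println("after mkdir -p, directory exists: " + Files.isDirectory(nested));
    }
}
```

The first call fails because "user" does not exist yet, which mirrors why the HDFS command needs "-p" on a fresh file system.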

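② List the contents of a directory: ./bin/hdfs dfs -ls .

■ In this command, "." denotes the current user's directory in HDFS, that is, "/user/hadoop".

■ With no path argument, -ls also lists the current user's directory. To list every directory in HDFS, start from the root and recurse: ./bin/hdfs dfs -ls -R /

③ Create an input directory: ./bin/hdfs dfs -mkdir input

■ The relative path above creates "/user/hadoop/input". To create a directory named input under the HDFS root instead: ./bin/hdfs dfs -mkdir /input

④ Use rm to delete a directory or file (here, the /input directory; -r is required for directories): ./bin/hdfs dfs -rm -r /input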

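2. File operations:

■ In the "/home/hadoop/" directory of the local Linux file system, create a file named myLocalFile.txt and type any content into it. The Linux command for creating a file is: touch filename

① Upload: upload the local file (myLocalFile.txt) to HDFS, into the "/user/hadoop/input/" directory: ./bin/hdfs dfs -put /home/hadoop/myLocalFile.txt input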

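■ ■ ■ Upload any text file to HDFS; if the specified file already exists in HDFS, let the user choose whether to append to the end of the existing file or overwrite it:

Possible error: Command 'hdfs' not found, did you mean: command 'hfs' from deb hfsutils-tcltk…

The cause is that Hadoop's bin directory is not on PATH, so neither the hadoop nor the hdfs command can be found by name.
Fix:
① sudo vi /etc/profile
② Add a line at the bottom that sets PATH (press i to enter insert mode, Esc to leave it, and ZZ, two capital Zs, to save and quit vim):
 export PATH=/usr/local/hadoop/bin:$PATH

③ Make the setting take effect immediately: source /etc/profile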

(Here hello.txt is a file on HDFS, /user/hadoop/hello.txt, and local.txt is a local Linux file.)

if hdfs dfs -test -e hello.txt; then                  # does hello.txt already exist in HDFS?
    hdfs dfs -appendToFile local.txt hello.txt        # yes: append the contents of local.txt to hello.txt
else
    hdfs dfs -copyFromLocal -f local.txt hello.txt    # no: upload local.txt as hello.txt (-f overwrites)
fi

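② ■ ■ ■ Download a specified file from HDFS; if a local file with the same name already exists, automatically rename the downloaded file.

(text.txt is a file in HDFS; /home/hadoop/下載/text.txt is a local file, addressed through the file:// scheme. The new name text2.txt is only an example.)

if hdfs dfs -test -e file:///home/hadoop/下載/text.txt; then         # does a local file with the same name exist?
    hdfs dfs -copyToLocal text.txt /home/hadoop/下載/text2.txt       # yes: save the download under a new name
else
    hdfs dfs -copyToLocal text.txt /home/hadoop/下載/text.txt        # no: keep the original name
fi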
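One way to choose the new name automatically is to try numbered variants until a free one is found. The sketch below works on the local file system with java.nio; the class UniqueName, the method pick, and the "_1", "_2" naming scheme are assumptions made for illustration, not something the HDFS shell provides.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Picks a local file name that does not collide with an existing file (hypothetical helper). */
final class UniqueName {
    /** Returns dir/name if it is free, otherwise dir/base_1.ext, dir/base_2.ext, and so on. */
    static Path pick(Path dir, String name) {
        int dot = name.lastIndexOf('.');
        String base = dot > 0 ? name.substring(0, dot) : name;
        String ext = dot > 0 ? name.substring(dot) : "";
        Path candidate = dir.resolve(name);
        for (int i = 1; Files.exists(candidate); i++) {
            candidate = dir.resolve(base + "_" + i + ext);
        }
        return candidate;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("download-demo");
        System.out.println(pick(dir, "text.txt").getFileName()); // name is free
        Files.createFile(dir.resolve("text.txt"));
        System.out.println(pick(dir, "text.txt").getFileName()); // name is taken now
    }
}
```

The returned path could then be passed as the local target of the download instead of a fixed second name.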

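③ ■ ■ ■ Print the contents of a specified HDFS file to the terminal, using -cat: ./bin/hdfs dfs -cat myHadoopFile

④ ■ ■ ■ Show the read/write permissions, size, time, path, and other information for a specified HDFS file, using -ls (the time shown is the modification time): ./bin/hdfs dfs -ls myHadoopFile

⑤ ■ ■ ■ Given a directory in HDFS, print the read/write permissions, size, time, path, and other information for every file in it; if an entry is itself a directory, recursively print the information for everything under it. Use the recursive option -R of -ls: ./bin/hdfs dfs -ls -R myHadoopDir

⑥ ■ ■ ■ Given the path of a file in HDFS, create and delete that file (if the file's directory does not exist, create it automatically). Create the directory with -mkdir -p and the file with -touchz; delete the file with -rm: ./bin/hdfs dfs -rm myHadoopFile

⑦ Given the path of a directory in HDFS, create and delete that directory. When creating, automatically create any missing parent directories; when deleting, let the user decide whether a non-empty directory should still be deleted. Delete with -rm -r (the older -rmr spelling is deprecated).

Example: hadoop fs -rm -r myHadoopDir

⑧ Append content to a specified file in HDFS, with the user choosing the start or the end of the existing file. Appending to the end: ./bin/hdfs dfs -appendToFile local.txt ./myHadoopFile.txt

(Note: appendToFile appends the contents of a local file to the end of a file on HDFS; it cannot append HDFS file 1 to HDFS file 2. Adding content at the start means downloading the file, combining the contents locally, and uploading the result with -copyFromLocal -f.)

⑨ Delete a specified file in HDFS: use -rm.

⑩ Move a file from a source path to a destination path within HDFS: -mv <src> <dest>

Example: hadoop fs -mv /usr/local/hadoop/test.txt /usr/local/hadoop/hadoop_tmp/hello.txt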

● Use ls to check whether the file was uploaded to HDFS successfully: ./bin/hdfs dfs -ls input

● Use -cat to view the contents of myLocalFile.txt in HDFS: ./bin/hdfs dfs -cat input/myLocalFile.txt

● Download: copy a file from HDFS to the local file system (here, myLocalFile.txt is downloaded into the local directory "/home/hadoop/下載/"): ./bin/hdfs dfs -get input/myLocalFile.txt /home/hadoop/下載

● Copy: copy a file from one HDFS directory to another (for example, to copy "/user/hadoop/input/myLocalFile.txt" into another HDFS directory, "/input"): ./bin/hdfs dfs -cp input/myLocalFile.txt /input


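II. Managing HDFS through the web interface

Using the Firefox browser that ships with Linux, open the web interface at http://localhost:9870. Choose "Browse the file system" under the "Utilities" menu to view files.

III. Implement the following functions in code, accomplishing the same tasks as the shell commands provided by Hadoop.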

1. Upload any text file to HDFS; if the specified file already exists in HDFS, let the user choose whether to append to the end of the existing file or overwrite it.

hadoop fs -put /User/Binguner/Desktop/test.txt /test
hadoop fs -appendToFile /User/Binguner/Desktop/test.txt /test/test.txt
hadoop fs -copyFromLocal -f /User/Binguner/Desktop/test.txt /input/test.txt

    /**
     * Uploads a local file; if the target exists, the user chooses overwrite or append.
     * @param fileSystem HDFS handle
     * @param srcPath local file path
     * @param desPath destination file path in HDFS
     */
    private static void test1(FileSystem fileSystem,Path srcPath, Path desPath){
        try {
            if (fileSystem.exists(desPath)){
                System.out.println("Do you want to overwrite the existing file? ( y / n )");
                if (new Scanner(System.in).next().equals("y")){
                    // delSrc = false, overwrite = true
                    fileSystem.copyFromLocalFile(false,true,srcPath,desPath);
                }else {
                    // Append the local file's bytes to the end of the HDFS file;
                    // try-with-resources closes both streams even on failure.
                    try (FileInputStream inputStream = new FileInputStream(srcPath.toString());
                         FSDataOutputStream outputStream = fileSystem.append(desPath)) {
                        byte[] bytes = new byte[1024];
                        int read;
                        while ((read = inputStream.read(bytes)) > 0){
                            outputStream.write(bytes,0,read);
                        }
                    }
                }
            }else {
                fileSystem.copyFromLocalFile(srcPath,desPath);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

2. Download a specified file from HDFS; if a local file with the same name already exists, automatically rename the downloaded file.

hadoop fs -copyToLocal /input/test.txt /User/binguner/Desktop/test.txt

    /**
     * Downloads a file from HDFS, choosing a new local name if the requested one is taken.
     * @param fileSystem HDFS handle
     * @param remotePath path of the file in HDFS
     * @param localPath local path to save the file to
     */
    private static void test2(FileSystem fileSystem,Path remotePath, Path localPath){
        try {
            if (!fileSystem.exists(remotePath)){
                System.out.println("Can't find this file in HDFS!");
                return;
            }
            Path target = localPath;
            // Check the local path up front; if it is taken, add a random suffix.
            if (new java.io.File(localPath.toString()).exists()){
                target = new Path(localPath.toString() + "." + new Random().nextInt(Integer.MAX_VALUE));
                System.out.println("Local file exists, saving as " + target);
            }
            fileSystem.copyToLocalFile(remotePath,target);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

3. Print the contents of a specified HDFS file to the terminal.

hadoop fs -cat /test/test.txt

    /**
     * Prints the contents of an HDFS file to the terminal.
     * @param fileSystem HDFS handle
     * @param remotePath path of the target file
     */
    private static void test3(FileSystem fileSystem,Path remotePath){
        // try-with-resources closes the reader and the underlying HDFS stream
        try (BufferedReader bufferedReader = new BufferedReader(
                new InputStreamReader(fileSystem.open(remotePath)))) {
            String line;
            while ((line = bufferedReader.readLine()) != null){
                System.out.println(line);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

4. Show the read/write permissions, size, creation time, path, and other information for a specified file in HDFS.

hadoop fs -ls -h /test/test.txt

    /**
     * @param fileSystem HDFS handle
     * @param remotePath path of the target file
     */
    private static void test4(FileSystem fileSystem, Path remotePath){
        try {
            FileStatus[] fileStatus = fileSystem.listStatus(remotePath);
            for (FileStatus status : fileStatus){
                System.out.println(status.getPermission());        // read/write permissions
                System.out.println(status.getLen());               // size in bytes
                System.out.println(status.getModificationTime());  // last modified, ms since the epoch
                System.out.println(status.getPath());
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

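5. Given a directory in HDFS, print the read/write permissions, size, creation time, path, and other information for every file in it; if an entry is itself a directory, recursively print the information for everything under it.

hadoop fs -ls -R -h /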

    /**
     * @param fileSystem HDFS handle
     * @param remotePath path of the target directory
     */
    private static void test5(FileSystem fileSystem, Path remotePath){
        try {
            RemoteIterator<LocatedFileStatus> iterator = fileSystem.listFiles(remotePath,true);
            while (iterator.hasNext()){
                FileStatus status = iterator.next();
                System.out.println(status.getPath());
                System.out.println(status.getPermission());
                System.out.println(status.getLen());
                System.out.println(status.getModificationTime());
            }
        } catch (IOException e) {
            e.printStackTrace();
        }

    }
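The methods above print raw numbers: getLen() is a byte count and getModificationTime() is milliseconds since the epoch. A small formatter makes the output closer to what "-ls -h" shows. This is only a sketch; the class StatusFormat, the UTC time zone, and the exact unit labels are choices made here for illustration and are not Hadoop's own formatting.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

/** Formats raw FileStatus-style numbers for display (hypothetical helper). */
final class StatusFormat {
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm").withZone(ZoneOffset.UTC);

    /** Formats milliseconds since the epoch as a UTC timestamp. */
    static String time(long epochMillis) {
        return FMT.format(Instant.ofEpochMilli(epochMillis));
    }

    /** Formats a byte count with a binary unit suffix and one decimal place. */
    static String size(long bytes) {
        String[] units = {"B", "KB", "MB", "GB", "TB"};
        double value = bytes;
        int unit = 0;
        while (value >= 1024 && unit < units.length - 1) {
            value /= 1024;
            unit++;
        }
        if (unit == 0) {
            return bytes + " B";
        }
        return String.format(Locale.ROOT, "%.1f %s", value, units[unit]);
    }

    public static void main(String[] args) {
        System.out.println(time(0L));    // the epoch itself
        System.out.println(size(1536L)); // one and a half kibibytes
    }
}
```

In test4 or test5, the calls would wrap the existing getters, for example StatusFormat.size(status.getLen()).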

6. Given the path of a file in HDFS, create and delete that file. If the file's directory does not exist, create it automatically.

hadoop fs -mkdir -p /test
hadoop fs -touchz /test/test.txt
hadoop fs -rm /test/test.txt

    /**
     * @param fileSystem HDFS handle
     * @param remoteDirPath directory that contains the file
     * @param remoteFilePath path of the file to create or delete
     */
    private static void test6(FileSystem fileSystem, Path remoteDirPath, Path remoteFilePath){
        try {
            // Create the directory first if it does not exist yet.
            if (!fileSystem.exists(remoteDirPath)){
                fileSystem.mkdirs(remoteDirPath);
            }
            System.out.println("Please choose your option: 1.create. 2.delete");
            int i = new Scanner(System.in).nextInt();
            switch (i){
                case 1:
                    // create() returns an open stream; close it to finish the empty file.
                    fileSystem.create(remoteFilePath).close();
                    break;
                case 2:
                    // Delete the file itself, not its directory.
                    fileSystem.delete(remoteFilePath,false);
                    break;
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

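7. Given the path of a directory in HDFS, create and delete that directory. When creating, automatically create any missing parent directories; when deleting, let the user decide whether a non-empty directory should still be deleted.

hadoop fs -mkdir -p /test      # create, including missing parents
hadoop fs -rmdir /test         # delete only if the directory is empty
hadoop fs -rm -r /test         # delete even if the directory is not empty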

    /**
     * @param fileSystem HDFS handle
     * @param remotePath path of the target directory
     */
    private static void test7(FileSystem fileSystem, Path remotePath){
        try {
            if (!fileSystem.exists(remotePath)){
                System.out.println("Can't find this path, the path will be created automatically");
                fileSystem.mkdirs(remotePath);
                return;
            }
            Scanner scanner = new Scanner(System.in);
            System.out.println("Do you want to delete this dir? ( y / n )");
            if (!scanner.next().equals("y")){
                return;
            }
            if (fileSystem.listStatus(remotePath).length != 0){
                System.out.println("There are some files in this directory, are you sure you want to delete them all? (y / n)");
                if (!scanner.next().equals("y")){
                    return; // the user chose to keep the non-empty directory
                }
            }
            if (fileSystem.delete(remotePath,true)){
                System.out.println("Delete successful");
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
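The decision in test7 (delete a non-empty directory only when the user agrees) can be tried out without a cluster. The sketch below uses java.nio on the local file system; DirDelete and its delete method are hypothetical names, and returning false for a refused deletion is a choice made here, not a description of how HDFS reports that case.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Local-filesystem sketch of "delete a directory, recursively only on request". */
final class DirDelete {
    /** Deletes dir; a non-empty directory is removed only when recursive is true. */
    static boolean delete(Path dir, boolean recursive) throws IOException {
        boolean empty;
        try (Stream<Path> children = Files.list(dir)) {
            empty = !children.findAny().isPresent();
        }
        if (!empty && !recursive) {
            return false; // refuse to remove a directory that still has content
        }
        List<Path> paths;
        try (Stream<Path> tree = Files.walk(dir)) {
            // Reverse order puts children before their parent directories.
            paths = tree.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("dir-delete-demo");
        Files.createFile(dir.resolve("a.txt"));
        System.out.println("non-recursive delete: " + delete(dir, false));
        System.out.println("recursive delete: " + delete(dir, true));
    }
}
```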

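8. Append content to a specified file in HDFS, with the user choosing whether it goes at the start or the end of the existing file.

hadoop fs -get text.txt                          # download the HDFS file
cat text.txt >> local.txt                        # local.txt now holds the new content followed by the old
hadoop fs -copyFromLocal -f local.txt text.txt   # upload the combined file over the original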

    /**
     * @param fileSystem HDFS handle
     * @param remotePath path of the file in HDFS
     * @param localPath local file path
     */
    private static void test8(FileSystem fileSystem,Path remotePath, Path localPath){
        try {
            if (!fileSystem.exists(remotePath)){
                System.out.println("Can't find this file");
                return;
            }
            System.out.println("input 1 or 2 , add the content to the remote file's start or end");
            switch (new Scanner(System.in).nextInt()){
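                case 1:
                    // Add at the start: move the HDFS file to localPath, recreate it empty,
                    // then write two local files into it in order.
                    // Both input paths below are hardcoded in this example.
                    fileSystem.moveToLocalFile(remotePath, localPath);
                    FSDataOutputStream fsDataOutputStream = fileSystem.create(remotePath);
                    FileInputStream fileInputStream = new FileInputStream("/Users/binguner/IdeaProjects/HadoopDemo/src/test2.txt");
                    FileInputStream fileInputStream1 = new FileInputStream("/Users/binguner/IdeaProjects/HadoopDemo/src/test.txt");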
                    byte[] bytes = new byte[1024];
                    int read = -1;
                    while ((read = fileInputStream.read(bytes)) > 0) {
                        fsDataOutputStream.write(bytes,0,read);
                    }
                    while ((read = fileInputStream1.read(bytes)) > 0){
                        fsDataOutputStream.write(bytes,0,read);
                    }
                    fileInputStream.close();
                    fileInputStream1.close();
                    fsDataOutputStream.close();
                    break;
                case 2:
                    FileInputStream inputStream = new FileInputStream("/Users/binguner/IdeaProjects/HadoopDemo/"+localPath.toString());
                    FSDataOutputStream outputStream = fileSystem.append(remotePath);
                    byte[] bytes1 = new byte[1024];
                    int read1 = -1;
                    while ((read1 = inputStream.read(bytes1)) > 0){
                        outputStream.write(bytes1,0,read1);
                    }
                    inputStream.close();
                    outputStream.close();
                    break;
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
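The only difference between the two cases is the order in which the old and new bytes are written. A stdlib-only sketch makes that explicit; PrependOrAppend, concat, and combine are names invented here, and in-memory strings stand in for the HDFS file and the local file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;

/** Shows that "add at start" and "add at end" differ only in stream order (illustrative). */
final class PrependOrAppend {
    /** Returns the bytes of first followed by the bytes of second. */
    static byte[] concat(InputStream first, InputStream second) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in = new SequenceInputStream(first, second)) {
            byte[] buf = new byte[1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
        return out.toByteArray();
    }

    /** Combines existing content with added content, at the start or at the end. */
    static String combine(String existing, String added, boolean atStart) throws IOException {
        InputStream oldData = new ByteArrayInputStream(existing.getBytes(StandardCharsets.UTF_8));
        InputStream newData = new ByteArrayInputStream(added.getBytes(StandardCharsets.UTF_8));
        byte[] result = atStart ? concat(newData, oldData) : concat(oldData, newData);
        return new String(result, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        System.out.print(combine("old line\n", "new line\n", true));
        System.out.print(combine("old line\n", "new line\n", false));
    }
}
```

With HDFS, the end case maps onto append(), while the start case needs the file to be rewritten in full, which is why test8 recreates the remote file.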

9. Delete a specified file in HDFS.

hadoop fs -rm -R /test/test.txt

    private static void test9(FileSystem fileSystem,Path remotePath){
        try {
            if(fileSystem.delete(remotePath,true)){
                System.out.println("Delete success");
            }else {
                System.out.println("Delete failed");
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

10. Move a file from a source path to a destination path within HDFS.

hadoop fs -mv /test/test.txt /test2

    /**
     * @param fileSystem
     * @param oldRemotePath old name
     * @param newRemotePath new name
     */
    private static void test10(FileSystem fileSystem, Path oldRemotePath, Path newRemotePath){
        try {
            if (fileSystem.rename(oldRemotePath,newRemotePath)){
                System.out.println("Rename success");
            }else {
                System.out.println("Rename failed");
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

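IV. Interacting with HDFS using the Java API

(1) Install Eclipse/IDEA on Ubuntu

1. Create a project in Eclipse

2. Add the required JAR files to the project

To write a Java application that can interact with HDFS, you generally need to add the following JAR files to the Java project:
(1) All JAR files in "/usr/local/hadoop/share/hadoop/common", including hadoop-common-3.1.3.jar, hadoop-common-3.1.3-tests.jar, hadoop-nfs-3.1.3.jar, and hadoop-kms-3.1.3.jar; note that this excludes the subdirectories jdiff, lib, sources, and webapps;
(2) All JAR files in "/usr/local/hadoop/share/hadoop/common/lib";
(3) All JAR files in "/usr/local/hadoop/share/hadoop/hdfs"; note that this excludes the subdirectories jdiff, lib, sources, and webapps;
(4) All JAR files in "/usr/local/hadoop/share/hadoop/hdfs/lib".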

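3.1 Write a Java application

Example task: suppose the directory "hdfs://localhost:9000/user/hadoop" contains several files: file1.txt, file2.txt, file3.txt, file4.abc, and file5.abc.

The task is to select every file in this directory whose suffix is not ".abc", read the selected files, and merge their contents into the file "hdfs://localhost:9000/user/hadoop/merge.txt".

■ Preparation: the HDFS directory "/user/hadoop" already contains file1.txt, file2.txt, file3.txt, file4.abc, and file5.abc, each with some content. Assume the contents are as follows:
Contents of file1.txt: this is file1.txt
Contents of file2.txt: this is file2.txt
Contents of file3.txt: this is file3.txt
Contents of file4.abc: this is file4.abc
Contents of file5.abc: this is file5.abc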

import java.io.IOException;
import java.io.PrintStream;
import java.net.URI;
 
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.*;
 
/**
 * Filters out files whose names match a given regular expression.
 */
class MyPathFilter implements PathFilter {
     String reg = null; 
     MyPathFilter(String reg) {
          this.reg = reg;
     }
     // Accept a path only if it does NOT match the expression.
     public boolean accept(Path path) {
        return !path.toString().matches(reg);
    }
}
/***
 * Merges files in HDFS using FSDataOutputStream and FSDataInputStream.
 */
public class MergeFile {
    Path inputPath = null; // directory containing the files to merge
    Path outputPath = null; // path of the output file
    public MergeFile(String input, String output) {
        this.inputPath = new Path(input);
        this.outputPath = new Path(output);
    }
    public void doMerge() throws IOException {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS","hdfs://localhost:9000");
        conf.set("fs.hdfs.impl","org.apache.hadoop.hdfs.DistributedFileSystem");
        FileSystem fsSource = FileSystem.get(URI.create(inputPath.toString()), conf);
        FileSystem fsDst = FileSystem.get(URI.create(outputPath.toString()), conf);
        // Exclude files in the input directory whose suffix is .abc
        FileStatus[] sourceStatus = fsSource.listStatus(inputPath,
                new MyPathFilter(".*\\.abc")); 
        FSDataOutputStream fsdos = fsDst.create(outputPath);
        PrintStream ps = new PrintStream(System.out);
        // Read each remaining file in turn and write its contents to the same output file
        for (FileStatus sta : sourceStatus) {
            // Print the path, size, and permissions of each file whose suffix is not .abc
            System.out.print("Path: " + sta.getPath() + "    Size: " + sta.getLen()
                    + "   Permission: " + sta.getPermission() + "   Content: ");
            FSDataInputStream fsdis = fsSource.open(sta.getPath());
            byte[] data = new byte[1024];
            int read = -1;
 
            while ((read = fsdis.read(data)) > 0) {
                ps.write(data, 0, read);
                fsdos.write(data, 0, read);
            }
            fsdis.close();          
        }
        ps.flush(); // flush rather than close: ps wraps System.out
        fsdos.close();
    }
    public static void main(String[] args) throws IOException {
        MergeFile merge = new MergeFile(
                "hdfs://localhost:9000/user/hadoop/",
                "hdfs://localhost:9000/user/hadoop/merge.txt");
        merge.doMerge();
    }
}
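The filter's behavior can be checked on plain file names before running it against HDFS. The sketch below applies the same regular expression, ".*\\.abc", with the same "reject what matches" rule as MyPathFilter; SuffixFilter and reject are hypothetical names, and strings stand in for Hadoop Path objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Applies the "reject names matching the regex" rule to plain strings (illustrative). */
final class SuffixFilter {
    /** Returns the names that do NOT match the regular expression, in their original order. */
    static List<String> reject(List<String> names, String regex) {
        Pattern pattern = Pattern.compile(regex);
        List<String> kept = new ArrayList<>();
        for (String name : names) {
            if (!pattern.matcher(name).matches()) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "file1.txt", "file2.txt", "file3.txt", "file4.abc", "file5.abc");
        System.out.println(reject(names, ".*\\.abc")); // prints [file1.txt, file2.txt, file3.txt]
    }
}
```

One thing this makes visible: a name such as merge.txt also passes the filter, so an output file left in the input directory by an earlier run would be picked up as an input on the next run.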

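3.2 Write a Java application

Example task: implement a class "MyFSDataInputStream" that extends "org.apache.hadoop.fs.FSDataInputStream", with these requirements: provide a method "readLine()" that reads a specified HDFS file line by line, returning null at end of file and otherwise one line of text. Also implement caching: when "MyFSDataInputStream" reads some bytes, it first checks the cache; if the cache holds the requested data, the cache supplies it directly, otherwise the data is read from HDFS.

Reference: reading data from HDFS: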

        import java.io.BufferedReader;
        import java.io.InputStreamReader;
 
        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.Path;
        import org.apache.hadoop.fs.FSDataInputStream;
 
        public class Chapter3 {
                public static void main(String[] args) {
                        try {
                                Configuration conf = new Configuration();
                                conf.set("fs.defaultFS","hdfs://localhost:9000");
                                conf.set("fs.hdfs.impl","org.apache.hadoop.hdfs.DistributedFileSystem");
                                FileSystem fs = FileSystem.get(conf);
                                Path file = new Path("test"); 
                                FSDataInputStream getIt = fs.open(file);
                                BufferedReader d = new BufferedReader(new InputStreamReader(getIt));
                                String content = d.readLine(); // read one line of the file
                                System.out.println(content);
                                d.close(); // close the file
                                fs.close(); // close the HDFS connection
                        } catch (Exception e) {
                                e.printStackTrace();
                        }
                }
        }

Solution:

package Second;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
public class MyFsDataInputStream extends FSDataInputStream{
    public MyFsDataInputStream(InputStream in) {
        super(in);
    }
    // Reads the first line of the file; returns null if the file is empty.
    public static String readline(Configuration conf,String filename) throws IOException
    {
        Path filename1=new Path(filename);
        FileSystem fs=FileSystem.get(conf);
        // BufferedReader serves as the cache: it reads ahead in blocks and answers
        // readLine() from its buffer. try-with-resources closes it on every path.
        try (BufferedReader d=new BufferedReader(new InputStreamReader(fs.open(filename1)))) {
            return d.readLine();
        }
    }
    public static void main(String[] args) throws IOException {
        Configuration conf=new Configuration();
        conf.set("fs.defaultFS", "hdfs://localhost:9000");
        conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
        String filename="/user/hadoop/myLocalFile.txt";
        System.out.println("Reading file: "+filename);
        String o=MyFsDataInputStream.readline(conf, filename);
        System.out.println(o+"\n"+"Read complete");
    }
}
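The solution above leans on BufferedReader for caching. To show what "check the cache first, otherwise read from the source" looks like when written by hand, here is a sketch that wraps any java.io.InputStream (an FSDataInputStream is one). CachedLineReader, its refills counter, and the cache size are assumptions for illustration; it does not extend FSDataInputStream and assumes UTF-8 text and a cache size greater than zero.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Line reader with an explicit byte cache in front of the underlying stream (illustrative). */
final class CachedLineReader implements AutoCloseable {
    private final InputStream in;
    private final byte[] cache;
    private int pos = 0;    // index of the next unread byte in the cache
    private int limit = 0;  // number of valid bytes currently in the cache
    int refills = 0;        // how many times the underlying stream was read

    CachedLineReader(InputStream in, int cacheSize) {
        this.in = in;
        this.cache = new byte[cacheSize];
    }

    /** Serves a byte from the cache; reads the underlying stream only when the cache is used up. */
    private int nextByte() throws IOException {
        if (pos == limit) {
            limit = in.read(cache, 0, cache.length);
            pos = 0;
            refills++;
            if (limit <= 0) {
                limit = 0;
                return -1; // end of stream
            }
        }
        return cache[pos++] & 0xff;
    }

    /** Returns the next line without its terminator, or null at end of stream. */
    String readLine() throws IOException {
        int b = nextByte();
        if (b == -1) {
            return null;
        }
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        while (b != -1 && b != '\n') {
            line.write(b);
            b = nextByte();
        }
        String s = new String(line.toByteArray(), StandardCharsets.UTF_8);
        return s.endsWith("\r") ? s.substring(0, s.length() - 1) : s;
    }

    @Override
    public void close() throws IOException {
        in.close();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "this is file1.txt\nthis is file2.txt\n".getBytes(StandardCharsets.UTF_8);
        try (CachedLineReader reader = new CachedLineReader(new ByteArrayInputStream(data), 1024)) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}
```

With a cache larger than the file, both lines come out of a single read of the underlying stream; the second readLine() call never touches the source.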

3.3 Write a Java application

Example task: consult the Java documentation or other material, and use "java.net.URL" and "org.apache.hadoop.fs.FsUrlStreamHandlerFactory" to print the text of a specified HDFS file to the terminal.

package Second;
import java.io.IOException;
import java.io.InputStream;
import java.net.MalformedURLException;
import java.net.URL;
import org.apache.hadoop.fs.FsUrlStreamHandlerFactory;
import org.apache.hadoop.io.IOUtils;
public class FSUrl {
    static {
        URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory());
    }
    // Opens hdfs://localhost:9000<filename> through java.net.URL and copies it to stdout.
    public static void cat(String filename) throws MalformedURLException, IOException
    {
        InputStream in=new URL("hdfs","localhost",9000,filename).openStream();
        try {
            IOUtils.copyBytes(in, System.out,4096,false);
        } finally {
            IOUtils.closeStream(in);
        }
    }
    public static void main(String[] args) throws MalformedURLException, IOException {
        String filename="/user/hadoop/myLocalFile.txt";
        System.out.println("Reading file: "+filename);
        FSUrl.cat(filename);
        System.out.println("\nRead complete");
    }
}

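Finally, for other problems, see the article "Bugs encountered in Lesson 3, 'HDFS Programming Practice (Hadoop 3.1.3)', of 'Principles and Applications of Big Data Technology (3rd Edition)' by Lin Ziyu, Xiamen University".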

