Implementing an Elasticsearch 7.2 Disaster Recovery Solution on HDFS

Preface

Elasticsearch replicas provide high availability: they let you tolerate the occasional loss of a node without any interruption of service. Replicas do not, however, protect against catastrophic failure. For that you need a true backup of the cluster — a complete copy to fall back on when something goes seriously wrong.

This walkthrough simulates an Elasticsearch 7.2 cluster and backs it up with snapshots via the snapshot API.

The HDFS distributed file system serves as the snapshot repository throughout.

Snapshot version compatibility

As a rule, a snapshot can only be restored into the next major version of Elasticsearch, no further: a snapshot taken on 6.x restores into 7.x, but a 5.x snapshot does not.

Backing up the cluster

HDFS file system

Software download

Download link

hadoop-3.3.0.tar.gz

JDK environment

Hadoop is written in Java, so a JVM is required to run it.

jdk-8u161-linux-x64.tar.gz
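
Assuming both archives were downloaded to the hadoop user's home directory, unpack them to the paths that the environment variables below expect:

$ tar -zxf jdk-8u161-linux-x64.tar.gz -C /home/hadoop/
$ tar -zxf hadoop-3.3.0.tar.gz -C /home/hadoop/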

Configure system environment variables

#JAVA
export JAVA_HOME=/home/hadoop/jdk1.8.0_161
export CLASSPATH=$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
#hadoop
export HADOOP_HOME=/home/hadoop/hadoop-3.3.0
export PATH=$JAVA_HOME/bin:$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
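
To check that the variables are in effect (assuming the exports were added to ~/.bashrc or /etc/profile and the shell was re-sourced):

$ java -version
$ hadoop version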

Hadoop configuration

All of the following configuration files live under the hadoop-3.3.0/etc/hadoop directory.

Configure JAVA_HOME

hadoop-env.sh

export JAVA_HOME=/home/hadoop/jdk1.8.0_161

Configure the core components file

In core-site.xml, add the following between the <configuration> and </configuration> tags:

<property>
        <name>fs.defaultFS</name>
        <value>hdfs://172.16.176.103:9000</value>
</property>
<property>
        <name>hadoop.tmp.dir</name>
        <value>/data</value>
</property>

Configure the file system

In hdfs-site.xml, add the following between the <configuration> and </configuration> tags:

<!-- namenode -->
<property>
        <name>dfs.namenode.name.dir</name>
        <value>/data/namenode</value>
</property>
<!-- datanode -->
<property>
        <name>dfs.datanode.data.dir</name>
        <value>/data/datanode</value>
</property>
<!-- replication factor; the HDFS default is 3, 1 suits a single-node cluster -->
<property>
        <name>dfs.replication</name>
        <value>1</value>
</property>
<!-- disable permission checks so the es user can write (newer Hadoop spells this dfs.permissions.enabled) -->
<property>
        <name>dfs.permissions</name>
        <value>false</value>
</property>

Configure MapReduce

mapred-site.xml

<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>

Configure YARN

yarn-site.xml

<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>elasticsearch01</value>
</property>

Format the file system

hdfs namenode -format

Start HDFS

start-dfs.sh

$ start-dfs.sh 
WARNING: HADOOP_SECURE_DN_USER has been replaced by HDFS_DATANODE_SECURE_USER. Using value of HADOOP_SECURE_DN_USER.
Starting namenodes on [host103]
Starting datanodes
Starting secondary namenodes [host103]

Access the web UI

http://localhost:9870/
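
With HDFS running, a quick shell-level sanity check is also possible (both are standard HDFS CLI commands):

$ hdfs dfsadmin -report
$ hdfs dfs -ls /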

Installing the ES plugin

The HDFS plugin must be installed on every node in the cluster, and Elasticsearch must be restarted after installation.

Plugin download

The plugin version must match the Elasticsearch version.

Download link

repository-hdfs-7.2.0.zip

Plugin installation

Download the package in advance for an offline install.

Install it on each node of the cluster in turn:

sudo bin/elasticsearch-plugin install file:///path/to/plugin.zip

$ ./elasticsearch-plugin install file:///home/es/repository-hdfs-7.2.0.zip 
-> Downloading file:///home/es/repository-hdfs-7.2.0.zip
[=================================================] 100%   
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@     WARNING: plugin requires additional permissions     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
* java.lang.RuntimePermission accessClassInPackage.sun.security.krb5
* java.lang.RuntimePermission accessDeclaredMembers
* java.lang.RuntimePermission getClassLoader
* java.lang.RuntimePermission loadLibrary.jaas
* java.lang.RuntimePermission loadLibrary.jaas_nt
* java.lang.RuntimePermission loadLibrary.jaas_unix
* java.lang.RuntimePermission setContextClassLoader
* java.lang.RuntimePermission shutdownHooks
* java.lang.reflect.ReflectPermission suppressAccessChecks
* java.net.SocketPermission * connect,resolve
* java.net.SocketPermission localhost:0 listen,resolve
* java.security.SecurityPermission insertProvider.SaslPlainServer
* java.security.SecurityPermission putProviderProperty.SaslPlainServer
* java.util.PropertyPermission * read,write
* javax.security.auth.AuthPermission doAs
* javax.security.auth.AuthPermission getSubject
* javax.security.auth.AuthPermission modifyPrincipals
* javax.security.auth.AuthPermission modifyPrivateCredentials
* javax.security.auth.AuthPermission modifyPublicCredentials
* javax.security.auth.PrivateCredentialPermission javax.security.auth.kerberos.KerberosTicket * "*" read
* javax.security.auth.PrivateCredentialPermission javax.security.auth.kerberos.KeyTab * "*" read
* javax.security.auth.PrivateCredentialPermission org.apache.hadoop.security.Credentials * "*" read
* javax.security.auth.kerberos.ServicePermission * initiate
See http://docs.oracle.com/javase/8/docs/technotes/guides/security/permissions.html
for descriptions of what these permissions allow and the associated risks.

Continue with installation? [y/N]y
-> Installed repository-hdfs
$ 
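
To confirm that a node has the plugin, list the installed plugins; repository-hdfs should appear in the output:

$ ./elasticsearch-plugin list
repository-hdfs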

Create the repository

  • Create
PUT _snapshot/my_hdfs_repository
{
  "type": "hdfs",	--類型
  "settings": {
    "uri": "hdfs://172.16.176.103:9000/",	--hdfs訪問url
    "path": "/data",
    "conf.dfs.client.read.shortcircuit": "false"
  }
}
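
Elasticsearch verifies the repository on all data nodes when it is registered; the check can be re-run at any time with the verify API:

POST _snapshot/my_hdfs_repository/_verify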
  • View
GET /_snapshot
{
  "my_hdfs_repository" : {
    "type" : "hdfs",
    "settings" : {
      "path" : "/data",
      "uri" : "hdfs://172.16.176.103:9000/",
      "conf" : {
        "dfs" : {
          "client" : {
            "read" : {
              "shortcircuit" : "false"
            }
          }
        }
      }
    }
  }
}

Create a snapshot

  • Create the snapshot

The request returns immediately rather than waiting for the snapshot to complete (a blocking variant is shown after the snippet).

PUT _snapshot/my_hdfs_repository/snapshot_i_xfjbblxt_cxfw_xfj_d12
{
    "indices": "i_xfjbblxt_cxfw_xfj_d12"
}
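
To block until the snapshot finishes instead, add the wait_for_completion query parameter (same snapshot name as above):

PUT _snapshot/my_hdfs_repository/snapshot_i_xfjbblxt_cxfw_xfj_d12?wait_for_completion=true
{
    "indices": "i_xfjbblxt_cxfw_xfj_d12"
}
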
  • Check the snapshot's current state
GET _snapshot/my_hdfs_repository/snapshot_i_xfjbblxt_cxfw_xfj_d12
{
  "snapshots" : [
    {
      "snapshot" : "snapshot_i_xfjbblxt_cxfw_xfj_d12",
      "uuid" : "-BS9XjxvS1Sp6wW_bT02lA",
      "version_id" : 7020099,
      "version" : "7.2.0",
      "indices" : [
        "i_xfjbblxt_cxfw_xfj_d12"
      ],
      "include_global_state" : true,
      "state" : "IN_PROGRESS",	--正在做快照中
      "start_time" : "2020-10-12T14:04:49.425Z",	--開始時間
      "start_time_in_millis" : 1602511489425,
      "end_time" : "1970-01-01T00:00:00.000Z",
      "end_time_in_millis" : 0,
      "duration_in_millis" : -1602511489425,
      "failures" : [ ],
      "shards" : {
        "total" : 0,
        "failed" : 0,
        "successful" : 0
      }
    }
  ]
}
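While the state is IN_PROGRESS, the snapshot status API gives per-shard progress:

GET _snapshot/my_hdfs_repository/snapshot_i_xfjbblxt_cxfw_xfj_d12/_status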

  • Completed state
{
  "snapshots" : [
    {
      "snapshot" : "snapshot_i_xfjbblxt_cxfw_xfj_d12",	--快照名稱
      "uuid" : "-BS9XjxvS1Sp6wW_bT02lA",
      "version_id" : 7020099,
      "version" : "7.2.0",
      "indices" : [
        "i_xfjbblxt_cxfw_xfj_d12"	--索引
      ],
      "include_global_state" : true,
      "state" : "SUCCESS",	--快照成功
      "start_time" : "2020-10-12T14:04:49.425Z",	--開始時間
      "start_time_in_millis" : 1602511489425,	--開始時間戳
      "end_time" : "2020-10-12T14:24:33.942Z",	--結束時間
      "end_time_in_millis" : 1602512673942,	--結束時間戳
      "duration_in_millis" : 1184517,	--耗時(毫秒)
      "failures" : [ ],
      "shards" : {
        "total" : 5,	--總分片
        "failed" : 0,
        "successful" : 5	--成功分片
      }
    }
  ]
}

Restore a snapshot

If a snapshot is restored into the original index, that index must first be closed or deleted; alternatively, restore under a new name as shown below.

  • Restore the snapshot
POST _snapshot/my_hdfs_repository/snapshot_i_xfjbblxt_cxfw_xfj_d12/_restore
{
  "indices": "i_xfjbblxt_cxfw_xfj_d12",	-- index name inside the snapshot
  "rename_pattern": "i_xfjbblxt_cxfw_xfj_d12",	-- pattern matched against index names in the snapshot
  "rename_replacement": "restored_i_xfjbblxt_cxfw_xfj_d12"	-- replacement, i.e. the restored index name
}
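
To restore over the original index instead of renaming, close it first and drop the rename parameters; a minimal sketch:

POST i_xfjbblxt_cxfw_xfj_d12/_close

POST _snapshot/my_hdfs_repository/snapshot_i_xfjbblxt_cxfw_xfj_d12/_restore
{
  "indices": "i_xfjbblxt_cxfw_xfj_d12"
}
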
  • Check restore status (the output below matches the index recovery API, e.g. GET restored_i_xfjbblxt_cxfw_xfj_d12/_recovery)
{
  "restored_i_xfjbblxt_cxfw_xfj_d12" : {
    "shards" : [
      {
        "id" : 4,
        "type" : "SNAPSHOT",
        "stage" : "INDEX",
        "primary" : true,
        "start_time_in_millis" : 1602571287856,
        "total_time_in_millis" : 1249147,
        "source" : {
          "repository" : "my_hdfs_repository",
          "snapshot" : "snapshot_i_xfjbblxt_cxfw_xfj_d12",
          "version" : "7.2.0",
          "index" : "i_xfjbblxt_cxfw_xfj_d12",
          "restoreUUID" : "KM1EaKsAQkO4OxB0PwKe0Q"
        },
        "target" : {
          "id" : "DWvUrfqQRxGLIWm6SQmunA",
          "host" : "172.16.176.104",
          "transport_address" : "172.16.176.104:9300",
          "ip" : "172.16.176.104",
          "name" : "node-104"
        },
        "index" : {
          "size" : {
            "total_in_bytes" : 8312825377,
            "reused_in_bytes" : 0,
            "recovered_in_bytes" : 6781859331,
            "percent" : "81.6%"
          },
          "files" : {
            "total" : 104,
            "reused" : 0,
            "recovered" : 86,
            "percent" : "82.7%"
          },
          "total_time_in_millis" : 1249039,
          "source_throttle_time_in_millis" : 0,
          "target_throttle_time_in_millis" : 0
        },
        "translog" : {
          "recovered" : 0,
          "total" : 0,
          "percent" : "100.0%",
          "total_on_start" : 0,
          "total_time_in_millis" : 0
        },
        "verify_index" : {
          "check_index_time_in_millis" : 0,
          "total_time_in_millis" : 0
        }
      },
      -- remainder omitted

Backup and restore timings

Snapshot details for this case

First snapshot

Nodes   Primary shards   Replicas   Doc count   Data size   Snapshot size   Snapshot time
3       5                1          5149535     77.4gb      40gb            19.74195 minutes

Restore details for this case

Shards are restored in parallel.

Shard         Restore time     Bytes restored
0 (primary)   27.42 minutes    7.75G
1 (primary)   27.14 minutes    7.72G
2 (primary)   27.45 minutes    7.75G
3 (primary)   25.89 minutes    7.74G
4 (primary)   25.5 minutes     7.74G
0 (replica)   18.65 minutes    7.75G
1 (replica)   10.3 minutes     7.72G
2 (replica)   17.21 minutes    7.75G
3 (replica)   10.6 minutes     7.74G
4 (replica)   18.32 minutes    7.74G

Common problems

Starting HDFS

Problem 1

$ start-dfs.sh 
WARNING: HADOOP_SECURE_DN_USER has been replaced by HDFS_DATANODE_SECURE_USER. Using value of HADOOP_SECURE_DN_USER.
Starting namenodes on [host103]
Last login: Sun Oct 11 22:32:11 CST 2020 from 172.16.176.46 on pts/1
host103: ERROR: JAVA_HOME is not set and could not be found.
Starting datanodes
Last login: Sun Oct 11 22:32:23 CST 2020 on pts/1
localhost: ERROR: JAVA_HOME is not set and could not be found.
Starting secondary namenodes [host103]
Last login: Sun Oct 11 22:32:24 CST 2020 on pts/1
host103: ERROR: JAVA_HOME is not set and could not be found.
  • Solution

Configure the Java environment variables. Because the start scripts connect to each host over SSH with a non-interactive shell, exports in a login profile are not always picked up; setting JAVA_HOME in hadoop-env.sh is the reliable fix (a one-liner follows the exports below).

export JAVA_HOME=/home/hadoop/jdk1.8.0_161
export CLASSPATH=$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$PATH
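
To pin JAVA_HOME for the Hadoop scripts themselves, assuming the paths used throughout this article:

$ echo 'export JAVA_HOME=/home/hadoop/jdk1.8.0_161' >> /home/hadoop/hadoop-3.3.0/etc/hadoop/hadoop-env.sh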

Problem 2

$ start-dfs.sh 
WARNING: HADOOP_SECURE_DN_USER has been replaced by HDFS_DATANODE_SECURE_USER. Using value of HADOOP_SECURE_DN_USER.
Starting namenodes on [host103]
host103: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
Starting datanodes
localhost: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
Starting secondary namenodes [host103]
host103: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
  • Solution

Run the following as the hadoop user to set up passwordless SSH to the local host:

[hadoop@host103 ~]$ ssh-copy-id hadoop@host103
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/hadoop/.ssh/id_rsa.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
hadoop@host103's password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'hadoop@host103'"
and check to make sure that only the key(s) you wanted were added.
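
If the hadoop user does not yet have a key pair, generate one before running ssh-copy-id (empty passphrase, default location):

$ ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa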

Creating the repository

Problem 1

  • Create
PUT _snapshot/my_hdfs_repository
{
  "type": "hdfs",
  "settings": {
    "uri": "hdfs://172.16.176.103:9000/",
    "path": "/",
    "conf.dfs.client.read.shortcircuit": "false"
  }
}
  • Error
error": {
    "root_cause": [
      {
        "type": "repository_exception",
        "reason": "[my_hdfs_repository] cannot create blob store"
      }
    ],
    "type": "repository_exception",
    "reason": "[my_hdfs_repository] cannot create blob store",
    "caused_by": {
      "type": "unchecked_i_o_exception",
      "reason": "Cannot create HDFS repository for uri [hdfs://172.16.176.103:9000/]",
      "caused_by": {
        "type": "access_control_exception",
        "reason": "Permission denied: user=es, access=WRITE, inode=\"/\":hadoop:supergroup:drwxr-xr-x\n\tat org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:496)\n\tat org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:336)\n\tat org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermissionWithContext(FSPermissionChecker.java:360)\n\tat org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:239)\n\tat org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1909)\n\tat org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1893)\n\tat org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1852)\n\tat org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:60)\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3407)\n\tat org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1161)\n\tat org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:739)\n\tat org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)\n\tat org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:532)\n\tat org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)\n\tat org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1020)\n\tat org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:948)\n\tat java.security.AccessController.doPrivileged(Native Method)\n\tat javax.security.auth.Subject.doAs(Subject.java:422)\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1845)\n\tat org.apache.hadoop.ipc.Server$Handler.run(Server.java:2952)\n",
  • Solution

Add the following to hdfs-site.xml, then restart HDFS:

<property>
  <name>dfs.permissions</name>
  <value>false</value>
</property>
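
An alternative that keeps HDFS permission checking enabled is to pre-create the repository path and hand ownership to the user Elasticsearch runs as (es in this setup):

$ hdfs dfs -mkdir -p /data
$ hdfs dfs -chown -R es /data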

References

  • repository-hdfs plugin

https://www.elastic.co/guide/en/elasticsearch/plugins/7.2/repository-hdfs.html

  • Hadoop single-node cluster setup

https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html

