1. Cluster Environment
Hadoop HA cluster layout:
hadoop1   cluster1   NameNode (Active)                            HMaster
hadoop2   cluster1   NameNode (Standby)   ZooKeeper   ResourceManager             HMaster
hadoop3   cluster2   NameNode (Active)    ZooKeeper
hadoop4   cluster2   NameNode (Standby)   ZooKeeper   ResourceManager (Standby)
hadoop5   DataNode   NodeManager   HRegionServer
hadoop6   DataNode   NodeManager   HRegionServer
hadoop7   DataNode   NodeManager   HRegionServer
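All of the hostnames above must resolve on every node. A minimal /etc/hosts sketch; the 192.168.1.x addresses are placeholders, not values from this cluster, so substitute your own:

# /etc/hosts on every node (placeholder IPs)
192.168.1.101   hadoop1
192.168.1.102   hadoop2
192.168.1.103   hadoop3
192.168.1.104   hadoop4
192.168.1.105   hadoop5
192.168.1.106   hadoop6
192.168.1.107   hadoop7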
2. Installation
Version: hbase-0.99.2-bin.tar
Unpack the archive and configure the environment variables:
export HBASE_HOME=/home/hadoop/apps/hbase
export PATH=$PATH:$HBASE_HOME/bin
Configuration files: place core-site.xml and hdfs-site.xml in the conf directory; both must stay identical to the Hadoop cluster's own copies. A combined installation sketch follows.
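Putting the steps above together, a minimal sketch; the extracted directory name and the Hadoop configuration path (/home/hadoop/apps/hadoop/etc/hadoop) are assumptions and may differ on your nodes:

# unpack the release and move it to the path used by HBASE_HOME
tar -xf hbase-0.99.2-bin.tar -C /home/hadoop/apps/
mv /home/hadoop/apps/hbase-0.99.2 /home/hadoop/apps/hbase   # extracted dir name assumed

# make the environment variables permanent for the hadoop user
echo 'export HBASE_HOME=/home/hadoop/apps/hbase' >> ~/.bashrc
echo 'export PATH=$PATH:$HBASE_HOME/bin' >> ~/.bashrc
source ~/.bashrc

# reuse the Hadoop cluster's HDFS client configuration (Hadoop conf path assumed)
cp /home/hadoop/apps/hadoop/etc/hadoop/core-site.xml $HBASE_HOME/conf/
cp /home/hadoop/apps/hadoop/etc/hadoop/hdfs-site.xml $HBASE_HOME/conf/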
3. Configuration Files in Detail
core-site.xml (kept identical to the current Hadoop cluster's configuration):

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://cluster1</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/hadoop/tmp</value>
    </property>
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>hadoop2:2181,hadoop3:2181,hadoop4:2181</value>
    </property>
</configuration>
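On any Hadoop node, the default filesystem can be cross-checked against this file; the command reads that node's own Hadoop configuration and should print hdfs://cluster1:

hdfs getconf -confKey fs.defaultFS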
hdfs-site.xml (kept identical to the current Hadoop cluster's configuration):

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <!-- Number of block replicas kept by the DataNodes. The default is 3; the value must not exceed the number of DataNodes in the cluster. -->
    <property>
        <name>dfs.nameservices</name>
        <value>cluster1,cluster2</value>
    </property>
    <!-- With federation, two HDFS clusters are used. The two nameservices are simply aliases for those clusters; any names will do as long as they are unique. -->

    <!-- Configuration for cluster1 -->
    <property>
        <name>dfs.ha.namenodes.cluster1</name>
        <value>nn1,nn2</value>
    </property>
    <!-- The NameNodes belonging to nameservice cluster1. These are logical names; any unique names will do. -->
    <property>
        <name>dfs.namenode.rpc-address.cluster1.nn1</name>
        <value>hadoop1:9000</value>
    </property>
    <!-- RPC address of hadoop1 -->
    <property>
        <name>dfs.namenode.http-address.cluster1.nn1</name>
        <value>hadoop1:50070</value>
    </property>
    <!-- HTTP address of hadoop1 -->
    <property>
        <name>dfs.namenode.rpc-address.cluster1.nn2</name>
        <value>hadoop2:9000</value>
    </property>
    <!-- RPC address of hadoop2 -->
    <property>
        <name>dfs.namenode.http-address.cluster1.nn2</name>
        <value>hadoop2:50070</value>
    </property>
    <!-- HTTP address of hadoop2 -->
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://hadoop2:8485;hadoop3:8485;hadoop4:8485/cluster1</value>
    </property>
    <!-- JournalNode quorum used by cluster1's two NameNodes to share the edits directory; configured on cluster1 -->
    <property>
        <name>dfs.ha.automatic-failover.enabled.cluster1</name>
        <value>true</value>
    </property>
    <!-- Whether automatic failover is enabled for cluster1, i.e. whether to switch to the other NameNode automatically when one fails -->
    <property>
        <name>dfs.client.failover.proxy.provider.cluster1</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <!-- Class responsible for performing failover when cluster1 fails -->

    <!-- Configuration for cluster2 -->
    <property>
        <name>dfs.ha.namenodes.cluster2</name>
        <value>nn3,nn4</value>
    </property>
    <!-- The NameNodes belonging to nameservice cluster2; again logical, unique names -->
    <property>
        <name>dfs.namenode.rpc-address.cluster2.nn3</name>
        <value>hadoop3:9000</value>
    </property>
    <!-- RPC address of hadoop3 -->
    <property>
        <name>dfs.namenode.http-address.cluster2.nn3</name>
        <value>hadoop3:50070</value>
    </property>
    <!-- HTTP address of hadoop3 -->
    <property>
        <name>dfs.namenode.rpc-address.cluster2.nn4</name>
        <value>hadoop4:9000</value>
    </property>
    <!-- RPC address of hadoop4 -->
    <property>
        <name>dfs.namenode.http-address.cluster2.nn4</name>
        <value>hadoop4:50070</value>
    </property>
    <!-- HTTP address of hadoop4 -->
    <!--
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://hadoop2:8485;hadoop3:8485;hadoop4:8485/cluster2</value>
    </property>
    -->
    <!-- JournalNode quorum used by cluster2's two NameNodes to share the edits directory; this is configured on cluster2, hence commented out here -->
    <property>
        <name>dfs.ha.automatic-failover.enabled.cluster2</name>
        <value>true</value>
    </property>
    <!-- Whether automatic failover is enabled for cluster2, i.e. whether to switch to the other NameNode automatically when one fails -->
    <property>
        <name>dfs.client.failover.proxy.provider.cluster2</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <!-- Class responsible for performing failover when cluster2 fails -->
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/home/hadoop/hadoop/jndir</value>
    </property>
    <!-- Local disk path where the JournalNodes store the shared NameNode edits -->
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/home/hadoop/hadoop/datadir</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/home/hadoop/hadoop/namedir</value>
    </property>
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
    </property>
    <!-- When a NameNode failover is needed, fence the old active NameNode via SSH -->
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/hadoop/.ssh/id_rsa</value>
    </property>
    <!-- SSH private key used by the sshfence method -->
</configuration>
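Because hbase.rootdir (below) points at the cluster1 nameservice, it is worth confirming that HA resolution works before starting HBase; nn1 and nn2 are the logical names defined above:

# one of cluster1's NameNodes should report "active", the other "standby"
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2

# the nameservice alias itself should be usable as a filesystem URI
hdfs dfs -ls hdfs://cluster1/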
hbase-env.sh
#
#/**
# * Copyright 2007 The Apache Software Foundation
# *
# * Licensed to the Apache Software Foundation (ASF) under one
# * or more contributor license agreements. See the NOTICE file
# * distributed with this work for additional information
# * regarding copyright ownership. The ASF licenses this file
# * to you under the Apache License, Version 2.0 (the
# * "License"); you may not use this file except in compliance
# * with the License. You may obtain a copy of the License at
# *
# * http://www.apache.org/licenses/LICENSE-2.0
# *
# * Unless required by applicable law or agreed to in writing, software
# * distributed under the License is distributed on an "AS IS" BASIS,
# * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# * See the License for the specific language governing permissions and
# * limitations under the License.
# */
# Set environment variables here.
# This script sets variables multiple times over the course of starting an hbase process,
# so try to keep things idempotent unless you want to take an even deeper look
# into the startup scripts (bin/hbase, etc.)
# The java implementation to use. Java 1.6 required.
export JAVA_HOME=/usr/local/jdk1.8.0_151
# Extra Java CLASSPATH elements. Optional. The commented-out line below is wrong as-is; it can be changed to the form that follows it.
#export HBASE_CLASSPATH=/home/hadoop/hbase/conf
export JAVA_CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
# The maximum amount of heap to use, in MB. Default is 1000.
# export HBASE_HEAPSIZE=1000
# Extra Java runtime options.
# Below are what we set by default. May only work with SUN JVM.
# For more on why as well as other possible settings,
# see http://wiki.apache.org/hadoop/PerformanceTuning
export HBASE_OPTS="-XX:+UseConcMarkSweepGC"
# Uncomment below to enable java garbage collection logging for the server-side processes
# this enables basic gc logging for the server processes to the .out file
# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps $HBASE_GC_OPTS"
# this enables gc logging using automatic GC log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+. Either use this set of options or the one above
# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M $HBASE_GC_OPTS"
# Uncomment below to enable java garbage collection logging for the client processes in the .out file.
# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps $HBASE_GC_OPTS"
# Uncomment below (along with above GC logging) to put GC information in its own logfile (will set HBASE_GC_OPTS).
# This applies to both the server and client GC options above
# export HBASE_USE_GC_LOGFILE=true
# Uncomment below if you intend to use the EXPERIMENTAL off heap cache.
# export HBASE_OPTS="$HBASE_OPTS -XX:MaxDirectMemorySize="
# Set hbase.offheapcache.percentage in hbase-site.xml to a nonzero value.
# Uncomment and adjust to enable JMX exporting
# See jmxremote.password and jmxremote.access in $JRE_HOME/lib/management to configure remote password access.
# More details at: http://java.sun.com/javase/6/docs/technotes/guides/management/agent.html
#
# export HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false"
# export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10101"
# export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10102"
# export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10103"
# export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10104"
# File naming hosts on which HRegionServers will run. $HBASE_HOME/conf/regionservers by default.
# export HBASE_REGIONSERVERS=${HBASE_HOME}/conf/regionservers
# File naming hosts on which backup HMaster will run. $HBASE_HOME/conf/backup-masters by default.
# export HBASE_BACKUP_MASTERS=${HBASE_HOME}/conf/backup-masters
# Extra ssh options. Empty by default.
# export HBASE_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HBASE_CONF_DIR"
# Where log files are stored. $HBASE_HOME/logs by default.
# export HBASE_LOG_DIR=${HBASE_HOME}/logs
# Enable remote JDWP debugging of major HBase processes. Meant for Core Developers
# export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8070"
# export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8071"
# export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8072"
# export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8073"
# A string representing this instance of hbase. $USER by default.
# export HBASE_IDENT_STRING=$USER
# The scheduling priority for daemon processes. See 'man nice'.
# export HBASE_NICENESS=10
# The directory where pid files are stored. /tmp by default.
# export HBASE_PID_DIR=/var/hadoop/pids
# Seconds to sleep between slave commands. Unset by default. This
# can be useful in large clusters, where, e.g., slave rsyncs can
# otherwise arrive faster than the master can service them.
# export HBASE_SLAVE_SLEEP=0.1
# Tell HBase whether it should manage its own instance of ZooKeeper or not.
export HBASE_MANAGES_ZK=false
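With HBASE_MANAGES_ZK=false, HBase expects the ZooKeeper quorum (hadoop2, hadoop3, hadoop4) to be started and managed separately. A quick sanity check on each ZooKeeper node, assuming the standard zkServer.sh script is on the PATH:

# run on hadoop2, hadoop3 and hadoop4; one node should report "leader", the others "follower"
zkServer.sh status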
hbase-site.xml
<?xml version="1.0"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> <!-- /** * Copyright 2010 The Apache Software Foundation * * Licensed to the Apache Software Foundation (ASF) under one * or more contributor license agreements. See the NOTICE file * distributed with this work for additional information * regarding copyright ownership. The ASF licenses this file * to you under the Apache License, Version 2.0 (the * "License"); you may not use this file except in compliance * with the License. You may obtain a copy of the License at * * http://www.apache.org/licenses/LICENSE-2.0 * * Unless required by applicable law or agreed to in writing, software * distributed under the License is distributed on an "AS IS" BASIS, * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. * See the License for the specific language governing permissions and * limitations under the License. */ --> <configuration> <!--指定master--> <property> <name>hbase.master</name> <value>hadoop1:60000</value> </property> <property> <name>hbase.master.maxclockskew</name> <value>180000</value> </property> <!--指定hbase的路徑,地址根據hdfs-site.xml的配置而定,當前是hadoop集群1(cluster1)的路徑--> <property> <name>hbase.rootdir</name> <value>hdfs://cluster1/hbase</value> </property> <property> <name>hbase.cluster.distributed</name> <value>true</value> </property> <!--zoojeeper集群--> <property> <name>hbase.zookeeper.quorum</name> <value>hadoop2,hadoop3,hadoop4</value> </property> <!--zookeeper的data目錄--> <property> <name>hbase.zookeeper.property.dataDir</name> <value>/home/hadoop/zookeeper/data</value> </property> </configuration>
The regionservers file lists the machines on which HRegionServer runs:
hadoop5
hadoop6
hadoop7
After HBase has been configured on one machine, use scp to copy it to the other machines (hadoop1, hadoop2, hadoop5, hadoop6, hadoop7), as sketched below.
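A minimal distribution sketch, assuming passwordless SSH for the hadoop user and the same /home/hadoop/apps layout on every node:

# push the configured HBase directory to the remaining nodes
for host in hadoop1 hadoop2 hadoop5 hadoop6 hadoop7; do
    scp -r /home/hadoop/apps/hbase hadoop@${host}:/home/hadoop/apps/
done
# remember to add the HBASE_HOME/PATH exports to ~/.bashrc on each node as well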
4. Starting the HBase Cluster (the Hadoop cluster must already be running)
4.1 On hadoop1, run:
start-hbase.sh
This starts HMaster on hadoop1 and an HRegionServer on each of hadoop5, hadoop6, and hadoop7.
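A quick check that the daemons are up, using the JDK's jps tool on the respective nodes:

# on hadoop1: the output should include HMaster
jps | grep HMaster
# on hadoop5, hadoop6 and hadoop7: the output should include HRegionServer
jps | grep HRegionServer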
The web UI can then be reached at http://hadoop1:16030.
4.2 Starting a backup master
On hadoop2, run:
local-master-backup.sh start 1
The backup master's web UI can be reached at http://hadoop2:16031.
local-master-backup.sh start n: here n indicates which backup master instance to start, and the corresponding web UI port becomes 1603n.
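Once the masters and region servers are up, a short smoke test from the HBase shell confirms that tables can be created and served; the table and column-family names below are arbitrary examples:

hbase shell
# inside the shell:
create 'smoke_test', 'cf'
put 'smoke_test', 'row1', 'cf:msg', 'hello'
scan 'smoke_test'
disable 'smoke_test'
drop 'smoke_test'
exit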