Today I have been reading up on HBase and Hadoop. According to the Apache documentation, HBase 1.1.x is the stable line, Hadoop 2.5.x is among the Hadoop releases that support this HBase version, and JDK 7 is the supported JDK. -- The following is summarized from the official Apache documentation:
1. JDK / HBase support matrix from the official docs (the legend used there):

- "S" = supported
- "X" = not supported
- "NT" = Not tested

The official documentation strongly recommends installing Hadoop 2.x:
"Hadoop 2.x is recommended. Hadoop 2.x is faster and includes features, such as short-circuit reads, which will help improve your HBase random read profile. Hadoop 2.x also includes important bug fixes that will improve your overall HBase experience. HBase 0.98 drops support for Hadoop 1.0, deprecates use of Hadoop 1.1+, and HBase 1.0 will not support Hadoop 1.x."
I wanted to build this environment myself, but I could not find a machine, so for now I am pasting an article that walks through the installation and configuration; I will set up my own cluster when I get the chance.
The detailed installation and configuration guide below is reproduced from: http://blog.csdn.net/yuansen1999/article/details/50542018
=================================== Full text below:
Copyright notice: this is the blogger's original work and may not be reproduced without permission.
[Overview]
Since the 1.0 release, HBase has been ready for production use in the enterprise. The 1.x line followed, and 1.1.2 is one of its stable releases.
HBase depends on the Hadoop libraries, and HBase 1.1.2 ships with the Hadoop 2.5.1 libraries, so Hadoop 2.5.1 is used as the base environment here. If you use a different Hadoop version, you also need to replace the jar files under HBase's lib directory with the ones from that Hadoop version; otherwise you will get "native library not found" errors. The actual installation steps follow, and a sketch of the jar swap is shown right below.
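A minimal sketch of what that jar swap could look like (the paths and the target Hadoop version here are illustrative assumptions, not taken from the article):

# Assumption: HBase unpacked at /hadoop/hbase-1.1.2 and the deployed Hadoop at /hadoop/hadoop-2.x.y
cd /hadoop/hbase-1.1.2/lib
# Remove the Hadoop client jars bundled with HBase (built against 2.5.1)
rm -f hadoop-*.jar
# Copy in the jars matching the Hadoop version that is actually deployed
find /hadoop/hadoop-2.x.y/share/hadoop -name 'hadoop-*.jar' ! -name '*test*' ! -name '*sources*' -exec cp {} . \;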
1. Software versions

| Component | Version | Notes |
| --- | --- | --- |
| OS | CentOS release 6.4 (Final) | 64-bit |
| JDK | jdk-7u80-linux-x64.gz | |
| Hadoop | hadoop-2.5.1.tar.gz | |
| ZooKeeper | zookeeper-3.4.6.tar.gz | |
| HBase | hbase-1.1.2.tar.gz | |
2. Host plan

| IP | Host | Deployed processes |
| --- | --- | --- |
| 192.168.8.127 | master | NameNode, SecondaryNameNode, DataNode, ResourceManager, NodeManager, QuorumPeerMain, HMaster, HRegionServer |
| 192.168.8.128 | slave01 | DataNode, NodeManager, QuorumPeerMain, HRegionServer |
| 192.168.8.129 | slave02 | DataNode, NodeManager, QuorumPeerMain, HRegionServer |
3. Directory plan

| IP | Mount points |
| --- | --- |
| 192.168.8.127 | Three mount points. Root: /dev/sda1 on /; swap: tmpfs on /dev/shm; Hadoop: /dev/sda3 on /hadoop |
| 192.168.8.128 | Three mount points. Root: /dev/sda1 on /; swap: tmpfs on /dev/shm; Hadoop: /dev/sda3 on /hadoop |
| 192.168.8.129 | Three mount points. Root: /dev/sda1 on /; swap: tmpfs on /dev/shm; Hadoop: /dev/sda3 on /hadoop |

Check with:
[root@master ~]# df -h
4. Create a hadoop user in the hadoop group on every host
4.1. Create the hadoop group:
[root@localhost ~]# groupadd hadoop
4.2. Create the hadoop user and add it to the hadoop group:
[root@localhost ~]# useradd hadoop -g hadoop
4.3. Set the hadoop user's password (to hadoop):
[root@localhost ~]# passwd hadoop
5. Configure the hostnames and the hosts file
[root@localhost ~]# vi /etc/hosts
127.0.0.1 localhost
192.168.8.127 master
192.168.8.128 slave01
192.168.8.129 slave02
[root@localhost ~]# vi /etc/sysconfig/network
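On CentOS 6 the hostname itself comes from the HOSTNAME entry in this file; a sketch for the master node (slave01 and slave02 get their own names):

NETWORKING=yes
HOSTNAME=master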
Reboot for the change to take effect:
[root@localhost ~]# reboot
Check the hostname:
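For example (a sketch; the output should match the name configured above):

[hadoop@master ~]$ hostname
master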
Change the owner of the /hadoop directory to the hadoop user:
[root@master ~]# chown hadoop:hadoop -R /hadoop
[root@master ~]# ls -l /
6. Upload the installation packages to the hadoop user's home directory
7. Install the JDK
7.1. Unpack the JDK:
[root@master ~]# cd /usr/local/
[hadoop@master local]$ tar -zxvf jdk-7u80-linux-x64.gz
7.2. Configure the JDK environment variables (append to ~/.bashrc):
export JAVA_HOME=/usr/local/jdk1.7.0_80
export JRE_HOME=/usr/local/jdk1.7.0_80/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
7.3. Apply the environment variables:
[hadoop@master ~]$ source .bashrc
7.4. Verify that the JDK was installed successfully:
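A quick check looks like this (illustrative output for jdk-7u80; the exact build string may differ):

[hadoop@master ~]$ java -version
java version "1.7.0_80"
Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)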
8. Configure passwordless SSH between the nodes:
8.1. Create the .ssh directory:
[hadoop@master ~]$ mkdir .ssh
8.2. Go into the .ssh directory:
[hadoop@master ~]$ cd .ssh/
8.3. Generate the key pair:
[hadoop@master .ssh]$ ssh-keygen -t rsa
Note: just press Enter at every prompt.
8.4. Append the generated public key to the authorized_keys file:
[hadoop@master .ssh]$ cat id_rsa.pub >> authorized_keys
8.5. Set the .ssh directory permission to 700:
[hadoop@master .ssh]$ chmod 700 ~/.ssh
On some machines this is required; on others it is optional.
8.6. Set the authorized_keys file permission to 600:
[hadoop@master .ssh]$ chmod 600 authorized_keys
It must be 600, otherwise passwordless login will not work.
8.7. Test passwordless SSH login:
[hadoop@master hadoop]$ ssh master
Last login: Tue Jan 19 13:58:27 2016 from 192.168.8.1
8.8. Generate keys on the other nodes in the same way.
8.9. Append the master node's public key on the other nodes (master → slave01 shown as the example)
8.9.1. Copy master's public key id_rsa.pub to slave01's .ssh directory, renaming it master.pub:
[hadoop@master .ssh]$ scp id_rsa.pub slave01:/home/hadoop/.ssh/master.pub
Be careful not to overwrite the target node's own id_rsa.pub in this step.
8.9.2. On slave01, append master.pub to its authorized_keys file:
[hadoop@slave01 .ssh]$ cat master.pub >> authorized_keys
8.9.3. Repeat the same steps for slave02.
Note: the very first login still asks for a password.
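With the keys distributed, a quick round trip from master confirms passwordless login to each node (a sketch; the first connection may still ask to accept the host key):

[hadoop@master ~]$ ssh slave01 hostname
slave01
[hadoop@master ~]$ ssh slave02 hostname
slave02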
9. Install Hadoop:
9.1. Unpack the archive:
[hadoop@master ~]$ cd /hadoop
[hadoop@master hadoop]$ tar -zxvf hadoop-2.5.1.tar.gz
9.2. Configure the Hadoop environment variables:
[hadoop@master ~]$ vi .bashrc
export HADOOP_HOME=/hadoop/hadoop-2.5.1
export HADOOP_CONF_DIR=/hadoop/hadoop-2.5.1/etc/hadoop
export PATH=.:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
9.3. Apply the environment variables:
[hadoop@master ~]$ source .bashrc
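To confirm the new PATH works, the hadoop command should now resolve (first line of the version output shown; illustrative):

[hadoop@master ~]$ hadoop version
Hadoop 2.5.1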
9.4. Go to the Hadoop configuration directory and edit the files listed in the table below.
Note: first copy the fairscheduler.xml file from the attachment into /hadoop/hadoop-2.5.1/etc/hadoop.
[hadoop@master hadoop]$ pwd
/hadoop/hadoop-2.5.1/etc/hadoop
| File | Configuration |
| --- | --- |
| core-site.xml | <configuration> <property> <name>fs.defaultFS</name> <value>hdfs://master:8020</value> </property> <property> <name>hadoop.tmp.dir</name> <value>/home/hadoop/tmp</value> </property> <property> <name>hadoop.proxyuser.root.groups</name> <value>*</value> </property> <property> <name>hadoop.proxyuser.root.hosts</name> <value>*</value> </property> <property> <name>hadoop.proxyuser.yarn.hosts</name> <value>*</value> </property> <property> <name>hadoop.proxyuser.yarn.groups</name> <value>*</value> </property> </configuration> |
| hdfs-site.xml | <configuration> <property> <name>dfs.replication</name> <value>2</value> </property> <property> <name>dfs.namenode.name.dir</name> <value>file:/home/hadoop/hadoop/dfs/name</value> </property> <property> <name>dfs.datanode.data.dir</name> <value>file:/home/hadoop/hadoop/dfs/data</value> </property> </configuration> |
| mapred-site.xml | <configuration> <property> <name>mapreduce.framework.name</name> <value>yarn</value> </property> <property> <name>mapreduce.jobhistory.address</name> <value>master:10020</value> </property> <property> <name>mapreduce.jobhistory.webapp.address</name> <value>master:19888</value> </property> <property> <name>mapred.child.java.opts</name> <value>-Xmx4096m</value> </property> </configuration> |
| yarn-site.xml | <configuration> <!-- Site specific YARN configuration properties --> <property> <description>The hostname of the RM.</description> <name>yarn.resourcemanager.hostname</name> <value>master</value> </property> <property> <description>The address of the applications manager interface in the RM.</description> <name>yarn.resourcemanager.address</name> <value>${yarn.resourcemanager.hostname}:8032</value> </property> <property> <description>The address of the scheduler interface.</description> <name>yarn.resourcemanager.scheduler.address</name> <value>${yarn.resourcemanager.hostname}:8030</value> </property> <property> <description>The http address of the RM web application.</description> <name>yarn.resourcemanager.webapp.address</name> <value>${yarn.resourcemanager.hostname}:8088</value> </property> <property> <description>The https address of the RM web application.</description> <name>yarn.resourcemanager.webapp.https.address</name> <value>${yarn.resourcemanager.hostname}:8090</value> </property> <property> <name>yarn.resourcemanager.resource-tracker.address</name> <value>${yarn.resourcemanager.hostname}:8031</value> </property> <property> <description>The address of the RM admin interface.</description> <name>yarn.resourcemanager.admin.address</name> <value>${yarn.resourcemanager.hostname}:8033</value> </property> <property> <description>The class to use as the resource scheduler.</description> <name>yarn.resourcemanager.scheduler.class</name> <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value> </property> <property> <description>fair-scheduler conf location</description> <name>yarn.scheduler.fair.allocation.file</name> <value>${yarn.home.dir}/etc/hadoop/fairscheduler.xml</value> </property> <property> <description>List of directories to store localized files in. An application's localized file directory will be found in: ${yarn.nodemanager.local-dirs}/usercache/${user}/appcache/application_${appid}. Individual containers' work directories, called container_${contid}, will be subdirectories of this. </description> <name>yarn.nodemanager.local-dirs</name> <value>/home/hadoop/hadoop/local</value> </property> <property> <description>Whether to enable log aggregation</description> <name>yarn.log-aggregation-enable</name> <value>true</value> </property> <property> <description>Where to aggregate logs to.</description> <name>yarn.nodemanager.remote-app-log-dir</name> <value>/tmp/logs</value> </property> <property> <description>Amount of physical memory, in MB, that can be allocated for containers.</description> <name>yarn.nodemanager.resource.memory-mb</name> <value>30720</value> </property> <property> <description>Number of CPU cores that can be allocated for containers.</description> <name>yarn.nodemanager.resource.cpu-vcores</name> <value>8</value> </property> <property> <description>the valid service name should only contain a-zA-Z0-9_ and can not start with numbers</description> <name>yarn.nodemanager.aux-services</name> <value>mapreduce_shuffle</value> </property> </configuration> |
| slaves | master slave01 slave02 (one host per line) |
| hadoop-env.sh | export JAVA_HOME=/hadoop/jdk1.7.0_80 (Note: append this as the last line, pointing at the actual JDK install path) |
9.5. Configure the environment on the slave nodes:
9.5.1. From master, copy hadoop-2.5.1, jdk1.7.0_80, and the .bashrc environment file to the other nodes (a sketch of the commands follows below).
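A sketch of those copy commands (assumptions: Hadoop lives under /hadoop as above; the article unpacks the JDK under /usr/local but points hadoop-env.sh at /hadoop/jdk1.7.0_80, so adjust the JDK source path to wherever it actually lives):

[hadoop@master ~]$ scp -r /hadoop/hadoop-2.5.1 slave01:/hadoop
[hadoop@master ~]$ scp -r /hadoop/hadoop-2.5.1 slave02:/hadoop
[hadoop@master ~]$ scp -r /hadoop/jdk1.7.0_80 slave01:/hadoop
[hadoop@master ~]$ scp -r /hadoop/jdk1.7.0_80 slave02:/hadoop
[hadoop@master ~]$ scp ~/.bashrc slave01:/home/hadoop
[hadoop@master ~]$ scp ~/.bashrc slave02:/home/hadoop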
9.5.2. On slave01 and slave02, apply the environment variables:
[hadoop@slave01 ~]$ source .bashrc
9.6. Format the NameNode:
[hadoop@master hadoop]$ hadoop namenode -format
9.7. Start Hadoop:
[hadoop@master hadoop]$ start-all.sh
[hadoop@master hadoop]$ mr-jobhistory-daemon.sh start historyserver
9.8. Check the running processes:
9.8.1. Processes on master:
[hadoop@master hadoop]$ jps
3456 Jps
2305 NameNode
3418 JobHistoryServer
2592 SecondaryNameNode
2844 NodeManager
2408 DataNode
2739 ResourceManager
9.8.2. Processes on slave01 and slave02:
[hadoop@slave01 ~]$ jps
2567 Jps
2249 DataNode
2317 NodeManager
[hadoop@slave02 ~]$ jps
2298 NodeManager
2560 Jps
2229 DataNode
9.9. Turn off the firewall on every node:
[root@master ~]# iptables -F
[root@master ~]# service iptables save
[root@master ~]# service iptables stop
[root@master ~]# chkconfig iptables off
If the node also runs ip6tables, do the same:
[root@master ~]# ip6tables -F
[root@master ~]# service ip6tables save
[root@master ~]# service ip6tables stop
[root@master ~]# chkconfig ip6tables off
9.10. Open the web UI:
http://master:8088/cluster/cluster
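With the configuration above, two more web pages should also be reachable (50070 is the Hadoop 2.x NameNode default, 19888 comes from mapred-site.xml):

http://master:50070   (HDFS NameNode UI)
http://master:19888   (MapReduce JobHistory UI)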
10. Install ZooKeeper:
10.1. Installation on master:
10.1.1. Unpack the archive:
[hadoop@master ~]$ cd /hadoop
[hadoop@master hadoop]$ tar -zxvf zookeeper-3.4.6.tar.gz
10.1.2. Configure the environment variables:
[hadoop@master ~]$ vi .bashrc
export ZOOKEEPER_HOME=/hadoop/zookeeper-3.4.6
export PATH=.:$ZOOKEEPER_HOME/bin:$ZOOKEEPER_HOME/conf:$PATH
10.1.3. Apply the environment variables:
[hadoop@master ~]$ source .bashrc
10.1.4. Go to the ZooKeeper configuration directory:
[hadoop@master ~]$ cd /hadoop/zookeeper-3.4.6/conf/
10.1.5. Create the ZooKeeper configuration file from the sample:
[hadoop@master conf]$ cp zoo_sample.cfg zoo.cfg
10.1.6. Edit zoo.cfg:
| Setting | Notes |
| --- | --- |
| dataDir=/hadoop/zookeeperdata | Modified entry (hadoop is the user name) |
| clientPort=2181 | Modified entry |
| server.1=master:2888:3888, server.2=slave01:2888:3888, server.3=slave02:2888:3888 | Added entries (one per line) |
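Putting those edits together, the resulting zoo.cfg should look roughly like this (tickTime/initLimit/syncLimit are the defaults carried over from zoo_sample.cfg):

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/hadoop/zookeeperdata
clientPort=2181
server.1=master:2888:3888
server.2=slave01:2888:3888
server.3=slave02:2888:3888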
10.1.7. In the /hadoop directory, create the data directory and the myid file:
[hadoop@master ~]$ cd /hadoop
[hadoop@master hadoop]$ mkdir zookeeperdata
[hadoop@master hadoop]$ echo "1" > /hadoop/zookeeperdata/myid
10.2. Installation on slave01:
10.2.1. Copy zookeeper-3.4.6 from /hadoop on master to /hadoop on slave01:
[hadoop@master hadoop]$ scp -r zookeeper-3.4.6 slave01:/hadoop
10.2.2. Copy the .bashrc file from master to slave01:
[hadoop@master ~]$ cd
[hadoop@master ~]$ scp .bashrc slave01:/home/hadoop
10.2.3. Apply the environment variables on slave01:
[hadoop@slave01 ~]$ source .bashrc
10.2.4. On slave01, create the data directory and the myid file:
[hadoop@slave01 hadoop]$ mkdir zookeeperdata
[hadoop@slave01 hadoop]$ echo "2" > /hadoop/zookeeperdata/myid
10.3. Installation on slave02 (same procedure, abbreviated):
[hadoop@slave02 ~]$ source .bashrc
[hadoop@slave02 hadoop]$ mkdir zookeeperdata
[hadoop@slave02 hadoop]$ echo "3" > /hadoop/zookeeperdata/myid
10.4. Start ZooKeeper on every node:
[hadoop@master hadoop]$ zkServer.sh start
JMX enabled by default
Using config: /hadoop/zookeeper-3.4.6/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@slave01 ~]$ zkServer.sh start
JMX enabled by default
Using config: /hadoop/zookeeper-3.4.6/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@slave02 hadoop]$ zkServer.sh start
JMX enabled by default
Using config: /hadoop/zookeeper-3.4.6/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
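Before moving on, running zkServer.sh status on each node is a quick way to confirm the ensemble actually formed; one node should report leader and the other two follower (illustrative output):

[hadoop@master hadoop]$ zkServer.sh status
JMX enabled by default
Using config: /hadoop/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: follower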
10.5. Check the processes with jps:
[hadoop@master hadoop]$ jps
2305 NameNode
3608 Jps
3418 JobHistoryServer
2592 SecondaryNameNode
2844 NodeManager
2408 DataNode
2739 ResourceManager
3577 QuorumPeerMain
The "QuorumPeerMain" entry is the ZooKeeper process.
[hadoop@slave01 ~]$ jps
2249 DataNode
2662 Jps
2317 NodeManager
2616 QuorumPeerMain
[hadoop@slave02 hadoop]$ jps
2599 QuorumPeerMain
2298 NodeManager
2652 Jps
2229 DataNode
11. Install HBase:
11.1. Configure NTP time synchronization:
11.1.1. Server-side (master) configuration:
[hadoop@master hadoop]$ su - root
Password:
[root@master ~]# vi /etc/ntp.conf
Change the following lines:
#restrict default kod nomodify notrap nopeer noquery
restrict default kod nomodify
restrict -6 default kod nomodify notrap nopeer noquery
After making the changes, start ntpd:
[root@master ~]# service ntpd start
[root@master ~]# chkconfig ntpd on
11.1.2. Client-side configuration:
[hadoop@slave01 ~]$ su - root
Password:
[root@slave01 ~]# crontab -e
Add the following entry:
0-59/10 * * * * /usr/sbin/ntpdate 192.168.8.127 && /sbin/hwclock -w
This syncs the clock against the master every 10 minutes.
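A one-off manual sync from a slave is an easy way to confirm the master's ntpd is reachable before relying on the cron job (illustrative output):

[root@slave01 ~]# /usr/sbin/ntpdate 192.168.8.127
19 Jan 14:05:01 ntpdate[2711]: adjust time server 192.168.8.127 offset 0.002311 sec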
11.2. Install HBase
11.2.1. Unpack the HBase archive:
[hadoop@master ~]$ cd /hadoop
[hadoop@master hadoop]$ tar -zxvf hbase-1.1.2-bin.tar.gz
11.2.2. Configure the environment variables:
[hadoop@master hadoop]$ vi ~/.bashrc
Add the HBase directories:
export HBASE_HOME=/hadoop/hbase-1.1.2
export PATH=.:$HBASE_HOME/bin:$HBASE_HOME/conf:$PATH
11.2.3. Apply the environment variables:
[hadoop@master hadoop]$ source ~/.bashrc
11.2.4. Go to the HBase configuration directory:
[hadoop@master hadoop]$ cd /hadoop/hbase-1.1.2/conf
11.2.5. Edit hbase-env.sh:
[hadoop@master conf]$ vi hbase-env.sh
| Setting | Notes |
| --- | --- |
| export HBASE_MANAGES_ZK=false | Modified entry: HBase uses the external ZooKeeper ensemble instead of managing its own |
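The article only shows the HBASE_MANAGES_ZK change, but if JAVA_HOME is not already exported for the hadoop user, hbase-env.sh also needs it; a sketch consistent with the path used in hadoop-env.sh above:

export JAVA_HOME=/hadoop/jdk1.7.0_80   # point at the actual JDK install path
export HBASE_MANAGES_ZK=false          # use the external ZooKeeper ensemble started earlier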
11.2.6. Edit hbase-site.xml:
[hadoop@master conf]$ vi hbase-site.xml
| <configuration> <property> <name>hbase.rootdir</name> <value>hdfs://master:8020/hbase</value> </property> <property> <name>hbase.cluster.distributed</name> <value>true</value> </property> <property> <name>hbase.master</name> <value>master</value> </property> <property> <name>hbase.zookeeper.property.clientPort</name> <value>2181</value> </property> <property> <name>hbase.zookeeper.quorum</name> <value>master,slave01,slave02</value> </property> </configuration> |
11.2.7. Edit the regionservers file:
[hadoop@master conf]$ vi regionservers
master
slave01
slave02
11.2.8. Configuration on slave01 and slave02:
Same as on master.
11.2.9. Start HBase (make sure Hadoop and ZooKeeper are already running):
[hadoop@master conf]$ start-hbase.sh
11.2.10. Check the processes with jps:
[hadoop@master hadoop]$ jps
2305 NameNode
3418 JobHistoryServer
2592 SecondaryNameNode
2844 NodeManager
2408 DataNode
2739 ResourceManager
3577 QuorumPeerMain
3840 HMaster
4201 Jps
3976 HRegionServer
11.2.11. Enter the HBase shell and run a few queries:
[hadoop@master hadoop]$ hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.1.2, rcc2b70cf03e3378800661ec5cab11eb43fafe0fc, Wed Aug 26 20:11:27 PDT 2015
hbase(main):005:0> list
TABLE
0 row(s) in 0.0270 seconds
=> []
Create a table to check that writes work:
hbase(main):006:0> create 'test','info'
0 row(s) in 2.3150 seconds
=> Hbase::Table - test
hbase(main):007:0>
That worked. Now insert a row to see whether data is stored:
hbase(main):008:0> put 'test','u00001','info:username','yuansen'
0 row(s) in 0.1400 seconds
hbase(main):009:0> scan 'test'
ROW COLUMN+CELL
u00001  column=info:username, timestamp=1453186521452, value=yuansen
1 row(s) in 0.0550 seconds
That confirms everything is working.
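A couple of follow-up shell commands round out the smoke test: get reads back the row that was just inserted, and disable/drop would remove the test table again (illustrative output):

hbase(main):010:0> get 'test','u00001'
COLUMN                  CELL
 info:username          timestamp=1453186521452, value=yuansen
1 row(s) in 0.0300 seconds
hbase(main):011:0> disable 'test'
hbase(main):012:0> drop 'test'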