大數據運維(37) Hadoop+Hive+HBase+Kylin +spark+Flink偽分布式安裝


問題導讀

1.Centos7如何安裝配置?
2.linux網絡配置如何進行
3.linux環境下java如何安裝
4.linux環境下SSH免密碼登錄如何配置
5.linux環境下Hadoop2.7如何安裝
6.linux環境下Mysql如何安裝
7.linux環境下Hive如何安裝
8.linux環境下Zookeeper如何安裝
9.linux環境下Kafka如何安裝
10.linux環境下Hbase如何安裝?
11.linux環境下KYLIN如何安裝?
12.linux環境下scala如何安裝?
13.linux環境下spark如何安裝?



最近學習Kylin,肯定需要一個已經安裝好的環境,Kylin的依賴環境官方介紹如下:
依賴於 Hadoop 集群處理大量的數據集。您需要准備一個配置好 HDFS,YARN,MapReduce,,Hive, HBase,Zookeeper 和其他服務的 Hadoop 集群供 Kylin 運行。Kylin 可以在 Hadoop 集群的任意節點上啟動。方便起見,您可以在 master 節點上運行 Kylin。但為了更好的穩定性,我們建議您將 Kylin 部署在一個干凈的 Hadoop client 節點上,該節點上 Hive,HBase,HDFS 等命令行已安裝好且 client 配置(如 core-site.xml,hive-site.xml,hbase-site.xml及其他)也已經合理的配置且其可以自動和其它節點同步。運行 Kylin 的 Linux 賬戶要有訪問 Hadoop 集群的權限,包括創建/寫入 HDFS 文件夾,Hive 表, HBase 表和提交 MapReduce 任務的權限。

軟件要求
Hadoop: 2.7+, 3.1+ (since v2.5)
Hive: 0.13 - 1.2.1+
HBase: 1.1+, 2.0 (since v2.5)
Spark (可選) 2.3.0+
Kafka (可選) 1.0.0+ (since v2.5)
JDK: 1.8+ (since v2.5)
OS: Linux only, CentOS 6.5+ or Ubuntu 16.0.4+

安裝要求知道了,但是hadoop這些東西不太熟悉,小白一個,看了網上一些資料邊看邊學邊做,期間遇到了很多坑!很多人寫的安裝部署文檔要么是步驟東一塊西一塊,要么是省略,扔個連接或則說讓自己去百度。在經歷了很多坑之后終於是把完全分布式的hadoop+mysql+hive+hbase+zookeeper+kylin部署成功了,但是對於日常自己學習測試來說,開多台虛擬機電腦實在撐不住,於是寫了現在這個偽分布式的部署文檔給像我一樣初學kylin的小白同學們

環境配置:

目前有兩個測試環境,以Centsos 7系統的安裝為例子介紹詳細過程,Centos7系統規划配置清單如下,另外一個測試環境為RedHat 6 64位系統,安裝過程都差不多,Mysql安裝有些不一樣,不一樣的地方都分別寫了各自的安裝方法,安裝過程中遇到的坑很多
並且都已經解決,不再一一列舉,按照下面步驟是完全可以在Centos 7/Redhat 6 64位系統安裝成功的。



一、Centos7安裝

打開vmware,創建新虛擬機安裝Centos 774位系統:















完成后界面如下:



選擇啟動虛擬機,選擇第一個選項回車:


選擇繼續


等待依賴包檢查完成,點擊date&time設置時間



接下來點擊software selection選擇安裝模式,這里選擇最精簡安裝:



然后點擊done出來之后,等待依賴包檢查完成,然后設置磁盤分區


選擇現在設置:


點擊done后,進入下面所示界面,選擇標准分區,然后設置點擊+號設置分區




最后分好區如下:


然后點擊done后點擊確認



接下來選擇網絡設置



設置hostname,點擊apply。然后選擇configure設置網絡ip





最后done點擊安裝就可以了:



可以在這個界面設置下root密碼,等待安裝完成就可以了。這是虛擬機的安裝,接下來配置linux,安裝軟件。

1、linux網絡配置:

(1)因為Centos 7安裝的精簡模式,先解決linux網絡問題來讓windows能夠用xshell連上,編輯/etc/sysconfig/network-scripts/ifcfg-ens33內容如下:

TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=none
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=ens33
UUID=e8df3ff3-cf86-42cd-b48a-0d43fe85d8a6
DEVICE=ens33
ONBOOT="yes"
IPADDR=192.168.1.66
PREFIX=24
IPV6_PRIVACY=no

(2)重啟網絡

[root@hadoop ~]# service network restart
Restarting network (via systemctl):                        [  OK  ]
重啟后可以通過下面命令來檢查網絡
[root@hadoop ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:0d:f1:ca brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.66/24 brd 192.168.1.255 scope global noprefixroute ens33
       valid_lft forever preferred_lft forever
    inet6 fe80::d458:8497:adb:7f01/64 scope link noprefixroute
       valid_lft forever preferred_lft forever

(3)接下來關閉防火牆

[root@hadoop ~]# systemctl disable firewalld
[root@hadoop ~]# systemctl stop firewalld```

(4)進程守護,關閉selinux

[root@hadoop ~]# setenforce 0
[root@hadoop ~]# vi /etc/selinux/config
[root@hadoop ~]# cat  /etc/selinux/config

# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of three values:
#     targeted - Targeted processes are protected,
#     minimum - Modification of targeted policy. Only selected processes are protected.
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted

重啟

[root@hadoop ~]# reboot
可以通過下面方式查看是否啟用selinux
sestatus
getenforce

(5)編輯/etc/hosts加入下面內容

[root@hadoop ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.1.66 hadoop

2、安裝java

(1)先看下當前linux環境是否有自帶的open jdk:

[root@hadoop ~]# rpm -qa | grep java
[root@hadoop ~]# rpm -qa | grep jdk
[root@hadoop ~]# rpm -qa | grep gcj

沒有,如果有的話要卸載,卸載案例如下:
卸載linux自帶open jdk,將前面三條命令檢查出來的內容一一卸載:

[root@master ~]# rpm -e --nodeps java-1.7.0-openjdk-1.7.0.99-2.6.5.1.0.1.el6.x86_64
[root@master ~]# rpm -e --nodeps tzdata-java-2016c-1.el6.noarch
[root@master ~]# rpm -e java-1.6.0-openjdk-1.6.0.38-1.13.10.4.el6.x86_64
[root@master ~]# rpm -e java-1.7.0-openjdk-1.7.0.99-2.6.5.1.0.1.el6.x86_64

卸載完成后應該再檢查一次

(2)接下來安裝配置java

創建安裝目錄:

[root@hadoop ~]# mkdir -p /usr/java

上傳並解壓jdk到此目錄

[root@hadoop ~]# cd /usr/java/
[root@hadoop java]# ls
jdk-8u151-linux-x64 (1).tar.gz

解壓縮

[root@hadoop java]# tar -zxvf jdk-8u151-linux-x64\ \(1\).tar.gz
[root@hadoop java]# rm -rf jdk-8u151-linux-x64\ \(1\).tar.gz
[root@hadoop java]# ls
jdk1.8.0_151

編輯/etc/profile
寫入下面jdk環境變量,保存退出

export JAVA_HOME=/usr/java/jdk1.8.0_151
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin

使環境變量生效

[root@master java]# source /etc/profile

檢查安裝是否沒問題

[root@hadoop java]# java -version
java version "1.8.0_151"
Java(TM) SE Runtime Environment (build 1.8.0_151-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.151-b12, mixed mode)

3、配置SSH免密碼登錄

(1)輸入命令,ssh-keygen -t rsa,生成key,都不輸入密碼,一直回車,/root就會生成.ssh文件夾

[root@hadoop ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:+Xxqh8qa2AguQPY4aNJci6YiUWS822NtcLRK/9Kopp8 root@hadoop1
The key's randomart image is:
+---[RSA 2048]----+
| . |
| + . |
| o . . . |
| oo + o . |
|++o* B S |
|=+*.* + o |
|++o. o + o.. |
|=. ..=ooo oo. |
|o.o+E.+ooo.. |
+----[SHA256]-----+     
[root@hadoop ~]# cd .ssh/
[root@hadoop .ssh]# ls
id_rsa id_rsa.pub known_hosts


合並公鑰到authorized_keys文件,在hadoop服務器,進入/root/.ssh目錄,通過SSH命令合並

[root@hadoop .ssh]# cat id_rsa.pub>> authorized_keys

通過下面命令測試

ssh localhost
ssh hadoop
ssh 192.168.1.66

4、安裝Hadoop2.7

(1)下載連接:
http://archive.apache.org/dist/hadoop/core/hadoop-2.7.6/



(2)解壓:

 

[root@hadoop ~]# cd /hadoop/
[root@hadoop hadoop]# ls
hadoop-2.7.6 (1).tar.gz
[root@hadoop hadoop]# tar -zxvf hadoop-2.7.6\ \(1\).tar.gz ^C
[root@hadoop hadoop]# ls
hadoop-2.7.6  hadoop-2.7.6 (1).tar.gz
[root@hadoop hadoop]# rm -rf *gz
[root@hadoop hadoop]# mv hadoop-2.7.6/* .

 

(3)在/hadoop目錄下創建數據存放的文件夾,tmp、hdfs、hdfs/data、hdfs/name

[root@hadoop hadoop]# pwd
/hadoop
[root@hadoop hadoop]# mkdir tmp
[root@hadoop hadoop]# mkdir hdfs
[root@hadoop hadoop]# mkdir hdfs/data
[root@hadoop hadoop]# mkdir hdfs/name

(4)配置/hadoop/etc/hadoop目錄下的core-site.xml

[root@hadoop hadoop]# vi etc/hadoop/core-site.xml
<configuration>
  <property>
        <name>fs.defaultFS</name>
        <value>hdfs://192.168.1.66:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/hadoop/tmp</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>131702</value>
    </property>
</configuration>

(5)配置/hadoop/etc/hadoop/hdfs-site.xm

[root@hadoop hadoop]# vi etc/hadoop/hdfs-site.xml
<configuration>
<property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/hadoop/hdfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/hadoop/hdfs/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>192.168.1.66:9001</value>
    </property>
    <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
    </property>
</configuration>

(6)復制etc/hadoop/mapred-site.xml.template為etc/hadoop/mapred-site.xml,再編輯:

[root@hadoop hadoop]# cd etc/hadoop/
[root@hadoop hadoop]# cp mapred-site.xml.template  mapred-site.xml
[root@hadoop hadoop]# vi mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>192.168.1.66:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>192.168.1.66:19888</value>
    </property>

</configuration>

(7)配置 etc/hadoop/yarn-site.xml

[root@hadoop1 hadoop]# vi yarn-site.xml
<configuration>


<!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
    </property>
<property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop</value>
  </property>

</configuration>

(8)配置/hadoop/etc/hadoop/目錄下hadoop-env.sh、yarn-env.sh的JAVA_HOME,不設置的話,啟動不了

[root@hadoop hadoop]# pwd
/hadoop/etc/hadoop
[root@hadoop hadoop]# vi hadoop-env.sh
將 export JAVA_HOME 改為:export JAVA_HOME=/usr/java/jdk1.8.0_151
加入
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop"}
export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_HOME}/lib/native

[root@hadoop hadoop]# vi yarn-env.sh
將 export JAVA_HOME 改為:export JAVA_HOME=/usr/java/jdk1.8.0_151

配置slaves文件

[root@hadoop hadoop]# cat slaves
localhost

(9)配置hadoop環境變量

[root@hadoop ~]# vi /etc/profile  
寫入下面內容
export HADOOP_HOME=/hadoop/
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib:$HADOOP_COMMON_LIB_NATIVE_DIR"
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

[root@hadoop ~]# source /etc/profile

(10)啟動hadoop

[root@hadoop hadoop]# pwd
/hadoop
[root@hadoop hadoop]#  bin/hdfs namenode -format
.。。。。。。。。。。。。。。。。。。。。。
19/03/04 17:18:00 INFO namenode.FSImage: Allocated new BlockPoolId: BP-774693564-192.168.1.66-1551691079972
19/03/04 17:18:00 INFO common.Storage: Storage directory /hadoop/hdfs/name has been successfully formatted.
19/03/04 17:18:00 INFO namenode.FSImageFormatProtobuf: Saving image file /hadoop/hdfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
19/03/04 17:18:00 INFO namenode.FSImageFormatProtobuf: Image file /hadoop/hdfs/name/current/fsimage.ckpt_0000000000000000000 of size 321 bytes saved in 0 seconds.
19/03/04 17:18:00 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
19/03/04 17:18:00 INFO util.ExitUtil: Exiting with status 0
19/03/04 17:18:00 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hadoop/192.168.1.66
************************************************************/

全部啟動sbin/start-all.sh,也可以分開sbin/start-dfs.sh、sbin/start-yarn.sh

[root@hadoop hadoop]# sbin/start-dfs.sh
[root@hadoop hadoop]# sbin/start-yarn.sh

停止的話,輸入命令,sbin/stop-all.sh
輸入命令jps,可以看到相關信息:

[root@hadoop hadoop]# jps
10581 ResourceManager
10102 NameNode
10376 SecondaryNameNode
10201 DataNode
10683 NodeManager
11007 Jps

(11)啟動jobhistory

mr-jobhistory-daemon.sh start historyserver
[root@hadoop hadoop]# jps
33376 NameNode
33857 ResourceManager
33506 DataNode
33682 SecondaryNameNode
33960 NodeManager
34319 JobHistoryServer
34367 Jps

(12)驗證

1)瀏覽器打開http://192.168.1.66:8088/
2)瀏覽器打開http://192.168.1.66:50070/

5、安裝Mysql

需要根據自己的系統版本去下載,下載連接:
https://dev.mysql.com/downloads/mysql/5.7.html#downloads
我這里下載的是適用我當前本人測試環境Centos 7 64位 的系統,而另一個測試環境10.1.197.241是Redhat 6,兩個測試環境如果安裝時要下載對應系統的rpm包,不然不兼容的rpm包安裝時會報下面的錯誤(比如在Redhat6安裝適用centos7的mysql):

[root@s197240 hadoop]# rpm -ivh mysql-community-libs-5.7.18-1.el7.x86_64.rpm
warning: mysql-community-libs-5.7.18-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
error: Failed dependencies:
libc.so.6(GLIBC_2.14)(64bit) is needed by mysql-community-libs-5.7.18-1.el7.x86_64

1)檢查卸載mariadb-lib
Centos自帶mariadb數據庫,刪除,安裝mysql

[root@hadoop hadoop]# rpm -qa|grep mariadb
mariadb-libs-5.5.60-1.el7_5.x86_64
[root@hadoop hadoop]# rpm -e mariadb-libs-5.5.60-1.el7_5.x86_64 --nodeps
[root@hadoop hadoop]# rpm -qa|grep mariadb

如果時Redhat6安裝時自帶mysql庫,卸載自帶的包:
通過此命令查找已經安裝的mysql包:

[root@s197240 hadoop]# rpm -qa |grep mysql
mysql-community-common-5.7.18-1.el7.x86_64

通過此命令卸載:

[root@s197240 hadoop]# rpm -e --allmatches --nodeps mysql-community-common-5.7.18-1.el7.x86_64

2)上傳解壓安裝包
下載連接:
https://dev.mysql.com/downloads/file/?id=469456

[root@hadoop mysql]# pwd
/usr/local/mysql
[root@hadoop mysql]# ls
mysql-5.7.18-1.el7.x86_64.rpm-bundle.tar
[root@hadoop mysql]# tar -xvf  mysql-5.7.18-1.el7.x86_64.rpm-bundle.tar
mysql-community-server-5.7.18-1.el7.x86_64.rpm
mysql-community-embedded-devel-5.7.18-1.el7.x86_64.rpm
mysql-community-devel-5.7.18-1.el7.x86_64.rpm
mysql-community-client-5.7.18-1.el7.x86_64.rpm
mysql-community-common-5.7.18-1.el7.x86_64.rpm
mysql-community-embedded-5.7.18-1.el7.x86_64.rpm
mysql-community-embedded-compat-5.7.18-1.el7.x86_64.rpm
mysql-community-libs-5.7.18-1.el7.x86_64.rpm
mysql-community-server-minimal-5.7.18-1.el7.x86_64.rpm
mysql-community-test-5.7.18-1.el7.x86_64.rpm
mysql-community-minimal-debuginfo-5.7.18-1.el7.x86_64.rpm
mysql-community-libs-compat-5.7.18-1.el7.x86_64.rpm

(3)安裝mysql server
其中,安裝mysql-server, 需要以下幾個必要的安裝包:

mysql-community-client-5.7.17-1.el7.x86_64.rpm(依賴於libs)
mysql-community-common-5.7.17-1.el7.x86_64.rpm (依賴於common)
mysql-community-libs-5.7.17-1.el7.x86_64.rpm
mysql-community-server-5.7.17-1.el7.x86_64.rpm(依賴於common, client)

安裝上面四個包需要libaio和net-tools的依賴,這里配置好yum源,使用yum安裝,通過以下命令安裝:

yum -y install libaio
yum -y install net-tools

安裝mysql-server:按照common–>libs–>client–>server的順序。若不按照此順序,也會有一定“依賴”關系的提醒。

[root@hadoop mysql]# rpm -ivh mysql-community-common-5.7.18-1.el7.x86_64.rpm
warning: mysql-community-common-5.7.18-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
Preparing...                          ################################# [100%]
Updating / installing...
   1:mysql-community-common-5.7.18-1.e################################# [100%]
[root@hadoop mysql]# rpm -ivh  mysql-community-libs-5.7.18-1.el7.x86_64.rpm
warning: mysql-community-libs-5.7.18-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
Preparing...                          ################################# [100%]
Updating / installing...
   1:mysql-community-libs-5.7.18-1.el7################################# [100%]
[root@hadoop mysql]# rpm -ivh mysql-community-client-5.7.18-1.el7.x86_64.rpm
warning: mysql-community-client-5.7.18-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
Preparing...                          ################################# [100%]
Updating / installing...
   1:mysql-community-client-5.7.18-1.e################################# [100%]
[root@hadoop mysql]# rpm -ivh mysql-community-server-5.7.18-1.el7.x86_64.rpm
warning: mysql-community-server-5.7.18-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
Preparing...                          ################################# [100%]
Updating / installing...
   1:mysql-community-server-5.7.18-1.e################################# [100%]

(4)初始化mysql

[root@hadoop mysql]#  mysqld --initialize

mysql默認安裝在/var/lib下。

(5)更改mysql數據庫所屬於用戶及其所屬於組

[root@hadoop mysql]# chown mysql:mysql /var/lib/mysql -R

(6)啟動mysql數據庫

[root@hadoop mysql]# cd /var/lib/mysql
[root@hadoop mysql]# systemctl start mysqld.service
[root@hadoop ~]# cd /var/log/
[root@hadoop log]# grep 'password' mysqld.log
2019-02-26T04:33:06.989818Z 1 [Note] A temporary password is generated for root@localhost: mxeV&htW-3VC

更改root用戶密碼,新版的mysql在第一次登錄后更改密碼前是不能執行任何命令的

[root@hadoop log]# mysql -u root -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 4
Server version: 5.7.18

Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

更改密碼

mysql> set password=password('oracle');
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
mysql> grant all privileges on *.* to root@'%' identified by 'oracle' with grant option;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)

如果是Redhat6系統,啟動mysql數據庫過程如下:

[root@s197240 mysql]# /etc/rc.d/init.d/mysqld start
Starting mysqld:                                           [  OK  ]
[root@s197240 mysql]# ls /etc/rc.d/init.d/mysqld -l
-rwxr-xr-x 1 root root 7157 Dec 21 19:29 /etc/rc.d/init.d/mysqld
[root@s197240 mysql]# chkconfig mysqld on
[root@s197240 mysql]# chmod 755 /etc/rc.d/init.d/mysqld
[root@s197240 mysql]# service mysqld start
Starting mysqld:                                           [  OK  ]
[root@s197240 mysql]# service mysqld status
mysqld (pid  28861) is running...

mysql啟動后,剩余后面的操作完全按照上面systemctl start mysqld.service步驟下面的過程來就可以了

6、Hive安裝

下載連接:
http://archive.apache.org/dist/hive/hive-2.3.2/
(1)上載和解壓縮

[root@hadoop ~]# mkdir /hadoop/hive
[root@hadoop ~]# cd /hadoop/hive/
[root@hadoop hive]# ls
apache-hive-2.3.3-bin.tar.gz
[root@hadoop hive]# tar -zxvf apache-hive-2.3.3-bin.tar.gz

(2)配置環境變量

[root@hadoop hive]# vim /etc/profile
export JAVA_HOME=/usr/java/jdk1.8.0_151
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
export HADOOP_HOME=/hadoop/
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib:$HADOOP_COMMON_LIB_NATIVE_DIR"
export HIVE_HOME=/hadoop/hive/
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin

#修改完文件后,執行如下命令,讓配置生效:

[root@hadoop hive]# source /etc/profile

(3)Hive配置Hadoop HDFS
hive-site.xml配置
進入目錄$HIVE_HOME/conf,將hive-default.xml.template文件復制一份並改名為hive-site.xml

[root@hadoop hive]# cd $HIVE_HOME/conf
[root@hadoop conf]# cp hive-default.xml.template hive-site.xml

使用hadoop新建hdfs目錄,因為在hive-site.xml中有如下配置:

<property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
    <description>location of default database for the warehouse</description>
  </property>
  <property>

執行hadoop命令新建/user/hive/warehouse目錄:

[root@hadoop1 ~]# $HADOOP_HOME/bin/hadoop dfs -mkdir -p /user/hive/warehouse
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

#給新建的目錄賦予讀寫權限

[root@hadoop1 ~]# cd $HIVE_HOME
[root@hadoop1 hive]# cd conf/
[root@hadoop1 conf]# sh $HADOOP_HOME/bin/hdfs dfs -chmod 777 /user/hive/warehouse
#查看修改后的權限
[root@hadoop1 conf]# sh $HADOOP_HOME/bin/hdfs dfs -ls /user/hive Found 1 items drwxrwxrwx - root supergroup 0 2019-02-26 14:15 /user/hive/warehouse #運用hadoop命令新建/tmp/hive目錄 [root@hadoop1 conf]# $HADOOP_HOME/bin/hdfs dfs -mkdir -p /tmp/hive #給目錄/tmp/hive賦予讀寫權限 [root@hadoop1 conf]# $HADOOP_HOME/bin/hdfs dfs -chmod 777 /tmp/hive #檢查創建好的目錄 [root@hadoop1 conf]# $HADOOP_HOME/bin/hdfs dfs -ls /tmp Found 1 items drwxrwxrwx - root supergroup 0 2019-02-26 14:17 /tmp/hive

 

將hive_site.xml文件中的{system:java.io.tmpdir}替換為hive的臨時目錄,例如我替換為$HIVE_HOME/tmp,該目錄如果不存在則要自己手工創建,並且賦予讀寫權限。

[root@hadoop1 conf]# cd $HIVE_HOME
[root@hadoop1 hive]# mkdir tmp

配置文件hive-site.xml:
將文件中的所有 system:java.io.tmpdir替換成/hadoop/hive/tmp將文件中所有的 {system:java.io.tmpdir}替換成/hadoop/hive/tmp將文件中所有的system:java.io.tmpdir替換成/hadoop/hive/tmp將文件中所有的{system:user.name}替換為root

(4)配置mysql
把mysql的驅動包上傳到Hive的lib目錄下:

[root@hadoop lib]# pwd
/usr/local/hive/lib
[root@hadoop1 lib]# ls |grep mysql
mysql-connector-java-5.1.47.jar

(5)修改hive-site.xml數據庫相關配置
搜索javax.jdo.option.connectionURL,將該name對應的value修改為MySQL的地址:

<property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/metastore?createDatabaseIfNotExist=true&characterEncoding=UTF-8&useSSL=false</value>
    <description>
      JDBC connect string for a JDBC metastore.
      To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.
      For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
    </description>
  </property>

搜索javax.jdo.option.ConnectionDriverName,將該name對應的value修改為MySQL驅動類路徑:

<property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>

搜索javax.jdo.option.ConnectionUserName,將對應的value修改為MySQL數據庫登錄名:

<property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>Username to use against metastore database</description>
  </property>

搜索javax.jdo.option.ConnectionPassword,將對應的value修改為MySQL數據庫的登錄密碼:

<property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>oracle</value>
    <description>password to use against metastore database</description>
  </property>

搜索hive.metastore.schema.verification,將對應的value修改為false:

<property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>

在$HIVE_HOME/conf目錄下新建hive-env.sh

[root@hadoop1 conf]# cd $HIVE_HOME/conf
[root@hadoop1 conf]# cp hive-env.sh.template hive-env.sh
#打開hive-env.sh並添加如下內容
[root@hadoop1 conf]# vim hive-env.sh
export HADOOP_HOME=/hadoop/
export HIVE_CONF_DIR=/hadoop/hive/conf
export HIVE_AUX_JARS_PATH=/hadoop/hive/lib

(6)MySQL數據庫進行初始化

[root@apollo conf]# cd $HIVE_HOME/bin
#對數據庫進行初始化:
[root@hadoop1 bin]# schematool -initSchema -dbType mysql
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/hadoop/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/imp
l/StaticLoggerBinder.class]SLF4J: Found binding in [jar:file:/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/
org/slf4j/impl/StaticLoggerBinder.class]SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL:         jdbc:mysql://localhost:3306/metastore?createDatabaseIfNotEx
ist=true&characterEncoding=UTF-8&useSSL=falseMetastore Connection Driver :         com.mysql.jdbc.Driver
Metastore connection User:         root
Starting metastore schema initialization to 2.3.0
Initialization script hive-schema-2.3.0.mysql.sql
Initialization script completed
schemaTool completed

出現上面就是初始化成功,去mysql看下:

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| metastore          |
| mysql              |
| performance_schema |
| sys                |
+--------------------+
5 rows in set (0.00 sec)

mysql> use metastore
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> show tables;
+---------------------------+
| Tables_in_metastore       |
+---------------------------+
| AUX_TABLE                 |
| BUCKETING_COLS            |
| CDS                       |
| COLUMNS_V2                |
| COMPACTION_QUEUE          |
| COMPLETED_COMPACTIONS     |
| COMPLETED_TXN_COMPONENTS  |
| DATABASE_PARAMS           |
| DBS                       |
| DB_PRIVS                  |
| DELEGATION_TOKENS         |
| FUNCS                     |
| FUNC_RU                   |
| GLOBAL_PRIVS              |
| HIVE_LOCKS                |
| IDXS                      |
| INDEX_PARAMS              |
| KEY_CONSTRAINTS           |
| MASTER_KEYS               |
| NEXT_COMPACTION_QUEUE_ID  |
| NEXT_LOCK_ID              |
| NEXT_TXN_ID               |
| NOTIFICATION_LOG          |
| NOTIFICATION_SEQUENCE     |
| NUCLEUS_TABLES            |
| PARTITIONS                |
| PARTITION_EVENTS          |
| PARTITION_KEYS            |
| PARTITION_KEY_VALS        |
| PARTITION_PARAMS          |
| PART_COL_PRIVS            |
| PART_COL_STATS            |
| PART_PRIVS                |
| ROLES                     |
| ROLE_MAP                  |
| SDS                       |
| SD_PARAMS                 |
| SEQUENCE_TABLE            |
| SERDES                    |
| SERDE_PARAMS              |
| SKEWED_COL_NAMES          |
| SKEWED_COL_VALUE_LOC_MAP  |
| SKEWED_STRING_LIST        |
| SKEWED_STRING_LIST_VALUES |
| SKEWED_VALUES             |
| SORT_COLS                 |
| TABLE_PARAMS              |
| TAB_COL_STATS             |
| TBLS                      |
| TBL_COL_PRIVS             |
| TBL_PRIVS                 |
| TXNS                      |
| TXN_COMPONENTS            |
| TYPES                     |
| TYPE_FIELDS               |
| VERSION                   |
| WRITE_SET                 |
+---------------------------+
57 rows in set (0.01 sec)

(7)啟動hive:

啟動metastore服務

nohup hive --service metastore >> ~/metastore.log 2>&1 &

啟動hiveserver2,jdbc連接均需要

nohup  hive --service hiveserver2 >> ~/hiveserver2.log 2>&1 &

檢測hive和hive2端口

[root@hadoop bin]# netstat  -lnp|grep 9083
tcp        0      0 0.0.0.0:9083            0.0.0.0:*               LISTEN      11918/java
[root@hadoop bin]# netstat  -lnp|grep 10000
tcp        0      0 0.0.0.0:10000           0.0.0.0:*               LISTEN      12011/java  

測試hive

[root@hadoop1 bin]# ./hive
which: no hbase in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/usr/java/jdk1.8.0_151
/bin:/usr/java/jdk1.8.0_151/bin:/hadoop//bin:/hadoop//sbin:/root/bin:/usr/java/jdk1.8.0_151/bin:/usr/java/jdk1.8.0_151/bin:/hadoop//bin:/hadoop//sbin:/hadoop/hive/bin)SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/hadoop/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/imp
l/StaticLoggerBinder.class]SLF4J: Found binding in [jar:file:/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/
org/slf4j/impl/StaticLoggerBinder.class]SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Logging initialized using configuration in jar:file:/hadoop/hive/lib/hive-common-2.3.3.jar!/
hive-log4j2.properties Async: trueHive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider
using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive> show functions;
OK
!
!=
$sum0
%
。。。。。
hive> desc function sum;
OK
sum(x) - Returns the sum of a set of numbers
Time taken: 0.183 seconds, Fetched: 1 row(s)
hive> create database sbux;
OK
Time taken: 0.236 seconds
hive> use sbux;
OK
Time taken: 0.033 seconds
hive> create table student(id int, name string) row format delimited fields terminated by '\t';
OK
Time taken: 0.909 seconds
hive> desc student;
OK
id                          int                                             
name                        string                                          
Time taken: 0.121 seconds, Fetched: 2 row(s)
在$HIVE_HOME下新建一個文件
#進入#HIVE_HOME目錄
[root@apollo hive]# cd $HIVE_HOME
#新建文件student.dat
[root@apollo hive]# touch student.dat
#在文件中添加如下內容
[root@apollo hive]# vim student.dat
001        david
002        fab
003        kaishen
004        josen
005        arvin
006        wada
007        weda
008        banana
009        arnold
010        simon
011        scott
.導入數據
hive> load data local inpath '/hadoop/hive/student.dat' into table sbux.student;
Loading data to table sbux.student
OK
Time taken: 8.641 seconds
hive> use sbux;
OK
Time taken: 0.052 seconds
hive> select * from student;
OK
1        david
2        fab
3        kaishen
4        josen
5        arvin
6        wada
7        weda
8        banana
9        arnold
10        simon
11        scott
NULL        NULL
Time taken: 2.217 seconds, Fetched: 12 row(s)

(8)在界面上查看剛剛寫入的hdfs數據

在hadoop的namenode上查看:

<ignore_js_op style="overflow-wrap: break-word; color: rgb(68, 68, 68); font-family: "Microsoft Yahei", tahoma, arial, "Hiragino Sans GB", 宋體, sans-serif;">
在mysql的hive數據里查看

[root@hadoop1 bin]# mysql -u root -p
Enter password:
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| metastore |
| mysql |
| performance_schema |
| sys |
+--------------------+
5 rows in set (0.00 sec)
mysql> use metastore;
Database changed
mysql> select * from TBLS;
+--------+-------------+-------+------------------+-------+-----------+-------+----------+---------------+--------------------+--------------------+--------------------+
| TBL_ID | CREATE_TIME | DB_ID | LAST_ACCESS_TIME | OWNER | RETENTION | SD_ID | TBL_NAME | TBL_TYPE | VIEW_EXPANDED_TEXT | VIEW_ORIGINAL_TEXT | IS_REWRITE_ENABLED |
+--------+-------------+-------+------------------+-------+-----------+-------+----------+---------------+--------------------+--------------------+--------------------+
| 1 | 1551178545 | 6 | 0 | root | 0 | 1 | student | MANAGED_TABLE | NULL | NULL | |
+--------+-------------+-------+------------------+-------+-----------+-------+----------+---------------+--------------------+--------------------+--------------------+
1 row in set (0.00 sec)

7、Zookeeper安裝

上傳解壓:

[root@hadoop ~]# cd /hadoop/
[root@hadoop hadoop]# pwd
/hadoop
[root@hadoop hadoop]# mkdir zookeeper
[root@hadoop hadoop]# cd zookeeper/
[root@hadoop zookeeper]# tar -zxvf zookeeper-3.4.6.tar.gz
。。
[root@hadoop zookeeper]# ls
zookeeper-3.4.6 zookeeper-3.4.6.tar.gz
[root@hadoop zookeeper]# rm -rf *gz
[root@hadoop zookeeper]# mv zookeeper-3.4.6/* .
[root@hadoop zookeeper]# ls
bin CHANGES.txt contrib docs ivy.xml LICENSE.txt README_packaging.txt recipes zookeeper-3.4.6 zookeeper-3.4.6.jar.asc zookeeper-3.4.6.jar.sha1
build.xml conf dist-maven ivysettings.xml lib NOTICE.txt README.txt src zookeeper-3.4.6.jar zookeeper-3.4.6.jar.md5

配置配置文件

創建快照日志存放目錄:
mkdir -p /hadoop/zookeeper/dataDir

【注意】:如果不配置dataLogDir,那么事務日志也會寫在dataDir目錄中。這樣會嚴重影響zk的性能。因為在zk吞吐量很高的時候,產生的事務日志和快照日志太多。

[root@hadoop zookeeper]# cd conf/
[root@hadoop conf]# mv zoo_sample.cfg zoo.cfg
[root@hadoop conf]# cat /hadoop/zookeeper/conf/zoo.cfg |grep -v ^#|grep -v ^$
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/hadoop/zookeeper/dataDir
dataLogDir=/hadoop/zookeeper/dataLogDir
clientPort=2181
server.1=192.168.1.66:2887:3887

在我們配置的dataDir指定的目錄下面,創建一個myid文件,里面內容為一個數字,用來標識當前主機,conf/zoo.cfg文件中配置的server.X中X為什么數字,則myid文件中就輸入這個數字:

[root@hadoop conf]# echo "1" > /hadoop/zookeeper/dataDir/myid

啟動zookeeper:

[root@hadoop zookeeper]# cd bin/
[root@hadoop bin]# ./zkServer.sh start
JMX enabled by default
Using config: /hadoop/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@hadoop bin]# ./zkServer.sh status
JMX enabled by default
Using config: /hadoop/zookeeper/bin/../conf/zoo.cfg
Mode: standalone
[root@hadoop bin]# ./zkCli.sh -server localhost:2181
Connecting to localhost:2181
2019-03-12 11:47:29,355 [myid:] - INFO  [main:Environment@100] - Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
2019-03-12 11:47:29,360 [myid:] - INFO  [main:Environment@100] - Client environment:host.name=hadoop
2019-03-12 11:47:29,361 [myid:] - INFO  [main:Environment@100] - Client environment:java.version=1.8.0_151
2019-03-12 11:47:29,364 [myid:] - INFO  [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2019-03-12 11:47:29,364 [myid:] - INFO  [main:Environment@100] - Client environment:java.home=/usr/java/jdk1.8.0_151/jre
2019-03-12 11:47:29,364 [myid:] - INFO  [main:Environment@100] - Client environment:java.class.path=/hadoop/zookeeper/bin/../build/classes:/hadoop/zookeeper/bin/../build/lib/*.jar:/hadoop/z
ookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/hadoop/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/hadoop/zookeeper/bin/../lib/netty-3.7.0.Final.jar:/hadoop/zookeeper/bin/../lib/log4j-1.2.16.jar:/hadoop/zookeeper/bin/../lib/jline-0.9.94.jar:/hadoop/zookeeper/bin/../zookeeper-3.4.6.jar:/hadoop/zookeeper/bin/../src/java/lib/*.jar:/hadoop/zookeeper/bin/../conf:.:/usr/java/jdk1.8.0_151/lib/dt.jar:/usr/java/jdk1.8.0_151/lib/tools.jar2019-03-12 11:47:29,364 [myid:] - INFO  [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2019-03-12 11:47:29,364 [myid:] - INFO  [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2019-03-12 11:47:29,364 [myid:] - INFO  [main:Environment@100] - Client environment:java.compiler=<NA>
2019-03-12 11:47:29,364 [myid:] - INFO  [main:Environment@100] - Client environment:os.name=Linux
2019-03-12 11:47:29,364 [myid:] - INFO  [main:Environment@100] - Client environment:os.arch=amd64
2019-03-12 11:47:29,364 [myid:] - INFO  [main:Environment@100] - Client environment:os.version=3.10.0-957.el7.x86_64
2019-03-12 11:47:29,365 [myid:] - INFO  [main:Environment@100] - Client environment:user.name=root
2019-03-12 11:47:29,365 [myid:] - INFO  [main:Environment@100] - Client environment:user.home=/root
2019-03-12 11:47:29,365 [myid:] - INFO  [main:Environment@100] - Client environment:user.dir=/hadoop/zookeeper/bin
2019-03-12 11:47:29,366 [myid:] - INFO  [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyW
atcher@799f7e29Welcome to ZooKeeper!
2019-03-12 11:47:29,402 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@975] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authe
nticate using SASL (unknown error)JLine support is enabled
2019-03-12 11:47:29,494 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@852] - Socket connection established to localhost/127.0.0.1:2181, initiating session
2019-03-12 11:47:29,519 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1235] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x1696f
feb12f0000, negotiated timeout = 30000
WATCHER::

WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0]

[root@hadoop bin]# jps
12467 QuorumPeerMain
11060 JobHistoryServer
10581 ResourceManager
12085 RunJar
10102 NameNode
12534 Jps
10376 SecondaryNameNode
10201 DataNode
11994 RunJar
10683 NodeManager

發現zookeeper正常起來了

8、Kafka安裝

上傳解壓:

[root@hadoop bin]# cd /hadoop/
[root@hadoop hadoop]# mkdir kafka
[root@hadoop hadoop]# cd kafka/
[root@hadoop kafka]# ls
kafka_2.11-1.1.1.tgz
[root@hadoop kafka]# tar zxf kafka_2.11-1.1.1.tgz
[root@hadoop kafka]# mv kafka_2.11-1.1.1/* .
[root@hadoop kafka]# ls
bin  config  kafka_2.11-1.1.1  kafka_2.11-1.1.1.tgz  libs  LICENSE  NOTICE  site-docs
[root@hadoop kafka]# rm -rf *tgz
[root@hadoop kafka]# ls
bin  config  kafka_2.11-1.1.1  libs  LICENSE  NOTICE  site-docs

修改配置文件:

[root@hadoop kafka]# cd config/
[root@hadoop config]# ls
connect-console-sink.properties    connect-file-sink.properties    connect-standalone.properties  producer.properties     zookeeper.properties
connect-console-source.properties  connect-file-source.properties  consumer.properties            server.properties
connect-distributed.properties     connect-log4j.properties        log4j.properties               tools-log4j.properties
[root@hadoop config]# vim server.properties

配置如下:

[root@hadoop config]#  cat server.properties |grep -v ^#|grep -v ^$
broker.id=0
listeners=PLAINTEXT://192.168.1.66:9092
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/hadoop/kafka/logs
num.partitions=1
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=192.168.1.66:2181
zookeeper.connection.timeout.ms=6000
group.initial.rebalance.delay.ms=0
delete.topic.enble=true -----如果不指定這個參數,執行刪除操作只是標記刪除

啟動kafka

[root@hadoop kafka]# nohup bin/kafka-server-start.sh config/server.properties&

查看nohup文件有沒有錯誤信息,沒錯就沒問題。

驗證kafka,為了日后操作方便,先來編輯幾個常用腳本:

--消費者消費指定topic數據
[root@hadoop kafka]# cat console.sh
#!/bin/bash
read -p "input topic:" name

bin/kafka-console-consumer.sh --zookeeper 192.168.1.66:2181 --topic $name --from-beginning
--列出當前所有topic
[root@hadoop kafka]# cat list.sh
#!/bin/bash
bin/kafka-topics.sh -describe -zookeeper 192.168.1.66:2181
--生產者指定topic生產數據
[root@hadoop kafka]# cat productcmd.sh
#!/bin/bash
read -p "input topic:" name

bin/kafka-console-producer.sh --broker-list 192.168.1.66:9092 --topic $name
--啟動kafka
[root@hadoop kafka]# cat startkafka.sh
#!/bin/bash
nohup bin/kafka-server-start.sh  config/server.properties&
關閉kafka
[root@hadoop kafka]# cat stopkafka.sh
#!/bin/bash
bin/kafka-server-stop.sh
sleep 6
jps
--創建topic
[root@hadoop kafka]# cat create.sh
read -p "input topic:" name
bin/kafka-topics.sh --create --zookeeper 192.168.1.66:2181 --replication-factor 1 --partitions 1 --topic $name

接下來驗證kafka可用性:

會話1創建topic

[root@hadoop kafka]# ./create.sh
input topic:test
Created topic "test".

查看創建的topic

[root@hadoop kafka]# ./list.sh
Topic:test        PartitionCount:1        ReplicationFactor:1        Configs:
        Topic: test        Partition: 0        Leader: 0        Replicas: 0        Isr: 0

會話1指定test生產數據:

[root@hadoop kafka]# ./productcmd.sh
input topic:test
>test  
>

會話2指定test消費數據:

[root@hadoop kafka]# ./console.sh
input topic:test
Using the ConsoleConsumer with old consumer is deprecated and will be removed in a future major release. Consider using the new consumer by passing [bootstrap-server] instead of [zookeeper]
.test

測試可以正常生產和消費。

將kafka和zookeeper相關環境變量加到/etc/profile,並source使其生效。

export ZOOKEEPER_HOME=/hadoop/zookeeper
export KAFKA_HOME=/hadoop/kafka

9、Hbase安裝

下載連接:
http://archive.apache.org/dist/hbase/
(1)創建安裝目錄並上傳解壓:

 

[root@hadoop hbase]# tar -zxvf hbase-1.4.9-bin.tar.gz
[root@hadoop hbase]# ls
hbase-1.4.9  hbase-1.4.9-bin.tar.gz
[root@hadoop hbase]# rm -rf *gz
mv [root@hadoop hbase]# mv hbase-1.4.9/* .

[root@hadoop hbase]# pwd
/hadoop/hbase
[root@hadoop hbase]# ls
bin          conf  hbase-1.4.9    LEGAL  LICENSE.txt  README.txt
CHANGES.txt  docs  hbase-webapps  lib    NOTICE.txt

 

(2)環境變量配置,我的環境變量如下:

export JAVA_HOME=/usr/java/jdk1.8.0_151
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
export HADOOP_HOME=/hadoop/
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib:$HADOOP_COMMON_LIB_NATIVE_DIR"
export HIVE_HOME=/hadoop/hive
export HIVE_CONF_DIR=${HIVE_HOME}/conf
export HCAT_HOME=$HIVE_HOME/hcatalog
export HIVE_DEPENDENCY=/hadoop/hive/conf:/hadoop/hive/lib/*:/hadoop/hive/hcatalog/share/hcatalog/hive-hcatalog-pig-adapter-2.3.3.jar:/hadoop/hive/hcatalog/share/hcatalog/hive-hcatalog-core-2.3.3.jar:/hadoop/hive/hcatalog/share/hcatalog/hive-hcatalog-server-extensions-2.3.3.jar:/hadoop/hive/hcatalog/share/hcatalog/hive-hcatalog-streaming-2.3.3.jar:/hadoop/hive/lib/hive-exec-2.3.3.jar
export HBASE_HOME=/hadoop/hbase/
export ZOOKEEPER_HOME=/hadoop/zookeeper
export KAFKA_HOME=/hadoop/kafka
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin:$HCAT_HOME/bin:$HBASE_HOME/bin:$ZOOKEEPER_HOME:$KAFKA_HOME
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:${HIVE_HOME}/lib:$HBASE_HOME/lib

詳細配置

修改conf/hbase-env.sh中的HBASE_MANAGES_ZK為false:

[root@hadoop kafka]# cd /hadoop/hbase/
[root@hadoop hbase]# ls
bin          conf  hbase-1.4.9    LEGAL  LICENSE.txt  README.txt
CHANGES.txt  docs  hbase-webapps  lib    NOTICE.txt

修改hbase-env.sh文件加入下面內容

[root@hadoop hbase]# vim conf/hbase-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_151
export HADOOP_HOME=/hadoop/
export HBASE_HOME=/hadoop/hbase/
export HBASE_MANAGES_ZK=false

修改配置文件hbase-site.xml

在該配置文件中可以給hbase配置一個臨時目錄,這里指定為mkdir /root/hbase/tmp,先執行命令創建文件夾。

mkdir  /root/hbase
mkdir  /root/hbase/tmp
mkdir  /root/hbase/pids

在<configuration>節點內增加以下配置:

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://192.168.1.66:9000/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/hadoop/zookeeper/dataDir</value>
  </property>
  <property>
                <name>hbase.zookeeper.quorum</name>
                <value>192.168.1.66</value>
                <description>the pos of zk</description>
        </property>
        <!-- 此處必須為true,不然hbase仍用自帶的zk,若啟動了外部的zookeeper,會導致沖突,hbase啟動不起來 -->
        <property>
                <name>hbase.cluster.distributed</name>
                <value>true</value>
        </property>
        <!-- hbase主節點的位置 -->
        <property>
                <name>hbase.master</name>
                <value>192.168.1.66:60000</value>
        </property>
</configuration>
[root@hadoop hbase]# cat conf/regionservers
192.168.1.66
[root@hadoop hbase]# cp /hadoop/zookeeper/conf/zoo.cfg  /hadoop/hbase/conf/

啟動hbase

[root@hadoop bin]# ./start-hbase.sh
running master, logging to /hadoop/hbase//logs/hbase-root-master-hadoop.out
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
: running regionserver, logging to /hadoop/hbase//logs/hbase-root-regionserver-hadoop.out
: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
--查看hbase相關進程HMaster、HRegionServer 已經起來了
[root@hadoop bin]# jps
12449 QuorumPeerMain
13094 Kafka
10376 SecondaryNameNode
12046 RunJar
11952 RunJar
11060 JobHistoryServer
10581 ResourceManager
10102 NameNode
10201 DataNode
10683 NodeManager
15263 HMaster
15391 HRegionServer
15679 Jps

10、安裝KYLIN

下載連接
http://kylin.apache.org/cn/download/
(1)上傳解壓

[root@hadoop kylin]# pwd
/hadoop/kylin
[root@hadoop kylin]# ls
apache-kylin-2.4.0-bin-hbase1x.tar.gz
[root@hadoop kylin]# tar -zxvf apache-kylin-2.4.0-bin-hbase1x.tar.gz
[root@hadoop kylin]# rm -rf  apache-kylin-2.4.0-bin-hbase1x.tar.gz
[root@hadoop kylin]#
[root@hadoop kylin]# mv apache-kylin-2.4.0-bin-hbase1x/* .
[root@hadoop kylin]# ls
apache-kylin-2.4.0-bin-hbase1x  bin  commit_SHA1  conf  lib  sample_cube  spark  tomcat  tool

(2)配置環境變量
/etc/profile內容如下

export JAVA_HOME=/usr/java/jdk1.8.0_151
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
export HADOOP_HOME=/hadoop/
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib:$HADOOP_COMMON_LIB_NATIVE_DIR"
export HIVE_HOME=/hadoop/hive
export HIVE_CONF_DIR=${HIVE_HOME}/conf
export HCAT_HOME=$HIVE_HOME/hcatalog
export HIVE_DEPENDENCY=/hadoop/hive/conf:/hadoop/hive/lib/*:/hadoop/hive/hcatalog/share/hcatalog/hive-hcatalog-pig-adapter-2.3.3.jar:/hadoop/hive/hcatalog/share/hcatalog/hive-hcatalog-core-2.3.3.jar:/hadoop/hive/hcatalog/share/hcatalog/hive-hcatalog-server-extensions-2.3.3.jar:/hadoop/hive/hcatalog/share/hcatalog/hive-hcatalog-streaming-2.3.3.jar:/hadoop/hive/lib/hive-exec-2.3.3.jar
export HBASE_HOME=/hadoop/hbase/
export ZOOKEEPER_HOME=/hadoop/zookeeper
export KAFKA_HOME=/hadoop/kafka
export KYLIN_HOME=/hadoop/kylin/
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin:$HCAT_HOME/bin:$HBASE_HOME/bin:$ZOOKEEPER_HOME:$KAFKA_HOME:$KYLIN_HOME/bin
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:${HIVE_HOME}/lib:$HBASE_HOME/lib:$KYLIN_HOME/lib
   [root@hadoop kylin]# source /etc/profile

(3)修改kylin.properties內容

[root@hadoop kylin]# vim conf/kylin.properties
kylin.rest.timezone=GMT+8
kylin.rest.servers=192.168.1.66:7070
kylin.job.jar=/hadoop/kylin/lib/kylin-job-2.4.0.jar
kylin.coprocessor.local.jar=/hadoop/kylin/lib/kylin-coprocessor-2.4.0.jar
kyin.server.mode=all
kylin.rest.servers=192.168.1.66:7070

(4)編輯kylin_hive_conf.xml

[root@hadoop kylin]# vim conf/kylin_hive_conf.xml
<property>
        <name>hive.exec.compress.output</name>
        <value>false</value>
        <description>Enable compress</description>
    </property>

(5)編輯server.xml

 

[root@hadoop kylin]# vim tomcat/conf/server.xml
注釋掉下面這點代碼:
<!-- Connector port="7443" protocol="org.apache.coyote.http11.Http11Protocol"
                   maxThreads="150" SSLEnabled="true" scheme="https" secure="true"
                   keystoreFile="conf/.keystore" keystorePass="changeit"
                   clientAuth="false" sslProtocol="TLS" /> -->

(6)編輯kylin.sh

#additionally add tomcat libs to HBASE_CLASSPATH_PREFIX
    export HBASE_CLASSPATH_PREFIX=${tomcat_root}/bin/bootstrap.jar:${tomcat_root}/bin/tomcat-juli.jar:${tomcat_root}/lib/*:$hive_dependency:${HBASE_CLASSPATH_PREFIX}

(7)啟動kylin

[root@hadoop kylin]# cd bin/
[root@hadoop bin]# pwd
/hadoop/kylin/bin
[root@hadoop bin]# ./check-env.sh
Retrieving hadoop conf dir...
KYLIN_HOME is set to /hadoop/kylin
[root@hadoop bin]# ./kylin.sh start
Retrieving hadoop conf dir...
KYLIN_HOME is set to /hadoop/kylin
Retrieving hive dependency...
。。。。。。。。。。。
A new Kylin instance is started by root. To stop it, run 'kylin.sh stop'
Check the log at /hadoop/kylin/logs/kylin.log
Web UI is at http://<hostname>:7070/kylin
[root@hadoop bin]# jps
13216 HMaster
10376 SecondaryNameNode
12011 RunJar
11918 RunJar
13070 HQuorumPeer
11060 JobHistoryServer
10581 ResourceManager
31381 RunJar
10102 NameNode
13462 HRegionServer
10201 DataNode
10683 NodeManager
31677 Jps

至此,安裝已經完成,大家可以通過http://:7070/kylin去訪問kylin了,至於cube及steam cube的官方案例,因為文章長度原因,筆者寫到了這篇文章供參考:
hadoop+kylin安裝及官方cube/steam cube案例文檔
8)初步驗證及使用:

測試創建項目從hive庫取表:

打開網頁:http://192.168.1.66:7070/kylin/login
初始密碼:ADMIN/KYLIN


由頂部菜單欄進入 Model 頁面,然后點擊 Manage Projects。

點擊 + Project 按鈕添加一個新的項目。


在頂部菜單欄點擊 Model,然后點擊左邊的 Data Source 標簽,它會列出所有加載進 Kylin 的表,點擊 Load Table 按鈕。

輸入表名並點擊 Sync 按鈕提交請求。

接下來就可以看到導入的表結構了:

(2)、運行官方案例:

root@hadoop bin]# pwd
    /hadoop/kylin/bin
    [root@hadoop bin]# ./sample.sh
    Retrieving hadoop conf dir...
    。。。。。。。。。
    Sample cube is created successfully in project 'learn_kylin'.
    Restart Kylin Server or click Web UI => System Tab => Reload Metadata to take effect  

看到上面最后兩條信息就說明案例使用的hive表都創建好了,接下來重啟kylin或則 reload metadata
再次刷新頁面:


選擇第二個kylin_sales_cube

選擇bulid,隨意選擇一個12年以后的日期

然后切換到monitor界面:

等待cube創建完成。

做sql查詢

編輯整個環境重啟腳本方便日常啟停:
環境停止腳本

[root@hadoop hadoop]# cat stop.sh
#!/bin/bash
echo -e "\n========Start stop kylin========\n"
$KYLIN_HOME/bin/kylin.sh stop
sleep 5
echo -e "\n========Start stop hbase========\n"
$HBASE_HOME/bin/stop-hbase.sh
sleep 5
echo -e "\n========Start stop kafka========\n"
$KAFKA_HOME/bin/kafka-server-stop.sh  $KAFKA_HOME/config/server.properties
sleep 3
echo -e "\n========Start stop zookeeper========\n"
$ZOOKEEPER_HOME/bin/zkServer.sh stop
sleep 3
echo -e "\n========Start stop jobhistory========\n"
mr-jobhistory-daemon.sh stop historyserver
sleep 3
echo -e "\n========Start stop yarn========\n"
stop-yarn.sh
sleep 5
echo -e "\n========Start stop dfs========\n"
stop-dfs.sh
sleep 5
echo -e "\n========Start stop prot========\n"
`lsof -i:9083|awk 'NR>=2{print "kill -9 "$2}'|sh`
`lsof -i:10000|awk 'NR>=2{print "kill -9 "$2}'|sh`
sleep 2
echo -e "\n========Check process========\n"
jps

環境啟動腳本

[root@hadoop hadoop]# cat start.sh
#!/bin/bash
echo -e "\n========Start run dfs========\n"
start-dfs.sh
sleep 5
echo -e "\n========Start run yarn========\n"
start-yarn.sh
sleep 3
echo -e "\n========Start run jobhistory========\n"
mr-jobhistory-daemon.sh start historyserver
sleep 2
echo -e "\n========Start run metastore========\n"
nohup hive --service metastore >> ~/metastore.log 2>&1 &  
sleep 10
echo -e "\n========Start run hiveserver2========\n"
nohup  hive --service hiveserver2 >> ~/hiveserver2.log 2>&1 &
sleep 10
echo -e "\n========Check Port========\n"
netstat  -lnp|grep 9083
sleep 5
netstat  -lnp|grep 10000
sleep 2
echo -e "\n========Start run zookeeper========\n"
$ZOOKEEPER_HOME/bin/zkServer.sh start
sleep 5
echo -e "\n========Start run kafka========\n"
$KAFKA_HOME/bin/kafka-server-start.sh  $KAFKA_HOME/config/server.properties
sleep 5
echo -e "\n========Start run hbase========\n"
$HBASE_HOME/bin/start-hbase.sh
sleep 5
echo -e "\n========Check process========\n"
jps
sleep 1
echo -e "\n========Start run kylin========\n"
$KYLIN_HOME/bin/kylin.sh start

11.安裝scala

解壓scala安裝包到任意目錄
cd /home/tom
$ tar -xzvf scala-2.10.6.tgz

/etc/profile文件的末尾添加環境變量:

export SCALA_HOME=/home/tom//scala-2.10.6
export PATH=$SCALA_HOME/bin:$PATH

保存並更新/etc/profile

 source /etc/profile

查看是否成功:

scala -version

12.安裝Spark

解壓spark安裝包到任意目錄:
cd /home/tom
$ tar -xzvf spark-1.6.0-bin-hadoop2.6.tgz
$ mv spark-1.6.0-bin-hadoop2.6 spark-1.6.0
$ sudo vim /etc/profile[/mw_shl_code]

/etc/profile文件的末尾添加環境變量:

export SPARK_HOME=/home/tom/spark-1.6.0
export PATH=$SPARK_HOME/bin:$PATH

保存並更新/etc/profile

source /etc/profile

在conf目錄下復制並重命名spark-env.sh.templatespark-env.sh

cp spark-env.sh.template spark-env.sh
$ vi spark-env.sh

spark-env.sh中添加:

export JAVA_HOME=/home/tom/jdk1.8.0_73/
export SCALA_HOME=/home/tom//scala-2.10.6
export SPARK_MASTER_IP=localhost
export SPARK_WORKER_MEMORY=4G

啟動

$SPARK_HOME/sbin/start-all.sh

停止

$SPARK_HOME/sbin/stop-all.sh

測試Spark是否安裝成功:

$SPARK_HOME/bin/run-example SparkPi

檢查WebUI,瀏覽器打開端口:http://localhost:8080

查看集群環境
http://master:8080/  訪問正常

進入spark-shell
$spark-shell   執行正常如下圖


查看jobs等信息
http://master:4040/jobs/  訪問正常。

 13、Flink安裝

一:安裝

Flink官網下載地址:https://flink.apache.org/downloads.html

選擇1.6.3版本

 下載:

wget http://mirrors.hust.edu.cn/apache/flink/flink-1.7.1/flink-1.7.1-bin-hadoop26-scala_2.11.tgz

解壓:

tar -zxvf flink-1.6.3-bin-hadoop26-scala_2.11.tgz
mv  flink-1.6.3 flink

查看本機host

進入flink目錄,修改conf/flink-conf.yaml文件

vim  conf/flink-conf.yaml

修改conf/masters文件,修改后內容如下:

啟動單機版flink:

bin/start-cluster.sh

啟動界面如下:

查看啟動是否成功:

 查看dashboard界面:http://192.168.186.129:808

 

二:官方案例演示

1.啟動一個終端輸入如下指令:

nc -lk 8000

2.啟動第二個終端,執行flink自帶的wordcount案例

bin/flink run examples/streaming/SocketWindowWordCount.jar --port 8000

3.在第一個終端發送數據:

4.測試結果保存在log/flink-root-taskexecutor-0-woniu.out文件中

5. Dashboard也可以看到任務信息

 


免責聲明!

本站轉載的文章為個人學習借鑒使用,本站對版權不負任何法律責任。如果侵犯了您的隱私權益,請聯系本站郵箱yoyou2525@163.com刪除。



 
粵ICP備18138465號   © 2018-2025 CODEPRJ.COM