1 System environment
- The system used for this build is CentOS 7.5.
[root@localhost ~]# lsb_release -a
LSB Version: :core-4.1-amd64:core-4.1-noarch
Distributor ID: CentOS
Description: CentOS Linux release 7.5.1804 (Core)
Release: 7.5.1804
Codename: Core
2 Change the hostname
2.1 Set the hostname to hadoop1
[root@localhost ~]# hostnamectl set-hostname hadoop1
2.2 View the hostname
[root@localhost ~]# hostnamectl
Static hostname: hadoop1
Icon name: computer-vm
Chassis: vm
Machine ID: a34d80dce9364980962f9d85ffb5e9c8
Boot ID: d624e2a84dc34619bfa2fe90e88eb058
Virtualization: vmware
Operating System: CentOS Linux 7 (Core)
CPE OS Name: cpe:/o:centos:centos:7
Kernel: Linux 3.10.0-862.11.6.el7.x86_64
Architecture: x86-64
2.3 Confirm the change took effect
[root@localhost ~]# hostnamectl --static
hadoop1
- After logging in again, the new hostname is in effect.
3 Add a hadoop user
- This deployment runs as a dedicated hadoop user, so create one; the password here is also set to hadoop.
[root@hadoop1 ~]# sudo useradd -m hadoop -s /bin/bash
[root@hadoop1 ~]# sudo passwd hadoop
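- On CentOS/RHEL, passwd also accepts the password on standard input, which is handy for scripted setups (a plain-text password in shell history is only acceptable on a throwaway test VM):
[root@hadoop1 ~]# echo hadoop | passwd --stdin hadoop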
- Log in
[root@hadoop1 ~]# ssh hadoop@hadoop1
# enter the password; login succeeds
4 Set up passwordless SSH login
- Note: passwordless login here means that the hadoop user on hadoop1 can run ssh hadoop@hadoop1 without entering a password. This must be set up because the Hadoop start scripts ssh into every node, including the local one; it is the same idea as mutual trust between the machines of a multi-node cluster.
4.1 Generate a key pair
[hadoop@hadoop1 ~]$ ssh-keygen -t rsa # press Enter three times to accept the defaults
[hadoop@hadoop1 ~]$ ssh-copy-id hadoop@hadoop1 # enter the password
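- If ssh-copy-id is unavailable, the same result can be produced by hand; this sketch appends the public key and tightens permissions (sshd refuses keys whose files are group- or world-accessible):
[hadoop@hadoop1 ~]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[hadoop@hadoop1 ~]$ chmod 700 ~/.ssh
[hadoop@hadoop1 ~]$ chmod 600 ~/.ssh/authorized_keys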
4.2 Edit the /etc/hosts file
[root@hadoop1 ~]# vim /etc/hosts
On the first line, add a mapping from this machine's IP to hadoop1, as follows:
172.16.142.129 hadoop1
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
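- To confirm the new mapping resolves before testing ssh:
[root@hadoop1 ~]# getent hosts hadoop1
172.16.142.129  hadoop1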
4.3 Test that passwordless login works
[hadoop@hadoop1 ~]$ ssh hadoop@hadoop1
Warning: Permanently added the ECDSA host key for IP address '172.16.142.129' to the list of known hosts.
Last login: Sun Jul 21 16:45:14 2019 from 172.16.142.129
5 Install JDK 1.8
5.1 Notes
- This installs JDK 1.8, specifically jdk-8u101-linux-x64.tar.gz, as the root user.
5.2 Download jdk-8u101-linux-x64.tar.gz
[root@hadoop1 ~]# wget https://dl.cactifans.com/jdk/jdk-8u101-linux-x64.tar.gz
5.3 Extract to /usr/local/
[root@hadoop1 ~]# tar -zxvf jdk-8u101-linux-x64.tar.gz -C /usr/local/
5.4 Configure the JDK environment variables
- Edit /etc/profile
[root@hadoop1 ~]# vim /etc/profile
Add the following:
export JAVA_HOME=/usr/local/jdk1.8.0_101
export CLASSPATH=$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH:$HOME/bin
- Apply the change immediately
[root@hadoop1 ~]# source /etc/profile
5.5 Check the Java version to verify the installation
[root@hadoop1 ~]# java -version
java version "1.8.0_101"
Java(TM) SE Runtime Environment (build 1.8.0_101-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.101-b13, mixed mode)
6 Install hadoop-2.7.6
6.1 Notes
- This installs hadoop-2.7.6 as the hadoop user, so log in as hadoop first: ssh hadoop@hadoop1.
6.2 Download
[hadoop@hadoop1 ~]$ wget http://apache.fayea.com/hadoop/common/hadoop-2.7.6/hadoop-2.7.6.tar.gz
6.3 Extract to /home/hadoop/apps
[hadoop@hadoop1 ~]$ mkdir -p ~/apps
[hadoop@hadoop1 ~]$ tar -zxvf hadoop-2.7.6.tar.gz -C /home/hadoop/apps
6.4 Edit hadoop-env.sh
[hadoop@hadoop1 ~]$ cd /home/hadoop/apps/hadoop-2.7.6/etc/hadoop
[hadoop@hadoop1 hadoop]$ vim hadoop-env.sh
- Change export JAVA_HOME=${JAVA_HOME} to the absolute path below; the daemons are launched over ssh and may not inherit the variable:
export JAVA_HOME=/usr/local/jdk1.8.0_101
6.5 Edit core-site.xml
[hadoop@hadoop1 hadoop]$ vim core-site.xml
- Add the following configuration:
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop1:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/data/hadoopdata</value>
    </property>
</configuration>
6.6 Edit hdfs-site.xml
[hadoop@hadoop1 hadoop]$ vim hdfs-site.xml
Add the following configuration:
<property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/hadoop/data/hadoopdata/name</value>
    <description>NameNode metadata directory; to protect the metadata this is usually a list of several different directories</description>
</property>
<property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/hadoop/data/hadoopdata/data</value>
    <description>DataNode data storage directory</description>
</property>
<property>
    <name>dfs.replication</name>
    <value>2</value>
    <description>Number of replicas per HDFS block, default 3; note that with a single DataNode only one replica can actually be placed</description>
</property>
6.7 Edit mapred-site.xml
[hadoop@hadoop1 hadoop]$ cp mapred-site.xml.template mapred-site.xml
[hadoop@hadoop1 hadoop]$ vim mapred-site.xml
Add the following configuration:
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
6.8 Edit yarn-site.xml
[hadoop@hadoop1 hadoop]$ vim yarn-site.xml
Add the following configuration:
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
    <description>The shuffle service that the YARN cluster provides for MapReduce programs</description>
</property>
6.9 Configure the Hadoop environment
- Note: since we are logged in as the hadoop user, the environment variables go in ~/.bashrc
[hadoop@hadoop1 hadoop]$ vim ~/.bashrc
- Add the following:
# HADOOP_HOME
export HADOOP_HOME=/home/hadoop/apps/hadoop-2.7.6
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
- Apply the change immediately
[hadoop@hadoop1 hadoop]$ source ~/.bashrc
6.10 Check the Hadoop version
[hadoop@hadoop1 hadoop]$ hadoop version
Hadoop 2.7.6
Subversion https://shv@git-wip-us.apache.org/repos/asf/hadoop.git -r 085099c66cf28be31604560c376fa282e69282b8
Compiled by kshvachk on 2018-04-18T01:33Z
Compiled with protoc 2.5.0
From source with checksum 71e2695531cb3360ab74598755d036
This command was run using /home/hadoop/apps/hadoop-2.7.6/share/hadoop/common/hadoop-common-2.7.6.jar
6.11 Create the directories configured in hdfs-site.xml
[hadoop@hadoop1 hadoop]$ mkdir -p /home/hadoop/data/hadoopdata/name
[hadoop@hadoop1 hadoop]$ mkdir -p /home/hadoop/data/hadoopdata/data
6.12 Initialize Hadoop (format the NameNode)
[hadoop@hadoop1 hadoop]$ hadoop namenode -format
- If the log ends with "Exiting with status 0", the format succeeded. (hadoop namenode -format is deprecated in favor of hdfs namenode -format, but both work in 2.7.x.)
6.13 Start the daemons
[hadoop@hadoop1 hadoop]$ cd /home/hadoop/apps/hadoop-2.7.6
[hadoop@hadoop1 hadoop-2.7.6]$ sbin/start-dfs.sh
[hadoop@hadoop1 hadoop-2.7.6]$ sbin/start-yarn.sh
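- To verify the daemons came up, jps (shipped with the JDK) should list all five Hadoop processes. A sketch of the expected output; the PIDs will differ:
[hadoop@hadoop1 hadoop-2.7.6]$ jps
2898 NameNode
3025 DataNode
3196 SecondaryNameNode
3346 ResourceManager
3442 NodeManager
3720 Jps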
6.14 Access the web UI
- Note: stop the firewall first
# stop the firewall before accessing the UI
[root@hadoop1 ~]# systemctl stop firewalld
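- To keep the firewall off across reboots (fine on a test VM; on a shared machine prefer opening just the needed ports with firewall-cmd):
[root@hadoop1 ~]# systemctl disable firewalld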
- Open http://IP:50070 in a browser (the NameNode web UI).
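- As an end-to-end smoke test of HDFS and YARN, the examples jar bundled with the distribution can run a small pi-estimation job (the jar path below assumes the stock 2.7.6 layout):
[hadoop@hadoop1 ~]$ hadoop jar ~/apps/hadoop-2.7.6/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.6.jar pi 2 10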
7 Install Scala (optional)
7.1 Notes
- Installed as root; the version is scala-2.12.0. Note that the prebuilt Spark package in section 8 bundles its own Scala 2.11 runtime, which is why this step is optional.
7.2 Download
[root@hadoop1 ~]# wget https://downloads.lightbend.com/scala/2.12.0/scala-2.12.0.tgz
7.3 Extract to /usr/local/
[root@hadoop1 ~]# tar -zxvf scala-2.12.0.tgz -C /usr/local/
7.4 Add environment variables
[root@hadoop1 ~]# vim /etc/profile
- Add the following:
# Scala env
export SCALA_HOME=/usr/local/scala-2.12.0
export PATH=$SCALA_HOME/bin:$PATH
- Apply the change immediately
[root@hadoop1 ~]# source /etc/profile
7.5 Check the Scala version
[root@hadoop1 ~]# scala -version
8 Install Spark
8.1 Notes
- Install spark-2.4.3-bin-hadoop2.7.tgz as the hadoop user.
8.2 下載
[hadoop@hadoop1 ~]$ wget http://apache.claz.org/spark/spark-2.4.3/spark-2.4.3-bin-hadoop2.7.tgz
8.3 Extract
[hadoop@hadoop1 ~]$ tar -zxvf spark-2.4.3-bin-hadoop2.7.tgz -C ~/apps/
- Create a symlink, so future upgrades only need the link repointed
[hadoop@hadoop1 ~]$ cd ~/apps/
[hadoop@hadoop1 apps]$ ln -s spark-2.4.3-bin-hadoop2.7 spark
8.4 Configure Spark
[hadoop@hadoop1 apps]$ cd spark/conf
[hadoop@hadoop1 conf]$ cp spark-env.sh.template spark-env.sh
[hadoop@hadoop1 conf]$ vim spark-env.sh
- Add the following:
export JAVA_HOME=/usr/local/jdk1.8.0_101
export SCALA_HOME=/usr/local/scala-2.12.0
export HADOOP_HOME=/home/hadoop/apps/hadoop-2.7.6
export HADOOP_CONF_DIR=/home/hadoop/apps/hadoop-2.7.6/etc/hadoop
export SPARK_MASTER_HOST=hadoop1 # SPARK_MASTER_IP is deprecated in Spark 2.x
export SPARK_MASTER_PORT=7077
8.5 Configure environment variables
[hadoop@hadoop1 conf]$ vim ~/.bashrc
- Add the following:
#SPARK_HOME
export SPARK_HOME=/home/hadoop/apps/spark
export PATH=$PATH:$SPARK_HOME/bin
- Apply the change immediately
[hadoop@hadoop1 conf]$ source ~/.bashrc
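- A quick sanity check that the new PATH resolves Spark:
[hadoop@hadoop1 conf]$ spark-submit --version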
8.6 Start Spark
[hadoop@hadoop1 conf]$ ~/apps/spark/sbin/start-all.sh
- Check the processes, as shown in the sketch below.
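- A sketch of the expected jps output (PIDs will differ): Master and Worker are the Spark standalone daemons, the rest are the Hadoop daemons started earlier:
[hadoop@hadoop1 conf]$ jps
2898 NameNode
3025 DataNode
3196 SecondaryNameNode
3346 ResourceManager
3442 NodeManager
4005 Master
4087 Worker
4120 Jps
- The standalone master's web UI defaults to http://IP:8080. To confirm jobs actually run, submit the bundled SparkPi example (the jar name assumes the stock 2.4.3 build, which ships Scala 2.11 artifacts):
[hadoop@hadoop1 ~]$ spark-submit --master spark://hadoop1:7077 --class org.apache.spark.examples.SparkPi ~/apps/spark/examples/jars/spark-examples_2.11-2.4.3.jar 10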