Big Data in Practice Series: Spark + Hadoop Integrated Environment Setup


1 Prepare the environment

192.168.0.251 shulaibao1 
192.168.0.252 shulaibao2 
hadoop-2.8.0-bin 
spark-2.1.1-bin-hadoop2.7 
Disable SELinux:

/etc/selinux/config: SELINUX=disabled

Add the hadoop group and user

$groupadd -g 1000 hadoop
$useradd -u 2000 -g hadoop hadoop
$mkdir -p /home/data/app/hadoop
$chown -R hadoop:hadoop /home/data/app/hadoop
$passwd hadoop

Configure passwordless SSH login

$ssh-keygen -t rsa
$cd /home/hadoop/.ssh
$cp id_rsa.pub authorized_keys_hadoop1
$scp authorized_keys_hadoop2 hadoop@hadoop1:/home/hadoop/.ssh
$scp authorized_keys_hadoop3 hadoop@hadoop1:/home/hadoop/.ssh

On hadoop1, merge the keys with cat authorized_keys_hadoop1 >> authorized_keys, then distribute the merged key file to the other nodes:

$scp authorized_keys hadoop@hadoop2:/home/hadoop/.ssh
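As a quick check, assuming the keys were merged and distributed as above, each node should now reach the others without a password prompt:

$ssh hadoop@hadoop2 hostname

If a password is still requested, file permissions are the usual cause: ~/.ssh must be 700 and authorized_keys must be 600.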

  • 1.1 Install the JDK

JDK 1.8 is recommended.
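A minimal sketch of the JDK environment setup, assuming the JDK has been unpacked to /home/data/software/jdk1.8.0_121 (the path used later in this guide); add these lines to the hadoop user's ~/.bash_profile and verify:

export JAVA_HOME=/home/data/software/jdk1.8.0_121
export PATH=$JAVA_HOME/bin:$PATH

$java -version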

  • 1.2 Install and configure protobuf

Note: this package can only be built after gcc is installed; otherwise configure fails with an error that no gcc compiler can be found.
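If gcc is missing, install the toolchain first; a sketch assuming a yum-based system such as CentOS (package names differ on other distributions, and protobuf also needs the C++ compiler):

#yum install -y gcc gcc-c++ make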

  • 1.2.1 Download the protobuf package

Version 2.5 or later is recommended.
Download link: https://code.google.com/p/protobuf/downloads/list

  • 1.2.2 Upload the protobuf-2.5.0.tar.gz package to the /home/data/software directory using an SSH tool


  • 1.2.3 Extract the package


$tar -zxvf protobuf-2.5.0.tar.gz

  • 1.2.4 Move the protobuf-2.5.0 directory to /usr/local

$sudo mv protobuf-2.5.0 /usr/local 

  • 1.2.5 Build and install from the source directory

Enter the directory and run the following commands as root:

#./configure
#make
#make check
#make install


  • 1.2.6 Verify the installation

After the build completes successfully, verify the installation as follows:

#protoc

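Running protoc with no arguments only prints a usage message, so checking the version is a more telling test; the expected output assumes the 2.5.0 build from this guide:

#protoc --version
libprotoc 2.5.0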

2 Install Hadoop

  • 2.1 Upload, extract, and create directories

Upload the Hadoop package, extract it with tar -zxvf, and create the tmp, name, and data directories.
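A fuller sketch of this step, assuming the uploaded package is named hadoop-2.8.0.tar.gz (the exact file name is not given above) and using the install path created in section 1; the tmp, name, and data directories match the paths referenced in the configuration files below:

$cd /home/data/app/hadoop
$tar -zxvf hadoop-2.8.0.tar.gz
$cd hadoop-2.8.0
$mkdir tmp name data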
  • 2.2 Hadoop core configuration

Configuration path: /home/data/app/hadoop/etc/hadoop

core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://shulaibao1:9010</value>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://shulaibao1:9010</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/home/data/app/hadoop/hadoop-2.8.0/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>hadoop.proxyuser.hduser.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hduser.groups</name>
    <value>*</value>
  </property>
</configuration>

hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>shulaibao1:9011</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/data/app/hadoop/hadoop-2.8.0/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/data/app/hadoop/hadoop-2.8.0/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>

mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>shulaibao1:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>shulaibao1:19888</value>
  </property>
</configuration>

 

yarn-site.xml

<?xml version="1.0"?>
<!-- Site specific YARN configuration properties -->
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>shulaibao1:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>shulaibao1:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>shulaibao1:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>shulaibao1:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>shulaibao1:8088</value>
  </property>
</configuration>

 

slaves
shulaibao1 
shulaibao2

  • 2.3 hadoop-env.sh and yarn-env.sh environment configuration

Add the following environment variables to /home/hadoop/.bash_profile:

export JAVA_HOME=/home/data/software/jdk1.8.0_121
export HADOOP_HOME=/home/data/app/hadoop/hadoop-2.8.0
export PATH=$PATH:/home/data/app/hadoop/hadoop-2.8.0/bin

In hadoop-env.sh, modify the export line:

export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"$HADOOP_HOME/etc/hadoop"}
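The heading also covers yarn-env.sh, whose contents are not shown above; a common companion change (an assumption here, not taken from the original) is to pin JAVA_HOME explicitly in both env scripts:

export JAVA_HOME=/home/data/software/jdk1.8.0_121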
  • 2.4 Distribute to the other nodes

Copy the configured Hadoop directory to every node with scp -r source target, as sketched below.
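A minimal sketch, assuming the other node is shulaibao2 and the same parent directory exists there:

$scp -r /home/data/app/hadoop/hadoop-2.8.0 hadoop@shulaibao2:/home/data/app/hadoop/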

  • 2.5 Verify HDFS

Path: /home/data/app/hadoop/hadoop-2.8.0/bin

  • Format the NameNode

$./bin/hdfs namenode -format

  • Start HDFS

$./start-dfs.sh

  • Check the Java processes with jps

Master (shulaibao1): NameNode, SecondaryNameNode, and a DataNode (shulaibao1 is also listed in slaves)
Slave (shulaibao2): DataNode
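As an additional check (not part of the original steps), the HDFS report, run from the Hadoop home directory, should list both DataNodes once the daemons are up:

$./bin/hdfs dfsadmin -report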

3 Install Spark

  • 3.1 Download, upload, and extract
  • 3.2 Base environment configuration

Add to /etc/profile:

export SPARK_HOME=/home/data/app/hadoop/spark-2.1.1-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
  • 3.3 Spark core configuration

Edit /home/data/app/hadoop/spark-2.1.1-bin-hadoop2.7/conf/spark-env.sh:

export SPARK_MASTER_IP=shulaibao2
export SPARK_MASTER_PORT=7077
export SPARK_WORKER_CORES=1
export SPARK_WORKER_INSTANCES=1
export SPARK_WORKER_MEMORY=512M
export SPARK_LOCAL_IP=192.168.0.251
export PYTHONH

Edit /home/data/app/hadoop/spark-2.1.1-bin-hadoop2.7/conf/slaves:

shulaibao1
shulaibao2
  • 3.4 Distribute to the other machines, as sketched below
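A minimal sketch of the distribution step, assuming the configuration above was done on shulaibao2 and the same parent directory exists on the other node:

$scp -r /home/data/app/hadoop/spark-2.1.1-bin-hadoop2.7 hadoop@shulaibao1:/home/data/app/hadoop/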

  • 3.5 Start Spark and verify

$cd /home/data/app/hadoop/spark-2.1.1-bin-hadoop2.7/sbin
$./start-all.sh

Master (shulaibao2): the jps output should include a Master and, since shulaibao2 is also listed in slaves, a Worker
Slave (shulaibao1): Worker

Spark web UI: http://192.168.0.252:8082/
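As a functional check (an addition to the original steps), the bundled SparkPi example can be submitted against the standalone master configured above; the examples jar path below is the standard layout of the spark-2.1.1-bin-hadoop2.7 distribution:

$cd /home/data/app/hadoop/spark-2.1.1-bin-hadoop2.7
$./bin/spark-submit --master spark://shulaibao2:7077 --class org.apache.spark.examples.SparkPi examples/jars/spark-examples_2.11-2.1.1.jar 10

The job should print a line like "Pi is roughly 3.14..." and appear as a completed application in the web UI.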

