1. Basic concepts
01. Big data clusters
HDFS cluster:
responsible for storing massive amounts of data;
main roles in the cluster: NameNode / DataNode
YARN cluster:
responsible for scheduling resources when computing over that data;
main roles in the cluster: ResourceManager / NodeManager
The HDFS and YARN clusters are logically separate but usually co-located physically.
Spark cluster:
responsible for large-scale computation;
main roles in the cluster: Master / Worker,
plus Driver / Executor at the application level
02. Mapping between services and IPs
03. Ports
HDFS web UI: 50070
YARN web UI: 8088
YARN ResourceManager application-manager (client RPC) port: 8032
HistoryServer web UI: 19888
Hive web UI: 10002
Spark web UI port: 8080
ZooKeeper client port: 2181
ZooKeeper's built-in CLI commands handle create/read/update/delete; no built-in web UI was found.
2. Configuring the Worker
Worker exec directory: /tmp/dolphinscheduler/exec/process/
1. Configure the HDFS service
01. The client and its configuration files
docker cp ~/soft/work_conf_hdfs_yarn/yarn-site.xml docker-swarm-dolphinscheduler-worker-1:/opt/soft/hadoop/etc/hadoop/
02. The configuration lives in the following files:
Enter the container: docker exec -it docker-swarm-dolphinscheduler-worker-1 /bin/bash
hostname -i
core-site.xml,
hdfs-site.xml,
mapred-site.xml,
yarn-site.xml
Of these, the key settings are:
core-site.xml:
fs.default.name (the default filesystem URI; newer configs use fs.defaultFS)
hdfs-site.xml
mapred-site.xml
yarn-site.xml:
yarn.resourcemanager.address
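As a quick sanity check, these key properties can be read straight out of the *-site.xml files with the standard library. The sketch below uses an illustrative core-site.xml snippet (host and port are made up, not the cluster's real values):

```python
import xml.etree.ElementTree as ET

def read_property(xml_text, name):
    """Return the <value> of the named <property> in a Hadoop *-site.xml."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None

# Illustrative snippet; the real files live under /opt/soft/hadoop/etc/hadoop/
core_site = """<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://2.2.2.12:8020</value></property>
</configuration>"""

print(read_property(core_site, "fs.defaultFS"))  # hdfs://2.2.2.12:8020
```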
03. Configure the local environment
environment variables
04. Cluster details
learn the relevant cluster settings so they can be mirrored locally
05. Concrete configuration
Add the following property to mapred-site.xml:
<property>
<name>hdp.version</name>
<value>3.0.1.0-187</value>
</property>
2. Configure the Spark service
Spark provides a single unified tool, spark-submit, for submitting jobs to any of its cluster managers.
To run in cluster mode, pass the --master option.
Config file location
Run modes:
Spark-Local (client)
Spark-YARN (cluster)
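The two run modes map onto spark-submit flags roughly as follows; this is only a sketch of the command line (pi.py is a placeholder application, not a DS-specific invocation):

```python
def spark_submit_cmd(mode, app="pi.py"):
    """Build the spark-submit argv for the two run modes noted above."""
    if mode == "local":
        # Spark-Local: driver runs in the submitting process (client mode)
        return ["spark-submit", "--master", "local[*]", app]
    if mode == "yarn-cluster":
        # Spark-YARN: driver runs inside the YARN cluster
        return ["spark-submit", "--master", "yarn",
                "--deploy-mode", "cluster", app]
    raise ValueError(mode)

print(" ".join(spark_submit_cmd("yarn-cluster")))
# spark-submit --master yarn --deploy-mode cluster pi.py
```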
Error types
1. $JAVA_HOME does not exist
- -> welcome to use bigdata scheduling system...
ERROR: JAVA_HOME is not set and could not be found.
Troubleshooting and fixes
01. On each worker node:
run java -version to check whether the JDK is installed;
run export to check whether the JDK environment variables are set.
02. In a cluster environment the error can still occur even when every node has JAVA_HOME configured correctly.
Fix: explicitly re-declare JAVA_HOME in hadoop-env.sh:
export JAVA_HOME=/usr/local/openjdk-8
2. ResourceManager address configuration
INFO retry.RetryInvocationHandler: java.net.ConnectException:
Call From fac4f*d3**/***.**.0.5 to **.**.**.**:*32 failed on connection exception:
java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused,
while invoking ApplicationClientProtocolPBClientImpl.getNewApplication over null after 1 failover attempts.
Trying to failover after sleeping for 30973ms.
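A "Connection refused" at this point usually means the yarn.resourcemanager.address in the worker's yarn-site.xml does not match where the RM actually listens. A quick reachability probe (host and the 8032 default port are placeholders):

```python
import socket

def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# False here means the configured RM address/port is wrong or unreachable.
print(can_connect("127.0.0.1", 8032))
```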
3. mr-framework error
java.lang.IllegalArgumentException: Could not locate MapReduce framework name 'mr-framework' in mapreduce.application.classpath
java.lang.IllegalArgumentException: Unable to parse '/hdp/apps/${hdp.version}/mapreduce/mapreduce.tar.gz#mr-framework' as a URI,
check the setting for mapreduce.application.framework.path
Fix:
do not set the mapreduce.framework.name parameter inside the MR program's own project
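The "Unable to parse ... ${hdp.version} ... as a URI" variant of this error means the hdp.version property was never substituted into mapreduce.application.framework.path. A simplified illustration of that substitution (this mimics, not reproduces, Hadoop's property expansion):

```python
def resolve(path, props):
    """Expand ${name} placeholders in a Hadoop-style path."""
    for name, value in props.items():
        path = path.replace("${%s}" % name, value)
    return path

raw = "/hdp/apps/${hdp.version}/mapreduce/mapreduce.tar.gz#mr-framework"
# With hdp.version set (as in the mapred-site.xml property), the path
# becomes a concrete, parseable URI:
print(resolve(raw, {"hdp.version": "3.0.1.0-187"}))
# -> /hdp/apps/3.0.1.0-187/mapreduce/mapreduce.tar.gz#mr-framework
```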
4.ENOENT: No such file or directory
ENOENT: No such file or directory
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmodImpl(Native Method)
Check the input arguments; watch out for stray spaces.
INFO mapred.MapTask: Processing split: hdfs://**.**.**.**:*20/data/mywork.txt:0+37
5. Spark succeeds in local and cluster mode, but client mode fails.
cluster mode: succeeds
client mode: fails. Cause: in the containerized deployment, communication between the worker and the other Spark nodes relies on hostnames, but the other Spark nodes cannot resolve the worker's host.
Observation: the cluster nodes need to connect back to my machine; they fetch my task pi.py into a node-local temp directory /tmp/spark-xxx/ and copy it into $SPARK_HOME/work/ before it actually executes.
Exception in thread "main" org.apache.spark.SparkException:
When running with master 'yarn' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the environment.
/opt/soft/spark2/bin/spark-class: line 71: /usr/jdk64/java/bin/java: No such file or directory
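Both messages above are environment problems on the submitting worker: --master yarn requires HADOOP_CONF_DIR or YARN_CONF_DIR, and the Java path baked into the config must actually exist. A pre-flight check along these lines (the function name is mine, not DS's):

```python
import os

def missing_yarn_env(env):
    """Return what spark-submit --master yarn would complain about."""
    if "HADOOP_CONF_DIR" in env or "YARN_CONF_DIR" in env:
        return []
    return ["HADOOP_CONF_DIR or YARN_CONF_DIR"]

print(missing_yarn_env(os.environ))
# Also verify the Java path from the second error exists on this node:
print(os.path.exists("/usr/jdk64/java/bin/java"))
```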
6.Permission denied
org.apache.hadoop.security.AccessControlException: Permission denied: user=linuxdis, access=EXECUTE, inode="/tmp/hadoop-yarn":linuxfirst:hdfs:drwx------
Fix:
delete the temporary files under /tmp/hadoop-yarn on HDFS
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=linuxdis, access=EXECUTE,
inode="/tmp/hadoop-yarn":linuxfirst:hdfs:drwx------
org.apache.hadoop.security.AccessControlException: Permission denied: user=hadoop, access=EXECUTE, inode="/tmp/hadoop-yarn":linuxfirst:hdfs:drwx------
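The "drwx------" in these errors is the whole story: /tmp/hadoop-yarn is owned by linuxfirst:hdfs with mode 700, so any other user (linuxdis, hadoop) is denied EXECUTE on the directory. Decoding the mode string:

```python
def other_can_execute(mode_string):
    """mode_string like 'drwx------': the last character is the
    execute bit of the 'other' permission triad."""
    return mode_string[-1] == "x"

print(other_can_execute("drwx------"))  # False: other users are denied
print(other_can_execute("drwxr-xr-x"))  # True
```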
7. Output directory conflicts
org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://**.**.**.**:*20/d/out2 already exists
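MapReduce refuses to overwrite an existing output directory, so the usual fix is deleting it before resubmitting (hadoop fs -rm -r on the real cluster). The same idea on a local path, with a throwaway output directory:

```python
import os, shutil, tempfile

def clear_output_dir(path):
    """Remove the job output directory if a previous run left it behind."""
    if os.path.exists(path):
        shutil.rmtree(path)

out = os.path.join(tempfile.gettempdir(), "out2")
os.makedirs(out, exist_ok=True)   # simulate leftovers from a previous run
clear_output_dir(out)
print(os.path.exists(out))  # False
```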
Configuring HDFS on the DS worker
Copy the worker's HDFS and YARN configuration into place:
docker cp ~/work_conf_hdfs_yarn/core-site.xml docker-swarm-dolphinscheduler-worker-1:/opt/soft/hadoop/etc/hadoop/
docker cp ~/work_conf_hdfs_yarn/mapred-site.xml docker-swarm-dolphinscheduler-worker-1:/opt/soft/hadoop/etc/hadoop/
docker cp ~/work_conf_hdfs_yarn/yarn-site.xml docker-swarm-dolphinscheduler-worker-1:/opt/soft/hadoop/etc/hadoop/
docker cp ~/work_conf_hdfs_yarn/hadoop-env.sh docker-swarm-dolphinscheduler-worker-1:/opt/soft/hadoop/etc/hadoop/
core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://2.2.2.12:*20</value>
<final>true</final>
</property>
<property>
<name>hadoop.proxyuser.hive.hosts</name>
<value>2.2.2.11</value>
</property>
<property>
<name>hadoop.proxyuser.yarn.hosts</name>
<value>2.2.2.12,2.2.2.13</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>hdp.version</name>
<value>3.37</value>
</property>
<property>
<name>mapreduce.application.classpath</name>
<value>/usr/hdp/3.37/hadoop/conf:/usr/hdp/3.37/hadoop/lib/*:/usr/hdp/3.37/hadoop/.//*:/usr/hdp/3.37/hadoop-hdfs/./:/usr/hdp/3.37/hadoop-hdfs/lib/*:/usr/hdp/3.37/hadoop-hdfs/.//*:/usr/hdp/3.37/hadoop-mapreduce/lib/*:/usr/hdp/3.37/hadoop-mapreduce/.//*:/usr/hdp/3.37/hadoop-yarn/./:/usr/hdp/3.37/hadoop-yarn/lib/*:/usr/hdp/3.37/hadoop-yarn/./*</value>
</property>
</configuration>
yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>2.2.2.13</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>2.2.2.13:*50</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>2.2.2.13:*25</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>2.2.2.13:*30</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle,spark2_shuffle</value>
</property>
<property>
<name>yarn.application.classpath</name>
<value>$HADOOP_CONF_DIR,/usr/hdp/3.37/hadoop/*,/usr/hdp/3.37/hadoop/lib/*,/usr/hdp/current/hadoop-hdfs-client/*,/usr/hdp/current/hadoop-hdfs-client/lib/*,/usr/hdp/current/hadoop-yarn-client/*,/usr/hdp/current/hadoop-yarn-client/lib/*</value>
</property>
</configuration>
hadoop-env.sh: set the worker's Java path
export JAVA_HOME=/usr/local/openjdk-8
Configuring Spark on the DS worker
Under normal circumstances only spark-env.sh needs to be configured.
docker cp ~/work_conf_spark/core-site.xml docker-swarm-dolphinscheduler-worker-1:/opt/soft/spark2/conf
docker cp ~/work_conf_spark/mapred-site.xml docker-swarm-dolphinscheduler-worker-1:/opt/soft/spark2/conf
docker cp ~/work_conf_spark/yarn-site.xml docker-swarm-dolphinscheduler-worker-1:/opt/soft/spark2/conf
docker cp ~/work_conf_spark/spark-env.sh docker-swarm-dolphinscheduler-worker-1:/opt/soft/spark2/conf
spark-env.sh: point the worker at the HDFS and YARN information:
export YARN_CONF_DIR=/usr/hdp/3.37/hadoop/
export HADOOP_CONF_DIR=/usr/hdp/3.37/hadoop/
export SPARK_MASTER_IP=2.2.2.17
export SPARK_MASTER_PORT=7077
export SPARK_YARN_USER_ENV=/usr/hdp/3.37/hadoop-yarn/etc/hadoop/
Here the YARN configuration differs:
<property>
<name>yarn.resourcemanager.address</name>
<value>2.2.2:*32</value>
</property>
Configure the workers
Run database (SQL) tasks
The local worker runs: Python, shell, HDFS tasks
The distributed cluster runs: MapReduce, Spark
Databases: MySQL, Hive
Linux users: root (highest privilege) or a user you create yourself, e.g. userA
versus the operator accounts created inside the DolphinScheduler system itself
The MySQL driver jar I copied into lib was version 8, while the local MySQL that DS uses is 5.7; after restarting, the service no longer knew which driver to use, could not connect to the local DB, and therefore the UI login failed too.
Running DataX tasks
# Delete the hidden files beginning with an underscore from the plugin directory (without this, DataX reports that the plugin cannot be found, oddly)
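The cleanup step above can be scripted; the sketch below walks a directory and removes underscore-prefixed files, demonstrated on a throwaway temp directory (the real DataX plugin path differs per install):

```python
import os, tempfile

def remove_underscore_files(plugin_dir):
    """Delete files whose names start with '_' anywhere under plugin_dir."""
    removed = []
    for root, _dirs, files in os.walk(plugin_dir):
        for name in files:
            if name.startswith("_"):
                path = os.path.join(root, name)
                os.remove(path)
                removed.append(path)
    return removed

# Demo on a disposable directory:
demo = tempfile.mkdtemp()
open(os.path.join(demo, "_stray"), "w").close()
open(os.path.join(demo, "plugin.json"), "w").close()
remove_underscore_files(demo)
print(sorted(os.listdir(demo)))  # ['plugin.json']
```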
References
Starting Hadoop reports "Error: JAVA_HOME is not set and could not be found" https://www.cnblogs.com/codeOfLife/p/5940642.html
Spark client vs. cluster run modes: workflow and basic concepts https://blog.csdn.net/m0_37758017/article/details/80469263
DolphinScheduler field issue: wrong tenant selected, wrong permissions https://blog.csdn.net/u010978399/article/details/122987214
[DolphinScheduler] Submitting a multi-file PySpark project to a YARN cluster https://blog.csdn.net/hyj_king/article/details/122976748
[DolphinScheduler] Adventures adding MySQL and Oracle data sources https://www.cnblogs.com/pyhy/p/15900607.html