Installation only requires following the official documentation.
Setting up CDH (6.3.0) on CentOS 7
Official documentation: https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/install_cm_cdh.html
LZO installation: https://blog.csdn.net/lingeio/article/details/94438582
Sqoop
There are three steps in total:
- Adding the Sqoop 1 Client
- Installing the JDBC Drivers for Sqoop 1
  - Download the JDBC driver and place it in the directory /var/lib/sqoop/
- Setting HADOOP_MAPRED_HOME for Sqoop 1
  - Add the environment variable HADOOP_MAPRED_HOME to /etc/profile
export HADOOP_MAPRED_HOME=/opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/bin
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_MAPRED_HOME
The directory /opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/bin contains the mapred executable.
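After editing /etc/profile, a quick sanity check along the following lines may help. This is only a sketch: `check_mapred_home` is a hypothetical helper name, not part of CDH or Sqoop, and it only verifies that the directory it is given contains an executable named mapred.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when "$1" is a directory that contains
# an executable file named "mapred", and fails otherwise.
check_mapred_home() {
  dir=$1
  if [ -d "$dir" ] && [ -x "$dir/mapred" ]; then
    echo "ok: $dir/mapred"
    return 0
  fi
  echo "missing: $dir/mapred" >&2
  return 1
}

# Usage (assumes /etc/profile has been re-read, e.g. with `. /etc/profile`):
# check_mapred_home "$HADOOP_MAPRED_HOME"
```

If the check fails, the parcel directory name in the export line most likely does not match the parcel that is actually installed.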
In-depth analysis of the CDH installation directory: https://blog.csdn.net/wj1314250/article/details/86494703
Hadoop management tools, understanding the CDH directory structure: https://blog.csdn.net/zzq900503/article/details/79045955
Test
sqoop list-databases --connect jdbc:mysql://localhost:3306 --username root --password 000000
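The command above puts the password on the command line, where it ends up in the shell history. A possible variant, sketched below, uses Sqoop's `-P` option so that the password is read from the console instead. `list_databases` and the `SQOOP_BIN` variable are hypothetical names introduced for this example only.

```shell
#!/bin/sh
# Hypothetical wrapper: runs the same connectivity check, but lets Sqoop
# prompt for the password (-P) instead of passing it with --password.
# SQOOP_BIN is an assumed override for the binary name; it defaults to "sqoop".
list_databases() {
  jdbc_url=$1
  db_user=$2
  "${SQOOP_BIN:-sqoop}" list-databases --connect "$jdbc_url" --username "$db_user" -P
}

# Usage:
# list_databases jdbc:mysql://localhost:3306 root
```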
Oozie
Error:
WARN org.apache.oozie.command.wf.ActionStartXCommand: SERVER[node01] USER[yarn] GROUP[-] TOKEN[] APP[gmv] JOB[0000000-191123140141726-oozie-oozi-W] ACTION[0000000-191123140141726-oozie-oozi-W@shell-e6c8] Error starting action [shell-e6c8]. ErrorType [TRANSIENT], ErrorCode [JA009], Message [JA009: Invalid resource request! Cannot allocate containers as requested resource is greater than maximum allowed allocation. Requested resource type=[memory-mb], Requested resource=<memory:2048, vCores:1>, maximum allowed allocation=<memory:1024, vCores:2>, please note that maximum allowed allocation is calculated by scheduler based on maximum resource of registered NodeManagers, which might be less than configured maximum allocation=<memory:1024, vCores:2>
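The cause is that two settings in yarn-site.xml are too small to satisfy the job's resource request: the action asks for 2048 MB, but the maximum allowed allocation is 1024 MB.
Increase both settings in yarn-site.xml to at least the requested amount:
Container memory: yarn.nodemanager.resource.memory-mb
Maximum container memory: yarn.scheduler.maximum-allocation-mb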
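The error message notes that the maximum allowed allocation is derived from what the registered NodeManagers offer, so it can be lower than the configured maximum. That is why both values matter. The sketch below models the rule under the simplifying assumption of a single NodeManager; `effective_max_mb` and `fits_allocation` are hypothetical helper names, not YARN commands.

```shell
#!/bin/sh
# Assumption: one NodeManager, so the effective limit is the smaller of
# yarn.nodemanager.resource.memory-mb and yarn.scheduler.maximum-allocation-mb.
effective_max_mb() {
  nm_memory_mb=$1       # yarn.nodemanager.resource.memory-mb
  scheduler_max_mb=$2   # yarn.scheduler.maximum-allocation-mb
  if [ "$nm_memory_mb" -lt "$scheduler_max_mb" ]; then
    echo "$nm_memory_mb"
  else
    echo "$scheduler_max_mb"
  fi
}

# A request is accepted only when it does not exceed the effective limit (MB).
fits_allocation() {
  requested_mb=$1
  max_allowed_mb=$2
  [ "$requested_mb" -le "$max_allowed_mb" ]
}

# Values from the log: 2048 MB requested, both settings at 1024 MB.
if fits_allocation 2048 "$(effective_max_mb 1024 1024)"; then
  echo "request fits"
else
  echo "request rejected: raise both settings to at least 2048"
fi
```

Raising only one of the two settings would leave the effective limit at 1024 MB in this model, so the request would still be rejected.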
Error:
WARN org.apache.oozie.action.hadoop.ShellActionExecutor: SERVER[node01] USER[yarn] GROUP[-] TOKEN[] APP[gmv] JOB[0000000-191123140141726-oozie-oozi-W] ACTION[0000000-191123140141726-oozie-oozi-W@shell-9dc9] Launcher exception: output.properties data exceeds its limit [2048]
java.io.IOException: output.properties data exceeds its limit [2048]
    at org.apache.oozie.action.hadoop.LocalFsOperations.getLocalFileContentAsString(LocalFsOperations.java:86)
    at org.apache.oozie.action.hadoop.LauncherAM.processActionData(LauncherAM.java:521)
    at org.apache.oozie.action.hadoop.LauncherAM.handleActionData(LauncherAM.java:501)
    at org.apache.oozie.action.hadoop.LauncherAM.run(LauncherAM.java:229)
    at org.apache.oozie.action.hadoop.LauncherAM$1.run(LauncherAM.java:153)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
    at org.apache.oozie.action.hadoop.LauncherAM.main(LauncherAM.java:141)
The output size limit defaults to 2048. Raise it in oozie-site.xml, then restart Oozie:
<property>
  <name>oozie.action.max.output.data</name>
  <value>204800</value>
</property>
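Before raising the limit, it can be useful to see how much a shell action's script actually prints. The sketch below is a rough approximation only: `output_bytes` is a hypothetical helper, and it counts stdout bytes locally, which may differ slightly from how Oozie measures the captured output.properties data.

```shell
#!/bin/sh
# Hypothetical helper: runs the given command and prints the number of
# bytes it writes to stdout.
output_bytes() {
  "$@" | wc -c | tr -d ' '
}

limit=2048   # default limit reported in the error message
# printf stands in for the real action script here.
size=$(output_bytes printf 'key=%s\n' value)
if [ "$size" -gt "$limit" ]; then
  echo "output is $size bytes, over the $limit limit"
else
  echo "output is $size bytes, within the $limit limit"
fi
```

If the script prints far more than it needs to, trimming its output may be a better fix than a very large oozie.action.max.output.data value.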
In CDH