1. Problem description:
(1) Example:
[Hadoop@master TestDir]$ sqoop import --connect jdbc:mysql://master:3306/source?useSSL=false --username Hive --password ****** --table customer --hive-import --hive-table rds.customer --hive-overwrite
2021-11-05 20:13:59,638 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
2021-11-05 20:13:59,664 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
2021-11-05 20:13:59,665 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
2021-11-05 20:13:59,665 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
2021-11-05 20:13:59,743 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
2021-11-05 20:13:59,748 INFO tool.CodeGenTool: Beginning code generation
Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
2021-11-05 20:14:00,022 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `customer` AS t LIMIT 1
2021-11-05 20:14:00,369 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `customer` AS t LIMIT 1
2021-11-05 20:14:00,377 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/grid/Hadoop/hadoop-3.3.1
...... (output omitted)
2021-11-05 20:15:51,825 INFO mapreduce.ImportJobBase: Transferred 807 bytes in 102.2706 seconds (7.8908 bytes/sec)
2021-11-05 20:15:51,828 INFO mapreduce.ImportJobBase: Retrieved 14 records.
2021-11-05 20:15:51,828 INFO mapreduce.ImportJobBase: Publishing Hive/Hcat import job data to Listeners for table customer
2021-11-05 20:15:51,847 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `customer` AS t LIMIT 1
2021-11-05 20:15:52,005 INFO hive.HiveImport: Loading uploaded data into Hive
2021-11-05 20:15:52,009 ERROR hive.HiveConfig: Could not load org.apache.hadoop.hive.conf.HiveConf. Make sure HIVE_CONF_DIR is set correctly.
2021-11-05 20:15:52,010 ERROR tool.ImportTool: Import failed: java.io.IOException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
at org.apache.sqoop.hive.HiveConfig.getHiveConf(HiveConfig.java:50)
at org.apache.sqoop.hive.HiveImport.getHiveArgs(HiveImport.java:392)
at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:379)
at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:337)
at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:241)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:537)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:628)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:355)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at org.apache.sqoop.hive.HiveConfig.getHiveConf(HiveConfig.java:44)
... 12 more
(2) Problem summary:
The core of the problem is the two ERROR lines in the example above, namely:
2021-11-05 20:15:52,009 ERROR hive.HiveConfig: Could not load org.apache.hadoop.hive.conf.HiveConf. Make sure HIVE_CONF_DIR is set correctly.
2021-11-05 20:15:52,010 ERROR tool.ImportTool: Import failed: java.io.IOException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
These lines point at two distinct issues.
2. Problem analysis:
(1) References:
Reference 1: https://www.cnblogs.com/drl-blogs/p/11086865.html
Reference 2: https://blog.csdn.net/lianghecai52171314/article/details/104454332
(2) Analysis and troubleshooting, based on the references and the error output above:
For issue 1, "ERROR hive.HiveConfig: Could not load org.apache.hadoop.hive.conf.HiveConf. Make sure HIVE_CONF_DIR is set correctly.": the natural suspicion is that the Hive environment is not set up properly and the HIVE_CONF_DIR environment variable cannot be resolved, in particular the location of Hive's jar libraries. A misconfigured HIVE_CONF_DIR can indeed produce this error. However, given that Hive itself was working normally for the author, and working backwards from the eventual fix, this turned out not to be the root cause here: the author tried various values for this environment variable, and none of them made the error go away.
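For reference, HIVE_CONF_DIR is typically exported in the shell profile (or in Sqoop's conf/sqoop-env.sh). A minimal sketch follows; the Hive install path shown is a hypothetical example and must be adjusted to your environment:

```shell
# Hypothetical Hive install path -- replace with your actual location.
export HIVE_HOME=/home/grid/Hive/apache-hive-3.1.2-bin
# Point HIVE_CONF_DIR at Hive's configuration directory.
export HIVE_CONF_DIR=$HIVE_HOME/conf
export PATH=$PATH:$HIVE_HOME/bin
```

As noted above, in this case correcting these variables alone did not resolve the error.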
For issue 2, "ERROR tool.ImportTool: Import failed: java.io.IOException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf": this is the root cause of the failure. Following Reference 2, the author tracked it down and found the correct fix.
The actual cause of the error:
Sqoop's classpath is missing hive-common-*.jar. Copying hive-common-3.1.2.jar from Hive's lib directory into Sqoop's lib directory resolves the problem.
3. Solutions:
Approach 1: a false lead (ruled out)
(1) Check: verify whether hive-common-*.jar under Hive's lib directory actually contains the class HiveConf.class.
[Hadoop@master lib]$ ls hive-common*
hive-common-3.1.2.jar
[Hadoop@master lib]$ jar tf hive-common-3.1.2.jar | grep HiveConf.class
org/apache/hadoop/hive/conf/HiveConf.class
The output above shows that hive-common-*.jar is present in Hive's lib directory and contains the class in question.
(2) Configure the environment by adding Hive's lib directory to HADOOP_CLASSPATH.
This did not fix the problem in the author's case, though it could conceivably cause this error in other setups (not verified by the author).
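A sketch of that classpath change, assuming $HIVE_HOME points at the Hive installation (typically placed in ~/.bashrc or in hadoop-env.sh):

```shell
# Append Hive's jars to Hadoop's classpath; $HIVE_HOME is assumed
# to be set to the Hive install directory.
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HIVE_HOME/lib/*
```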
Approach 2: the correct fix
(1) Copy hive-common-3.1.2.jar
Copy hive-common-3.1.2.jar from Hive's lib directory into Sqoop's lib directory.
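A sketch of the copy, assuming $HIVE_HOME and $SQOOP_HOME point at the respective installations (both are environment-specific):

```shell
# Copy the missing jar so Sqoop can load org.apache.hadoop.hive.conf.HiveConf.
cp $HIVE_HOME/lib/hive-common-3.1.2.jar $SQOOP_HOME/lib/
```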
(2) Re-test: the last line of the output below confirms the fix; Sqoop successfully imported the MySQL table into the Hive database with a full overwrite.
[Hadoop@master TestDir]$ sqoop import --connect jdbc:mysql://master:3306/source?useSSL=false --username Hive --password ******* --table customer --hive-import --hive-table rds.customer --hive-overwrite
2021-11-06 00:41:17,492 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
2021-11-06 00:41:17,516 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
2021-11-06 00:41:17,516 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
2021-11-06 00:41:17,516 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
2021-11-06 00:41:17,579 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
2021-11-06 00:41:17,584 INFO tool.CodeGenTool: Beginning code generation
Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
2021-11-06 00:41:17,831 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `customer` AS t LIMIT 1
2021-11-06 00:41:17,874 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `customer` AS t LIMIT 1
2021-11-06 00:41:17,880 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/Hadoop/Hadoop/hadoop-3.3.1
...... (output omitted)
2021-11-06 00:42:20,755 INFO hive.HiveImport: OK
2021-11-06 00:42:20,756 INFO hive.HiveImport: 2021-11-06 00:42:20,755 INFO [40671dec-9d11-4d68-b514-9df6685af945 main] ql.Driver (SessionState.java:printInfo(1227)) - OK
2021-11-06 00:42:20,756 INFO hive.HiveImport: 2021-11-06 00:42:20,755 INFO [40671dec-9d11-4d68-b514-9df6685af945 main] lockmgr.DbTxnManager (DbTxnManager.java:stopHeartbeat(846)) - Stopped heartbeat for query: grid_20211106004218_3df662fa-e868-477a-a8db-eb11ce9534fe
2021-11-06 00:42:20,854 INFO hive.HiveImport: Time taken: 2.147 seconds
2021-11-06 00:42:20,855 INFO hive.HiveImport: 2021-11-06 00:42:20,854 INFO [40671dec-9d11-4d68-b514-9df6685af945 main] CliDriver (SessionState.java:printInfo(1227)) - Time taken: 2.147 seconds
2021-11-06 00:42:20,855 INFO hive.HiveImport: 2021-11-06 00:42:20,855 INFO [40671dec-9d11-4d68-b514-9df6685af945 main] conf.HiveConf (HiveConf.java:getLogIdVar(5040)) - Using the default value passed in for log id: 40671dec-9d11-4d68-b514-9df6685af945
2021-11-06 00:42:20,856 INFO hive.HiveImport: 2021-11-06 00:42:20,855 INFO [40671dec-9d11-4d68-b514-9df6685af945 main] session.SessionState (SessionState.java:resetThreadName(452)) - Resetting thread name to main
2021-11-06 00:42:20,856 INFO hive.HiveImport: 2021-11-06 00:42:20,856 INFO [main] conf.HiveConf (HiveConf.java:getLogIdVar(5040)) - Using the default value passed in for log id: 40671dec-9d11-4d68-b514-9df6685af945
2021-11-06 00:42:20,907 INFO hive.HiveImport: 2021-11-06 00:42:20,905 INFO [main] session.SessionState (SessionState.java:dropPathAndUnregisterDeleteOnExit(885)) - Deleted directory: /user/hive/tmp/grid/40671dec-9d11-4d68-b514-9df6685af945 on fs with scheme hdfs
2021-11-06 00:42:20,907 INFO hive.HiveImport: 2021-11-06 00:42:20,906 INFO [main] session.SessionState (SessionState.java:dropPathAndUnregisterDeleteOnExit(885)) - Deleted directory: /tmp/hive/Local/40671dec-9d11-4d68-b514-9df6685af945 on fs with scheme file
2021-11-06 00:42:20,907 INFO hive.HiveImport: 2021-11-06 00:42:20,906 INFO [main] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:close(600)) - Closed a connection to metastore, current connections: 1
2021-11-06 00:42:21,038 INFO hive.HiveImport: Hive import complete.