Environment:
hadoop 3.1.1
hive 3.1.0
mysql 8.0.11
Before installing:
Have the mysql-connector-java-8.0.12.jar driver package ready
Upload the Hive tar package and extract it
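The extraction step can be sketched as a small shell helper. This is only an illustration: the function name extract_hive is hypothetical, and the archive name and the /opt/module/hive target in the example call are assumptions based on the paths used later in this post.

```shell
# Hypothetical helper: unpack a Hive tarball into a target directory.
extract_hive() {
    archive=$1   # assumed name, e.g. apache-hive-3.1.0-bin.tar.gz
    target=$2    # assumed location, e.g. /opt/module/hive
    mkdir -p "$target" || return 1
    # --strip-components=1 drops the top-level directory inside the archive,
    # so conf/, lib/ and bin/ end up directly under $target.
    tar -xzf "$archive" -C "$target" --strip-components=1
}

# Example call (adjust both paths to your machine):
# extract_hive apache-hive-3.1.0-bin.tar.gz /opt/module/hive
```

Stripping the top-level directory keeps the install path short, which matches the /opt/module/hive paths used in the configuration files below.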
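Step 1:
Go to hive/conf and copy hive-env.sh.template to hive-env.sh. The changes are on lines 48, 51, and 54, which set HADOOP_HOME, HIVE_CONF_DIR, and HIVE_AUX_JARS_PATH.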
1 # Licensed to the Apache Software Foundation (ASF) under one
2 # or more contributor license agreements. See the NOTICE file
3 # distributed with this work for additional information
4 # regarding copyright ownership. The ASF licenses this file
5 # to you under the Apache License, Version 2.0 (the
6 # "License"); you may not use this file except in compliance
7 # with the License. You may obtain a copy of the License at
8 #
9 # http://www.apache.org/licenses/LICENSE-2.0
10 #
11 # Unless required by applicable law or agreed to in writing, software
12 # distributed under the License is distributed on an "AS IS" BASIS,
13 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14 # See the License for the specific language governing permissions and
15 # limitations under the License.
16
17 # Set Hive and Hadoop environment variables here. These variables can be used
18 # to control the execution of Hive. It should be used by admins to configure
19 # the Hive installation (so that users do not have to set environment variables
20 # or set command line parameters to get correct behavior).
21 #
22 # The hive service being invoked (CLI etc.) is available via the environment
23 # variable SERVICE
24
25
26 # Hive Client memory usage can be an issue if a large number of clients
27 # are running at the same time. The flags below have been useful in
28 # reducing memory usage:
29 #
30 # if [ "$SERVICE" = "cli" ]; then
31 #   if [ -z "$DEBUG" ]; then
32 #     export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=12 -Xms10m -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:+UseParNewGC -XX:-UseGCOverheadLimit"
33 #   else
34 #     export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=12 -Xms10m -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:-UseGCOverheadLimit"
35 #   fi
36 # fi
37
38 # The heap size of the jvm stared by hive shell script can be controlled via:
39 #
40 # export HADOOP_HEAPSIZE=1024
41 #
42 # Larger heap size may be required when running queries over large number of files or partitions.
43 # By default hive shell scripts use a heap size of 256 (MB). Larger heap size would also be
44 # appropriate for hive server.
45
46
47 # Set HADOOP_HOME to point to a specific hadoop install directory
48 HADOOP_HOME=/opt/module/hadoop-3.1.1
49
50 # Hive Configuration Directory can be controlled by:
51 export HIVE_CONF_DIR=/opt/module/hive/conf
52
53 # Folder containing extra libraries required for hive compilation/execution can be controlled by:
54 export HIVE_AUX_JARS_PATH=/opt/module/hive/lib
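The same three settings can also be applied from a script instead of by hand. The sketch below is one possible approach, not the method used above: write_hive_env is a hypothetical helper, and it appends the settings to the end of the copied template rather than editing lines 48, 51, and 54 in place, on the assumption that the template leaves those variables commented out.

```shell
# Hypothetical helper: create hive-env.sh from the template and append
# the three settings shown in the listing above.
write_hive_env() {
    conf_dir=$1      # e.g. /opt/module/hive/conf
    hadoop_home=$2   # e.g. /opt/module/hadoop-3.1.1
    hive_home=$3     # e.g. /opt/module/hive
    cp "$conf_dir/hive-env.sh.template" "$conf_dir/hive-env.sh" || return 1
    {
        echo "HADOOP_HOME=$hadoop_home"
        echo "export HIVE_CONF_DIR=$hive_home/conf"
        echo "export HIVE_AUX_JARS_PATH=$hive_home/lib"
    } >> "$conf_dir/hive-env.sh"
}

# Example call with the paths from this post:
# write_hive_env /opt/module/hive/conf /opt/module/hadoop-3.1.1 /opt/module/hive
```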
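Step 2: Copy hive-default.xml.template to hive-site.xml. The file mainly holds the database connection settings, including the user name and password. Note that hive.metastore.schema.verification must be set to false.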
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?><!--
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements. See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License. You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
-->
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://127.0.0.1:3306/hive?useUnicode=true&amp;characterEncoding=utf8&amp;useSSL=false&amp;serverTimezone=GMT</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>

  <property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.cj.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>

  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>username to use against metastore database</description>
  </property>

  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123</value>
    <description>password to use against metastore database</description>
  </property>
  <property>
    <name>datanucleus.schema.autoCreateAll</name>
    <value>true</value>
  </property>
</configuration>
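Because a raw & inside an XML value is easy to get wrong, it may help to generate the file from a script that escapes the JDBC URL automatically. The following is a minimal sketch under that assumption: xml_escape, hive_property, and write_hive_site are hypothetical helper names, and the generated file contains only the six properties listed above, without the license comment or descriptions.

```shell
# XML-escape a value for use as element text (& first, then < and >).
xml_escape() {
    printf '%s' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Print one <property> element with an escaped value.
hive_property() {
    printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' \
        "$1" "$(xml_escape "$2")"
}

# Hypothetical generator for a minimal hive-site.xml.
write_hive_site() {
    out=$1; url=$2; user=$3; password=$4
    {
        echo '<?xml version="1.0" encoding="UTF-8" standalone="no"?>'
        echo '<configuration>'
        hive_property javax.jdo.option.ConnectionURL "$url"
        hive_property hive.metastore.schema.verification false
        hive_property javax.jdo.option.ConnectionDriverName com.mysql.cj.jdbc.Driver
        hive_property javax.jdo.option.ConnectionUserName "$user"
        hive_property javax.jdo.option.ConnectionPassword "$password"
        hive_property datanucleus.schema.autoCreateAll true
        echo '</configuration>'
    } > "$out"
}

# Example call; pass the URL with plain & and let the script escape it:
# write_hive_site /opt/module/hive/conf/hive-site.xml \
#   'jdbc:mysql://127.0.0.1:3306/hive?useUnicode=true&characterEncoding=utf8&useSSL=false&serverTimezone=GMT' \
#   root 123
```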
Step 3: Upload the MySQL driver jar to the hive/lib directory
Step 4: Create the hive database in MySQL
create database hive;
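Step 5: Go to the bin directory and run the following command (it specifies the metastore database type and initializes the schema)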
./schematool -dbType mysql -initSchema
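Steps 3 to 5 can be chained in one small function, sketched below. The name init_metastore is hypothetical, and the sketch assumes the mysql client is on the PATH and that the root account from hive-site.xml is used; the -p flag makes the client prompt for the password instead of reading it from the script.

```shell
# Hypothetical wrapper for steps 3 to 5: copy the driver, create the
# database, then initialize the metastore schema.
init_metastore() {
    hive_home=$1    # e.g. /opt/module/hive
    driver_jar=$2   # e.g. mysql-connector-java-8.0.12.jar
    cp "$driver_jar" "$hive_home/lib/" || return 1
    # Step 4: -p prompts for the root password interactively.
    mysql -u root -p -e 'create database hive;' || return 1
    # Step 5: run schematool from the bin directory, as in the post.
    (cd "$hive_home/bin" && ./schematool -dbType mysql -initSchema)
}

# Example call:
# init_metastore /opt/module/hive mysql-connector-java-8.0.12.jar
```

Each step stops the function on failure, so schematool never runs against a database that was not created.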
Step 6: Start Hive (Hadoop must be started first)
./hive
Once it has started, run show databases;
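Notes:
1. In hive-site.xml, the MySQL driver class name is com.mysql.cj.jdbc.Driver
2. In the XML file, every & in javax.jdo.option.ConnectionURL must be written as &amp;. Be sure to specify the character set and the time zone
3. I had already granted privileges to the database root user beforehand
4. To test inserting data with Hive, create the /user/hive/warehouse path on HDFS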
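For note 4, the warehouse directory can be created with the hdfs command line client. This is a sketch that assumes hdfs is on the PATH and HDFS is already running; create_warehouse is a hypothetical helper name, and the group-write permission follows the Hive getting-started guide rather than anything stated in this post.

```shell
# Hypothetical helper: create the Hive warehouse directory on HDFS.
create_warehouse() {
    warehouse=${1:-/user/hive/warehouse}
    # -p also creates the missing parent directories /user and /user/hive.
    hdfs dfs -mkdir -p "$warehouse" || return 1
    # Give the group write access so Hive can create table directories.
    hdfs dfs -chmod g+w "$warehouse"
}

# Example call (uses the default path):
# create_warehouse
```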
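Common problems:
1. If ./schematool -dbType mysql -initSchema reports something like "server code 255", the character set for the database connection was not specified. If the command ends with "failed", drop the hive database, create it again, and rerun the command.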
2.The server time zone value 'PDT' is unrecognized or represents more than one
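This is a time zone problem. Setting the connection URL to jdbc:mysql://127.0.0.1:3306/hive?useUnicode=true&characterEncoding=utf8&useSSL=false&serverTimezone=GMT fixes it (inside hive-site.xml, write each & as &amp;).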
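Both problems above come from parameters missing in the connection URL, so a quick check of hive-site.xml before running schematool can save a retry. The sketch below uses a hypothetical helper, check_jdbc_url, that only looks for the parameter names in the file; it does not validate their values.

```shell
# Hypothetical check: report connection URL parameters that are missing
# from hive-site.xml. Returns non-zero if any of them is absent.
check_jdbc_url() {
    site_file=$1
    status=0
    for param in useUnicode characterEncoding serverTimezone; do
        # A parameter follows "?" (first one) or "&amp;" / "&" (later ones).
        if ! grep -q "[?;&]$param=" "$site_file"; then
            echo "missing parameter: $param"
            status=1
        fi
    done
    return $status
}

# Example call:
# check_jdbc_url /opt/module/hive/conf/hive-site.xml
```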
Reference: https://blog.csdn.net/m0_37520980/article/details/80364884