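ThriftServer is a JDBC/ODBC interface: users connect to it over JDBC/ODBC to access Spark SQL data. When ThriftServer starts, it launches a single Spark SQL application, and all clients that connect over JDBC/ODBC share that application's resources, which means different users can share data. On startup ThriftServer also opens a listener that waits for JDBC clients to connect and submit queries. When configuring ThriftServer you therefore need to set at least its host name and port; to use Hive data you must also provide the Hive Metastore URIs.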
Configuration details
1. Edit the hive-site.xml file in the $SPARK_HOME/conf directory as follows:
<?xml version="1.0"?>
<configuration>
  <!-- Hive Metastore configuration -->
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://m1:9083</value>
    <description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
  </property>
  <property>
    <name>hive.server2.thrift.min.worker.threads</name>
    <value>5</value>
    <description>Minimum number of Thrift worker threads</description>
  </property>
  <property>
    <name>hive.server2.thrift.max.worker.threads</name>
    <value>500</value>
    <description>Maximum number of Thrift worker threads</description>
  </property>
  <!-- Port the Thrift Server binds to -->
  <property>
    <name>hive.server2.thrift.port</name>
    <value>10000</value>
    <description>Port number of HiveServer2 Thrift interface. Can be overridden by setting $HIVE_SERVER2_THRIFT_PORT</description>
  </property>
  <!-- Host address of the Thrift Server -->
  <property>
    <name>hive.server2.thrift.bind.host</name>
    <value>m1</value>
    <description>Bind host on which to run the HiveServer2 Thrift interface. Can be overridden by setting $HIVE_SERVER2_THRIFT_BIND_HOST</description>
  </property>
</configuration>
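The bind host and port set in this file determine the JDBC URL that clients use when connecting in step 5. The following is a minimal sketch of that mapping, assuming Python 3 and only the standard library. The function names `read_properties` and `thrift_jdbc_url` and the fallback values are illustrative assumptions, not part of Spark or Hive.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the hive-site.xml from step 1 (only the two relevant properties).
HIVE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>hive.server2.thrift.port</name>
    <value>10000</value>
  </property>
  <property>
    <name>hive.server2.thrift.bind.host</name>
    <value>m1</value>
  </property>
</configuration>
"""


def read_properties(xml_text):
    """Return a {name: value} dict for every <property> in a hive-site.xml string."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def thrift_jdbc_url(xml_text):
    """Build the jdbc:hive2:// URL from the bind host and port settings."""
    props = read_properties(xml_text)
    # Fallbacks are assumptions made for this sketch: localhost, and the
    # port 10000 used throughout this article.
    host = props.get("hive.server2.thrift.bind.host", "localhost")
    port = props.get("hive.server2.thrift.port", "10000")
    return "jdbc:hive2://{}:{}".format(host, port)


if __name__ == "__main__":
    print(thrift_jdbc_url(HIVE_SITE))  # prints jdbc:hive2://m1:10000
```

With the values from this article the result matches the URL passed to beeline's !connect command.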
2. Start the Hive metastore
nohup hive --service metastore > metastore.log 2>&1 &
3. Start Spark (the ./sbin and ./bin paths in the following steps are relative to $SPARK_HOME)
./sbin/start-all.sh
4. Start the Thrift Server
./sbin/start-thriftserver.sh --master spark://m1:7077
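The start script returns before the listener is necessarily ready, so a client that connects immediately may be refused. The following is a hedged sketch of a readiness check, assuming Python 3 and only the standard library. The helper `wait_for_port` is a hypothetical name introduced here, not a Spark tool, and a successful TCP connect only shows that something is listening on the port, not that queries will succeed.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Open and immediately close a connection; success means a listener exists.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Covers refused connections, timeouts, and name-resolution failures.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example call, using the host and port from the hive-site.xml in step 1:
#     wait_for_port("m1", 10000, timeout=120.0)
```

If the call returns False, the Thrift Server log written by the start script is the next place to look.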
5. Connect a client to the Thrift Server
Start beeline:
./bin/beeline
Connect to the server:
!connect jdbc:hive2://m1:10000
Once the connection succeeds, you can run HQL statements from the beeline prompt.