Integrating Hive/Phoenix + Druid + JdbcTemplate with Spring Boot
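1. POM dependencies
The author's Hadoop cluster environment:
HDFS, YARN, MapReduce2 : 2.7.3
Hive : 1.2.1000
HBase : 1.1.2
Note: Phoenix is tightly coupled to specific versions, so watch for differences between distributions (taking the jar directly from a cluster server is the most reliable approach).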
- <properties>
-     <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
-     <spring-data-hadoop.version>2.4.0.RELEASE</spring-data-hadoop.version>
-     <hive.version>1.2.1</hive.version>
-     <phoenix-client.version>4.7</phoenix-client.version>
-     <druid.version>1.0.27</druid.version>
- </properties>
- <dependencies>
-     <dependency>
-         <groupId>org.springframework.boot</groupId>
-         <artifactId>spring-boot-starter-jdbc</artifactId>
-     </dependency>
-     <dependency>
-         <groupId>org.springframework.data</groupId>
-         <artifactId>spring-data-hadoop</artifactId>
-         <version>${spring-data-hadoop.version}</version>
-     </dependency>
-     <dependency>
-         <groupId>org.apache.hive</groupId>
-         <artifactId>hive-jdbc</artifactId>
-         <version>${hive.version}</version>
-     </dependency>
-     <dependency>
-         <groupId>org.apache.phoenix</groupId>
-         <artifactId>phoenix-client</artifactId>
-         <version>${phoenix-client.version}</version>
-     </dependency>
-     <dependency>
-         <groupId>com.alibaba</groupId>
-         <artifactId>druid</artifactId>
-         <version>${druid.version}</version>
-     </dependency>
- </dependencies>
2. Spring Boot configuration file
Spring Boot defaults to, and recommends, YAML or properties files for configuration, so YAML is used as the example here:
application.yml:
- # custom configuration for the Hive data source
- hive:
-   url: jdbc:hive2://192.168.61.43:10000/default
-   type: com.alibaba.druid.pool.DruidDataSource
-   driver-class-name: org.apache.hive.jdbc.HiveDriver
-   username: hive
-   password: hive
- # custom configuration for the Phoenix data source
- phoenix:
-   enable: true
-   url: jdbc:phoenix:192.168.61.43
-   type: com.alibaba.druid.pool.DruidDataSource
-   driver-class-name: org.apache.phoenix.jdbc.PhoenixDriver
-   username:
-   password:
-   default-auto-commit: true
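The two `url` values follow the usual HiveServer2 and Phoenix JDBC URL shapes. As a minimal sketch of how they are composed, the helper below assembles both forms; the class `JdbcUrls` and its method names are hypothetical (they are not part of the original project), and the ZooKeeper port `2181` and znode `/hbase` mentioned afterwards are only example values that depend on your cluster.

```java
// Hypothetical helper (not from the original article) that assembles the
// two JDBC URL shapes used in application.yml.
final class JdbcUrls {
    private JdbcUrls() {}

    // HiveServer2: jdbc:hive2://<host>:<port>/<database>
    static String hive(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    // Phoenix: jdbc:phoenix:<zookeeper quorum>[:<port>[:<root znode>]]
    // Pass null for zkPort (and znode) to leave the optional parts out,
    // which matches the short form used in the configuration above.
    static String phoenix(String zkQuorum, Integer zkPort, String znode) {
        StringBuilder url = new StringBuilder("jdbc:phoenix:").append(zkQuorum);
        if (zkPort != null) {
            url.append(':').append(zkPort);
            if (znode != null) {
                url.append(':').append(znode);
            }
        }
        return url.toString();
    }
}
```

For instance, `JdbcUrls.phoenix("192.168.61.43", null, null)` yields the short URL used above, while passing `2181` and `"/hbase"` spells out the ZooKeeper port and root znode explicitly, which may be needed when the distribution does not use the defaults.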
Druid also offers many other optional pool settings, which readers can tune as needed:
- max-active: 100
- initialSize: 1
- maxWait: 60000
- minIdle: 1
- timeBetweenEvictionRunsMillis: 60000
- minEvictableIdleTimeMillis: 300000
- testWhileIdle: true
- testOnBorrow: false
- testOnReturn: false
- poolPreparedStatements: true
- maxOpenPreparedStatements: 50
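The optional keys above mix kebab-case (`max-active`) with camelCase (`initialSize`), while the corresponding `DruidDataSource` bean properties are camelCase. A minimal sketch, assuming you collect these keys yourself into a map, of normalizing them to one naming style; the class `DruidPoolProps` and its methods are hypothetical and written only for illustration.

```java
import java.util.Map;
import java.util.Properties;

// Hypothetical helper: converts keys such as "max-active" to the camelCase
// form ("maxActive") that matches DruidDataSource's bean property names.
final class DruidPoolProps {
    private DruidPoolProps() {}

    // "max-active" -> "maxActive"; keys without a dash are returned unchanged.
    static String toCamelCase(String key) {
        StringBuilder out = new StringBuilder();
        boolean upperNext = false;
        for (char c : key.toCharArray()) {
            if (c == '-') {
                upperNext = true;
            } else if (upperNext) {
                out.append(Character.toUpperCase(c));
                upperNext = false;
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Copies every entry into a Properties object under its camelCase key.
    static Properties normalize(Map<String, String> yamlKeys) {
        Properties props = new Properties();
        for (Map.Entry<String, String> e : yamlKeys.entrySet()) {
            props.setProperty(toCamelCase(e.getKey()), e.getValue());
        }
        return props;
    }
}
```

The normalized values could then be applied through the matching setters on `DruidDataSource` (for example `setMaxActive(int)` and `setMinIdle(int)`) when the data source bean is built.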
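3. Spring Boot configuration beans
Because the settings above are custom, Spring Boot's auto-configuration cannot fully infer what the developer intends, so the data source beans have to be created by hand: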
Hive:
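- import javax.sql.DataSource;
- import org.springframework.beans.factory.annotation.Autowired;
- import org.springframework.beans.factory.annotation.Qualifier;
- import org.springframework.context.annotation.Bean;
- import org.springframework.context.annotation.Configuration;
- import org.springframework.core.env.Environment;
- import org.springframework.jdbc.core.JdbcTemplate;
- import com.alibaba.druid.pool.DruidDataSource;
- /**
-  * Hive data source configuration
-  * @author chenty
-  *
-  */
- @Configuration
- public class HiveDataSource {
-     @Autowired
-     private Environment env;
-     @Bean(name = "hiveJdbcDataSource")
-     @Qualifier("hiveJdbcDataSource")
-     public DataSource dataSource() {
-         // read the custom hive.* keys from application.yml
-         DruidDataSource dataSource = new DruidDataSource();
-         dataSource.setUrl(env.getProperty("hive.url"));
-         dataSource.setDriverClassName(env.getProperty("hive.driver-class-name"));
-         dataSource.setUsername(env.getProperty("hive.username"));
-         dataSource.setPassword(env.getProperty("hive.password"));
-         return dataSource;
-     }
-     @Bean(name = "hiveJdbcTemplate")
-     public JdbcTemplate hiveJdbcTemplate(@Qualifier("hiveJdbcDataSource") DataSource dataSource) {
-         return new JdbcTemplate(dataSource);
-     }
- }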
Phoenix:
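- import javax.sql.DataSource;
- import org.springframework.beans.factory.annotation.Autowired;
- import org.springframework.beans.factory.annotation.Qualifier;
- import org.springframework.context.annotation.Bean;
- import org.springframework.context.annotation.Configuration;
- import org.springframework.core.env.Environment;
- import org.springframework.jdbc.core.JdbcTemplate;
- import com.alibaba.druid.pool.DruidDataSource;
- /**
-  * Phoenix data source configuration
-  * @author chenty
-  *
-  */
- @Configuration
- public class PhoenixDataSource {
-     @Autowired
-     private Environment env;
-     @Bean(name = "phoenixJdbcDataSource")
-     @Qualifier("phoenixJdbcDataSource")
-     public DataSource dataSource() {
-         DruidDataSource dataSource = new DruidDataSource();
-         dataSource.setUrl(env.getProperty("phoenix.url"));
-         dataSource.setDriverClassName(env.getProperty("phoenix.driver-class-name"));
-         dataSource.setUsername(env.getProperty("phoenix.username")); // the Phoenix username is empty by default
-         dataSource.setPassword(env.getProperty("phoenix.password")); // the Phoenix password is empty by default
-         dataSource.setDefaultAutoCommit(Boolean.valueOf(env.getProperty("phoenix.default-auto-commit")));
-         return dataSource;
-     }
-     @Bean(name = "phoenixJdbcTemplate")
-     public JdbcTemplate phoenixJdbcTemplate(@Qualifier("phoenixJdbcDataSource") DataSource dataSource) {
-         return new JdbcTemplate(dataSource);
-     }
- }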
4. Testing the data sources
Next, we only need to inject the Hive or Phoenix JdbcTemplate into a test class to start exchanging data with Hive or Phoenix:
Hive:
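- import org.junit.Test;
- import org.junit.runner.RunWith;
- import org.springframework.beans.factory.annotation.Autowired;
- import org.springframework.beans.factory.annotation.Qualifier;
- import org.springframework.boot.test.SpringApplicationConfiguration;
- import org.springframework.jdbc.core.JdbcTemplate;
- import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;
- @RunWith(SpringJUnit4ClassRunner.class)
- @SpringApplicationConfiguration(HiveServiceApplication.class)
- public class MainTest {
-     @Autowired
-     @Qualifier("hiveJdbcTemplate")
-     JdbcTemplate hiveJdbcTemplate;
-     @Test
-     public void DataSourceTest() {
-         // create table
-         StringBuilder sql = new StringBuilder("create table IF NOT EXISTS ");
-         sql.append("HIVE_TEST1 ");
-         sql.append("(KEY INT, VALUE STRING) ");
-         sql.append("PARTITIONED BY (S_TIME DATE) "); // partitioned storage
-         sql.append("ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n' "); // define the delimiters
-         sql.append("STORED AS TEXTFILE"); // store as plain text
-         // drop table
-         // StringBuilder sql = new StringBuilder("DROP TABLE IF EXISTS ");
-         // sql.append("HIVE_TEST1");
-         hiveJdbcTemplate.execute(sql.toString());
-     }
- }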
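When a DDL statement is assembled from fragments, a missing space between two clauses is easy to overlook. One possible way to make the separators explicit is to join the clauses with `String.join`. The sketch below is only an illustration: the class `HiveDdl` and its method are hypothetical, and it writes the delimiters as the two-character escapes `\t` and `\n` inside the SQL text, which is the form commonly seen in Hive DDL.

```java
// Hypothetical helper that builds the same CREATE TABLE statement as the
// test above, with exactly one space between clauses.
final class HiveDdl {
    private HiveDdl() {}

    static String createPartitionedTextTable(String table) {
        return String.join(" ",
                "CREATE TABLE IF NOT EXISTS " + table,
                "(KEY INT, VALUE STRING)",
                "PARTITIONED BY (S_TIME DATE)",
                // "\\t" in Java source puts a backslash followed by 't' into the SQL text
                "ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t' LINES TERMINATED BY '\\n'",
                "STORED AS TEXTFILE");
    }
}
```

The resulting string could be passed to `hiveJdbcTemplate.execute(...)` in place of the manually appended buffer.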
Phoenix:
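- import org.junit.Test;
- import org.junit.runner.RunWith;
- import org.springframework.beans.factory.annotation.Autowired;
- import org.springframework.beans.factory.annotation.Qualifier;
- import org.springframework.boot.test.SpringApplicationConfiguration;
- import org.springframework.jdbc.core.JdbcTemplate;
- import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;
- @RunWith(SpringJUnit4ClassRunner.class)
- @SpringApplicationConfiguration(HBaseServiceApplication.class)
- public class MainTest {
-     @Autowired
-     @Qualifier("phoenixJdbcTemplate")
-     JdbcTemplate phoenixJdbcTemplate;
-     @Test
-     public void DataSourceTest() {
-         // create a Phoenix table
-         phoenixJdbcTemplate.execute("create table IF NOT EXISTS PHOENIX_TEST2 (ID INTEGER not null primary key, Name varchar(20),Age INTEGER)");
-     }
- }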
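Once the table exists, rows are written with Phoenix's `UPSERT` statement rather than `INSERT`. The following is a minimal plain-JDBC sketch of writing one row into `PHOENIX_TEST2`; the class `PhoenixUpsert` is hypothetical, and the explicit commit illustrates why the data source above sets `default-auto-commit`.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical helper, not part of the original project.
final class PhoenixUpsert {
    private PhoenixUpsert() {}

    // Phoenix writes rows with UPSERT (insert or update by primary key).
    static final String UPSERT_SQL =
            "UPSERT INTO PHOENIX_TEST2 (ID, NAME, AGE) VALUES (?, ?, ?)";

    static void upsert(Connection conn, int id, String name, int age) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(UPSERT_SQL)) {
            ps.setInt(1, id);
            ps.setString(2, name);
            ps.setInt(3, age);
            ps.executeUpdate();
        }
        // With auto-commit off, the change is not sent until commit() is called.
        if (!conn.getAutoCommit()) {
            conn.commit();
        }
    }
}
```

With the `phoenixJdbcTemplate` bean from this article, the same statement could be issued as `phoenixJdbcTemplate.update(PhoenixUpsert.UPSERT_SQL, 1, "Tom", 20)`, where the row values are made up for illustration.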
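5. The traditional approach
Spring Boot itself discourages traditional XML configuration, but in real production environments practical constraints sometimes force us to bring XML configuration files in anyway. So here is a brief look at how to configure Hive/Phoenix through an XML file under Spring Boot: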
application.xml:
- <!-- Configure the Hive JdbcTemplate -->
- <bean id="hiveTemplate" class="org.springframework.jdbc.core.JdbcTemplate">
-     <constructor-arg ref="hiveDataSource"/>
-     <!-- must match the @Qualifier value used at the injection point -->
-     <qualifier value="hiveJdbcTemplate"/>
- </bean>
- <bean id="hiveDataSource" class="com.alibaba.druid.pool.DruidDataSource">
-     <property name="driverClassName" value="org.apache.hive.jdbc.HiveDriver"/>
-     <property name="url" value="jdbc:hive2://172.20.36.212:10000/default"/>
-     <property name="username" value="hive"/>
-     <property name="password" value="hive"/>
-     <!-- initial number of connections -->
-     <property name="initialSize" value="0" />
-     <!-- maximum number of active connections in the pool -->
-     <property name="maxActive" value="1500" />
-     <!-- minimum number of idle connections in the pool -->
-     <property name="minIdle" value="0" />
-     <!-- maximum wait time when acquiring a connection (ms) -->
-     <property name="maxWait" value="60000" />
- </bean>
- <!-- Configure the Phoenix JdbcTemplate -->
- <bean id="phoenixTemplate" class="org.springframework.jdbc.core.JdbcTemplate">
-     <constructor-arg ref="phoenixDataSource"/>
-     <qualifier value="phoenixJdbcTemplate"/>
- </bean>
- <bean id="phoenixDataSource" class="com.alibaba.druid.pool.DruidDataSource">
-     <property name="driverClassName" value="org.apache.phoenix.jdbc.PhoenixDriver"/>
-     <property name="url" value="jdbc:phoenix:172.20.36.212"/>
-     <!-- initial number of connections -->
-     <property name="initialSize" value="0" />
-     <!-- maximum number of active connections in the pool -->
-     <property name="maxActive" value="1500" />
-     <!-- minimum number of idle connections in the pool -->
-     <property name="minIdle" value="0" />
-     <!-- maximum wait time when acquiring a connection (ms) -->
-     <property name="maxWait" value="60000" />
-     <!-- Phoenix does not commit data changes automatically, so defaultAutoCommit must be set; otherwise changes may never be committed -->
-     <property name="defaultAutoCommit" value="true"/>
- </bean>
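Testing:
With the XML configuration in place, we only need to add the following annotation to the class definition of the test class from step 4 to load the XML configuration: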
- @ImportResource({"classpath:application.xml","..."})
Note: the bean names (qualifiers) in the configuration file must match the names used in the injection annotations.