Creating HBase and ES external tables in Hive


1. Creating an HBase external table

CREATE EXTERNAL TABLE `ods_women`(
  `rowkey` string COMMENT 'from deserializer', 
  `article` string COMMENT 'from deserializer', 
  `url` string COMMENT 'from deserializer', 
  `web` string COMMENT 'from deserializer', 
  `keyword` string COMMENT 'from deserializer', 
  `acquire_time` string COMMENT 'from deserializer', 
  `article_time` string COMMENT 'from deserializer', 
  `calculate_time` string COMMENT 'from deserializer', 
  `title` string COMMENT 'from deserializer', 
  `english_industry` string COMMENT 'from deserializer')
STORED BY 
  'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
WITH SERDEPROPERTIES ( 
'hbase.columns.mapping'=':key,info:article_word,info:article_url,info:website,info:chinese_keyword,info:acquire_time,info:article_time,info:calculate_time,info:article_title,info:english_industry')
TBLPROPERTIES ( 'hbase.table.name'='test:ods_women');
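
A quick query is an easy way to check that the column mapping lines up. This is only a minimal sketch, assuming the HBase table test:ods_women already exists and holds data in the info column family (an external table maps onto an existing HBase table; it does not create one):

```sql
-- Hive columns pair with hbase.columns.mapping entries by position:
-- rowkey -> :key, article -> info:article_word, url -> info:article_url, and so on.
SELECT rowkey, title, url, article_time
FROM ods_women
LIMIT 10;
```

A cell that is missing in HBase for a given row comes back as NULL in Hive, so a column that is NULL everywhere usually points to a misspelled qualifier in the mapping.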

2. Creating an ES external table

1) Download the required jar

https://www.elastic.co/cn/downloads/past-releases#es-hadoop

2) Run the following in the Hive command line

add jar /home/jar/elasticsearch-hadoop-5.5.3/dist/elasticsearch-hadoop-5.5.3.jar;
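
Note that add jar only applies to the current Hive session, so it has to be repeated in each new session that touches the ES table. As a small sketch, the jars registered in the session can be listed to confirm the path was picked up:

```sql
-- Shows the resources added to this session; the elasticsearch-hadoop jar should appear.
list jars;
```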

3) Create the table

drop table if exists dw_women_article_core;
create external table dw_women_article_core(
md5id string,
article_id string,
keyword string,
search_keyword string,
keyword_weight bigint,
article_title string,
article_content string,
web string,
article_date string,
status bigint,
keyword_push string
)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler' 
TBLPROPERTIES('es.nodes' = '192.168.2.14:9200',
'es.index.auto.create' = 'true',
'es.resource' = 'app_knowledgegraph_new/app_women_article_core',
'es.mapping.id' = 'md5id',
'es.mapping.names' = 'md5id:md5id,article_id:article_id,keyword:keyword,search_keyword:search_keyword,keyword_weight:keyword_weight,article_title:article_title,web:web,article_date:article_date,status:status,keyword_push:keyword_push,article_content:article_content',
'es.nodes.wan.only' = 'true');

es.index.auto.create (default yes)

Whether elasticsearch-hadoop should create an index (if it's missing) when writing data to Elasticsearch or fail.

es.nodes.wan.only (default false)

Whether the connector is used against an Elasticsearch instance in a cloud/restricted environment over the WAN, such as Amazon Web Services. In this mode, the connector disables discovery and only connects through the declared es.nodes during all operations, including reads and writes. Note that in this mode, performance is highly affected.
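
With the table defined, writing to it from Hive sends the rows to Elasticsearch, and reading from it queries the index. The following is a sketch only: tmp_women_article_core is a hypothetical Hive table, assumed to have the same columns as dw_women_article_core, and the elasticsearch-hadoop jar is assumed to be added in the session.

```sql
-- tmp_women_article_core is a hypothetical source table holding the rows to index;
-- the SELECT list follows the column order of dw_women_article_core.
INSERT OVERWRITE TABLE dw_women_article_core
SELECT md5id, article_id, keyword, search_keyword, keyword_weight,
       article_title, article_content, web, article_date, status, keyword_push
FROM tmp_women_article_core;

-- Reading goes through the same table definition.
SELECT md5id, article_title, article_date
FROM dw_women_article_core
LIMIT 10;
```

Because es.mapping.id is set to md5id, each row is indexed under its md5id, so with the connector's default write operation a row whose md5id already exists replaces that document rather than creating a duplicate.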

