Creating a Hive table over Elasticsearch data: date fields are not read correctly
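Problem description: Spark reads data from a given index/type that contains a date field with a custom format, and reading that date throws an exception. The date field is defined as:
"estime" : {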
"type" : "date",
"format" : "yyyy-MM-dd HH:mm:ss"
},
CREATE EXTERNAL TABLE esjson.app_phone_device(
  id string,
  into_code string,
  order_no string,
  cust_code string,
  estime date,
  ttimestamp string,
  vcard string,
  vphone string,
  appType string,
  appNo string,
  appVersion string,
  custCode string,
  pid string,
  modelType string,
  operatSystem string,
  systemVersion string
)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES(
  'es.nodes' = '172.18.100.187:9200',
  'es.index.auto.create' = 'false',
  'es.index.read.missing.as.empty' = 'true',
  'es.resource' = 'app_phone_device/app_phone_device',
  'es.read.metadata' = 'true',
  'es.mapping.names'= '\
id:_metadata._id,\
into_code:applyNum,\
order_no:orderNo,\
cust_code:customerNum,\
estime:cast(esTime as date),\
ttimestamp:timestamp,\
vcard:vcard,\
vphone:vphone,\
appType:header.appType,\
appNo:header.appNo,\
appVersion:header.appVersion,\
custCode:body.custCode,\
pid:body.pid,\
modelType:body.modelType,\
operatSystem:body.operatSystem,\
systemVersion:body.systemVersion');
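I wanted the Hive table to receive the field as a date, so I wrapped the ES date field in cast(esTime as date) inside es.mapping.names. The result: every value in the estime column of the Hive table came back as NULL.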
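After most of a day of searching I still had not found a fix for that, so I changed approach: declare estime as string in the Hive table and add one more setting, es.mapping.date.rich=false.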
Our company runs CDH, so I added the setting directly in Hue.
To change this parameter permanently in CDH, set it in the Hive configuration.
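With es.mapping.date.rich=false, estime arrives as plain text in the mapping's yyyy-MM-dd HH:mm:ss format, so any code that needs a real date has to parse it again. Below is a minimal Java sketch of that step, assuming the value is read as a String. The class name EsTimeParser and the sample value are hypothetical and not part of the original setup.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class EsTimeParser {
    // Same pattern as the "format" declared for estime in the Elasticsearch mapping.
    private static final DateTimeFormatter ES_TIME_FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    /** Parses the raw string returned for estime into a LocalDateTime. */
    public static LocalDateTime parse(String estime) {
        return LocalDateTime.parse(estime, ES_TIME_FORMAT);
    }

    /** Keeps only the calendar date, dropping the time of day. */
    public static LocalDate toDate(String estime) {
        return parse(estime).toLocalDate();
    }

    public static void main(String[] args) {
        String sample = "2019-03-15 08:30:00"; // hypothetical sample value
        System.out.println(parse(sample));
        System.out.println(toDate(sample));
    }
}
```

A value that does not match the pattern makes LocalDateTime.parse throw a DateTimeParseException, so malformed rows are visible instead of silently turning into NULL.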
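From what I have read, Spark runs into the same problem when it reads date fields from ES.
Setting spark.es.mapping.date.rich to false when submitting the job from the command line takes effect: the field is no longer parsed into a date and is returned as a string.
Alternatively, set it in code: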
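import org.apache.spark.SparkConf

val sparkConf = new SparkConf().setAppName("esspark").setMaster("local[2]")
sparkConf.set("es.nodes", "10.3.162.202")
sparkConf.set("es.port", "9200")
// Return date fields as plain strings instead of rich date objects
sparkConf.set("es.mapping.date.rich", "false")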
If Hadoop is reading from or writing to ES:
// When going through the API, the value can be set on the Configuration
import org.apache.hadoop.conf.Configuration;

Configuration conf = new Configuration();
conf.set("es.mapping.date.rich", "false");
Reference
https://www.elastic.co/guide/en/elasticsearch/hadoop/master/configuration.html
I found this setting in the official documentation:
Whether to create a rich Date like object for Date fields in Elasticsearch or returned them as primitives (String or long). By default this is true. The actual object type is based on the library used; noteable exception being Map/Reduce which provides no built-in Date object and as such LongWritable and Text are returned regardless of this setting.
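Since the documentation says a date can come back as either a String or a long once rich dates are turned off, client code may want to normalize both shapes. The following Java sketch shows one way to do that. It rests on assumptions not stated in the original: a long is taken to be epoch milliseconds, a String is taken to follow the yyyy-MM-dd HH:mm:ss format from the mapping above, and a string without a zone is interpreted as UTC. The class name EsDateValue is hypothetical.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class EsDateValue {
    // Assumption: string values use the custom format from the estime mapping.
    private static final DateTimeFormatter ES_TIME_FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    /** Converts a primitive date value (String or long) into an Instant. */
    public static Instant toInstant(Object raw) {
        if (raw instanceof Number) {
            // Assumption: numeric values are milliseconds since the epoch.
            return Instant.ofEpochMilli(((Number) raw).longValue());
        }
        if (raw instanceof CharSequence) {
            // Assumption: the string carries no zone, so it is read as UTC.
            return LocalDateTime.parse(raw.toString(), ES_TIME_FORMAT)
                    .toInstant(ZoneOffset.UTC);
        }
        throw new IllegalArgumentException("Unsupported date value: " + raw);
    }

    public static void main(String[] args) {
        System.out.println(toInstant("2019-03-15 08:30:00")); // hypothetical sample
        System.out.println(toInstant(1552638600000L));        // hypothetical sample
    }
}
```

If the data was written in a local time zone rather than UTC, ZoneOffset.UTC would need to be replaced with the matching zone.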