Create a Hive table:
CREATE TABLE `test`(
`a` timestamp,
`b` struct<t:timestamp>)
-- the clauses below are optional
[row format delimited fields terminated by '\t']
[STORED AS Parquet]
Check the structure of the created table:
hive> show create table test;
OK
CREATE TABLE `test`(
`a` timestamp,
`b` struct<t:timestamp>)
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
'hdfs://ambari.master.com:8020/apps/hive/warehouse/test'
TBLPROPERTIES (
'COLUMN_STATS_ACCURATE'='{\"BASIC_STATS\":\"true\"}',
'numFiles'='2',
'numRows'='2',
'rawDataSize'='78',
'totalSize'='80',
'transient_lastDdlTime'='1532416983')
Time taken: 0.189 seconds, Fetched: 18 row(s)
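Insert data:
Hive's INSERT ... VALUES does not support complex types (such as the struct column in the test table), so insert the row indirectly with INSERT ... SELECT: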
-- named_struct is the function that builds a struct value
-- tmp_table here stands for any existing table with at least one row
insert into test
select unix_timestamp('1970-01-01 08:00:00'),
named_struct('t', cast(unix_timestamp('1970-01-01 08:00:00') as timestamp))
from tmp_table limit 1;
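To confirm that the row landed as expected, a query along the following lines can read both columns back. This is only a minimal sketch against the test table defined above; the struct field is reached with dot notation.

```sql
-- Read the plain timestamp column and the timestamp field nested in the struct
SELECT a, b.t FROM test;
```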
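Note:
If the Hive table's data files are stored as Parquet, time zone conversion goes wrong when reading and writing a timestamp field nested inside a struct. A workaround is to declare that field as varchar instead of timestamp.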
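The varchar workaround could look roughly like the sketch below. The table name test_varchar and the length varchar(19) (enough for 'yyyy-MM-dd HH:mm:ss') are assumptions made for illustration, not part of the original setup; tmp_table is the same helper table used in the insert above.

```sql
-- Hypothetical table: the nested field is varchar(19) instead of timestamp
CREATE TABLE `test_varchar`(
  `a` timestamp,
  `b` struct<t:varchar(19)>)
STORED AS PARQUET;

-- Store the nested value as text; the cast makes the struct type match the column type
INSERT INTO test_varchar
SELECT cast('1970-01-01 08:00:00' AS timestamp),
       named_struct('t', cast('1970-01-01 08:00:00' AS varchar(19)))
FROM tmp_table LIMIT 1;

-- Convert back to timestamp only at query time
SELECT a, cast(b.t AS timestamp) AS t FROM test_varchar;
```

Because the nested value is stored as text, nothing converts it on write or read; the cost is that every query needing timestamp semantics has to cast it explicitly.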