Reposted article, 2018-08-14 13:07, Hive

0. Create the table and load data

　　hive>create table hive.test(id int);

　　hive>load data local inpath '/home/hyxy/test_order.txt' into table hive.test;
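The two statements above only work if a database named hive already exists, and they presumably expect test_order.txt to hold one integer per line; the original shows neither. A minimal sketch of the missing setup step and a quick check of the load, under those assumptions:

```sql
-- Assumption: run this before the create table statement.
-- It does nothing if the database already exists.
create database if not exists hive;

-- Run after the load: confirm that rows arrived and look at a few of them.
select count(*) from hive.test;
select * from hive.test limit 5;
```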

1. order by: global sort

　　hive>select * from hive.test order by id;
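order by produces a total order by sending every row through a single reducer, which can be slow on large tables. A short sketch of a common variation follows; the descending order and the limit of 10 are illustrative choices, not taken from the original:

```sql
-- Global sort, largest id first. The limit keeps the single reducer's output small.
select * from hive.test order by id desc limit 10;
```

On older releases running in strict mode (hive.mapred.mode=strict), Hive rejects an order by that has no limit clause, so adding a limit is often required as well as useful.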

2. sort by: local sort (within each reducer)

　　hive>set mapreduce.job.reduces=3;

　　hive>select * from hive.test sort by id;
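With three reducers, sort by orders rows only inside each reducer's output, so the combined result is generally not globally ordered. One way to see this is to write the result to a directory, where each reducer produces its own sorted file. The output path below is a hypothetical example, and distribute by and cluster by are standard HiveQL clauses that the original does not use:

```sql
set mapreduce.job.reduces=3;

-- Each reducer writes one sorted file under this (hypothetical) local directory.
insert overwrite local directory '/tmp/hive_sort_by_demo'
select * from hive.test sort by id;

-- Send equal ids to the same reducer, then sort within each reducer.
select * from hive.test distribute by id sort by id;

-- Shorthand for the previous query when both clauses use the same column.
select * from hive.test cluster by id;
```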

Indexes

　　1. Create an index

　　　　hive>create index test_id_index on table hive.test(id) as 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler' with deferred rebuild;

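　　2. Query the default index table: an index table (table type INDEX_TABLE) is generated automatically in the hive database

　　　　hive>select * from hive.hive__test_test_id_index__;

　　　　Note: because the index was created with deferred rebuild, the generated index table is initially empty.

　　　　The index table has three columns:

　　　　　　id: the indexed column

　　　　　　_bucketname: the location of the data file that holds the value

　　　　　　_offsets: the offsets of the matching data within that file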
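To check the index and the column layout listed above, the index metadata and the index table schema can be inspected directly. This is a sketch that assumes a Hive release before 3.0 (index support was removed in Hive 3.0) and the default index table name used in this article:

```sql
use hive;

-- List the indexes defined on the test table, including the index table name.
show formatted index on test;

-- Show the columns of the generated index table.
describe hive__test_test_id_index__;
```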

　　3. Rebuild the index to populate the index table with data

　　　　hive>alter index test_id_index on hive.test rebuild;

　　　　hive>select * from hive.hive__test_test_id_index__;
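After the rebuild, each row of the index table maps an id value to the file and the offsets where it occurs. A hedged sketch of letting the optimizer use the index for a filter query follows. The two settings are Hive configuration properties available in releases before 3.0, and the filter value 5 is only an illustration:

```sql
-- Allow the optimizer to use indexes for filter predicates (off by default).
set hive.optimize.index.filter=true;

-- Lower the minimum input size so the index is considered for a small test table.
set hive.optimize.index.filter.compact.minsize=0;

select * from hive.test where id = 5;
```

Whether the index is actually used depends on the Hive version and the query plan, which can be checked with explain.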

　　4. Drop the index

　　　　hive>drop index test_id_index on hive.test;
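Dropping an index also removes its index table. A small sketch of a drop that is safe to repeat, followed by a check that no index remains; it assumes the index name test_id_index from step 1:

```sql
-- if exists avoids an error when the index has already been dropped.
drop index if exists test_id_index on hive.test;

-- Verify that the test table no longer lists any index.
use hive;
show index on test;
```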
