[Original] Uncle's Experience Sharing (33): hive select count returns 0


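After creating a Hive table and copying data files directly into the table directory, select * returns the data, but select count(1) keeps returning 0. The cause is the following Hive setting: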

hive.stats.autogather=true

Enables automated gathering of table-level statistics for newly created tables and table partitions, such as tables created with the INSERT OVERWRITE statement. The parameter does not produce column-level statistics, such as those generated by CBO. If disabled, administrators must manually generate the table-level statistics for newly generated tables and table partitions with the ANALYZE TABLE statement.
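The situation can be reproduced with a minimal sketch like the one below. The table name demo_logs, its columns, and the file path are hypothetical and chosen only for illustration; the point is that a file copied in outside of Hive never passes through the statistics gathering described above, so the row count recorded in the metastore stays at its initial value.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE demo_logs (id INT, msg STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Copy a data file into the table directory outside of Hive, e.g. from a shell:
--   hdfs dfs -put /tmp/demo_logs.csv <table directory>/
-- No INSERT ran, so the metastore statistics are not refreshed.

-- Reads the files directly, so the copied rows are visible.
SELECT * FROM demo_logs LIMIT 5;

-- May be answered from the stale table statistics instead of scanning the files.
SELECT count(1) FROM demo_logs;
```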

You can view a table's statistics with describe:

DESCRIBE EXTENDED $table_name;
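As a hedged example, DESCRIBE FORMATTED prints the same information in a layout that is easier to read than DESCRIBE EXTENDED. The table name demo_logs is hypothetical. The values of interest are the statistics under the table parameters, such as numRows, numFiles and totalSize; the exact set of keys can vary between Hive versions.

```sql
-- Hypothetical table name; substitute your own.
-- Look in the "Table Parameters" section of the output for numRows,
-- numFiles and totalSize. A numRows of 0 on a table whose directory
-- clearly contains data files points to stale statistics.
DESCRIBE FORMATTED demo_logs;
```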

Another setting controls whether Hive uses the table's statistics to answer queries:

hive.compute.query.using.stats=true

Instructs Hive to use statistics when generating query plans
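One way to check whether a particular count is being served from statistics is to look at the query plan. This is a sketch with a hypothetical table name, and the exact plan text differs across Hive versions; as an assumption to verify on your own cluster, a plan that contains only a fetch stage and no table scan suggests the result comes from the metastore rather than from reading the files.

```sql
-- Hypothetical table name; substitute your own.
-- Print the value currently in effect for this session.
SET hive.compute.query.using.stats;

-- Inspect the plan: the absence of a table scan / MapReduce (or Tez) stage
-- suggests the count is answered from stored statistics.
EXPLAIN SELECT count(1) FROM demo_logs;
```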

The workaround many people suggest is:

set hive.compute.query.using.stats=false;

The correct fix is:

ANALYZE TABLE $table_name COMPUTE STATISTICS;

ANALYZE TABLE $table_name partition(p=$1) COMPUTE STATISTICS;

That is, recompute the statistics.
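Put together, the fix might look like the following sketch. The table names demo_logs and demo_logs_part and the partition column p are hypothetical. The partition form without a value, which asks Hive to gather statistics for every partition of the table, is standard ANALYZE syntax, but it is worth checking how long it runs on tables with many partitions.

```sql
-- Hypothetical unpartitioned table: rescan the data and refresh its statistics.
ANALYZE TABLE demo_logs COMPUTE STATISTICS;

-- Hypothetical partitioned table: refresh a single partition ...
ANALYZE TABLE demo_logs_part PARTITION (p='2019-01-01') COMPUTE STATISTICS;
-- ... or all partitions, by naming the partition column without a value.
ANALYZE TABLE demo_logs_part PARTITION (p) COMPUTE STATISTICS;

-- The count should now reflect the rows in the copied files.
SELECT count(1) FROM demo_logs;
```

Compared with setting hive.compute.query.using.stats=false, this keeps statistics-based answers available and makes them accurate again, at the cost of rerunning ANALYZE whenever files are added outside of Hive.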


Reference: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_hive-performance-tuning/content/ch_cost-based-optimizer.html

