Which Hive functions and operations run MapReduce (MR) jobs

Reposted article. 2018-03-14 22:22. Category: hadoop

In everyday Hive use, it is well known that select * from table_name does not run MR; Hive reads the table's files directly.

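Starting with Hive 0.10.0, simple queries (a plain select without count, sum, group by, and the like) can skip map/reduce for the sake of efficiency: Hive reads the HDFS files directly and applies the filter itself. The advantage is that no new MR job is launched, so execution is considerably faster. The drawback is a less friendly user experience: with a large data set you may still wait a long time, with no progress output in the meantime.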

This behavior is easy to change. hive-site.xml has a configuration parameter named

hive.fetch.task.conversion

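Set this parameter to more and simple queries (select, filter, limit) no longer run map/reduce. Set it to minimal and only select *, filters on partition columns, and limit skip map/reduce; other simple selects with a where filter run map/reduce.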
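The parameter can also be tried per session before editing hive-site.xml. The following is a minimal sketch, assuming the tp_200w_test table used in the tests later in this article. explain prints the query plan without running the query; a plan that contains only a fetch stage (shown as a Fetch Operator) and no map/reduce stage suggests that no MR job would be launched.

```sql
-- Session-level override; hive-site.xml is not modified.
set hive.fetch.task.conversion=more;

-- Inspect the plan without executing the query.
explain select * from tp_200w_test where id < 2;

-- Switch to the stricter mode and compare the plan for the same query.
set hive.fetch.task.conversion=minimal;
explain select * from tp_200w_test where id < 2;
```

Comparing the two plans is a quick way to see which mode a given query falls under, without waiting on 2 million rows.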

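The test table tp_200w_test holds 2 million generated rows; the queries below use its id, name, and address columns.

The tests below check whether like and the other comparison operators run MR.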

1. Equality comparison: =

select * from tp_200w_test where name='測試 ' -- equality condition: no MR

2. Pattern matching: LIKE

1) Percent sign at the end

select * from tp_200w_test where name like '測%' and address like '江蘇%' -- no MR

2) Percent sign at the start

select * from tp_200w_test where name like '%試' -- no MR

3) Percent sign on both sides

select * from tp_200w_test where address like '%物聯%' -- no MR

4) Percent signs on both sides and in the middle

select * from tp_200w_test where address like '%物%聯%' -- no MR
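In a like pattern, % matches any sequence of characters (including none) and _ matches exactly one character. The sketch below replays the four pattern shapes above on literal strings instead of the test table. The sample values are made up for illustration, and a select without a from clause assumes Hive 0.13 or later.

```sql
select '江蘇省' like '江蘇%';     -- prefix match
select '測試' like '%試';         -- suffix match
select 'xx物聯網' like '%物聯%';   -- substring match
select '物x聯y' like '%物%聯%';    -- 物 followed, anywhere later, by 聯
```

Each expression is expected to evaluate to true.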

3. Inequality comparison: <>

select * from tp_200w_test where id <> 1 -- no MR

4. Less than: <, greater than: >, less than or equal: <=, greater than or equal: >=

select * from tp_200w_test where id < 2 -- no MR

select * from tp_200w_test where id > 2 -- no MR

select * from tp_200w_test where id >= 2 -- no MR

select * from tp_200w_test where id <= 2 -- no MR

5. Null checks: is null, is not null

select * from tp_200w_test where id is null -- no MR

select * from tp_200w_test where id is not null -- no MR

6. Java regular expression matching: RLIKE

select * from tp_200w_test where id rlike '^f.*r$' -- no MR
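rlike treats its right-hand side as a Java regular expression and succeeds if any substring of the left-hand side matches, whereas like must match the whole string. A small sketch on a literal value (the string 'foobar' is only an illustration, and a select without a from clause assumes Hive 0.13 or later):

```sql
select 'foobar' rlike '^f.*r$';  -- anchored: starts with f and ends with r
select 'foobar' rlike 'foo';     -- unanchored: a matching substring is enough
select 'foobar' like 'foo';      -- like has no wildcard here, so the whole string must equal 'foo'
```

The first two expressions are expected to return true and the third false.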

Aggregate functions

1) count, sum, min, avg, and max all run MR; they are not listed one by one here.
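For completeness, here is a sketch of how the aggregate case could be checked with explain rather than by running each query. It assumes the same tp_200w_test table and that id is a column on which min and max make sense; the exact plan text depends on the Hive version and settings.

```sql
-- Each plan is expected to contain a map/reduce stage, not just a fetch stage.
explain select count(*) from tp_200w_test;
explain select min(id), max(id) from tp_200w_test;
explain select address, count(*) from tp_200w_test group by address;
```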
