Analysis of Impala Queue Memory Parameters


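Also published on CSDN.

Question

I looked into several of the memory parameters for Impala queues. Corrections are welcome.

Memory settings of a queue resource pool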

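  1. Maximum Query Memory Limit

    For a given queue resource pool, the upper bound on the memory a single query may use on one Impala node.

  2. Minimum Query Memory Limit

    For a given queue resource pool, the lower bound on the memory given to a single query on one Impala node.

  3. Max Memory

    The maximum memory available to all queries running in this pool.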

When a query is submitted to an Impala queue, how does Impala decide whether to accept it?

Test SQL

 set request_pool = hqueue;
 select tally_id, acctset_code  from testtable where acctset_code='00001'order by acctset_code limit 5 offset 5;

Query plan analysis

[ip:21000] testdb> explain select tally_id, acctset_code  from testtable where acctset_code='00001'order by acctset_code limit 5 offset 5;
Query: explain select tally_id, acctset_code  from testtable where acctset_code='00001'order by acctset_code limit 5 offset 5
+------------------------------------------------------------------------------------+
| Explain String                                                                     |
+------------------------------------------------------------------------------------+
| Max Per-Host Resource Reservation: Memory=8.00MB Threads=3                         |
| Per-Host Resource Estimates: Memory=256MB                                          |
| WARNING: The following tables are missing relevant table and/or column statistics. |
| testdb.testtable                                              |
|                                                                                    |
| PLAN-ROOT SINK                                                                     |
| |                                                                                  |
| 02:MERGING-EXCHANGE [UNPARTITIONED]                                                |
| |  offset: 5                                                                       |
| |  order by: acctset_code ASC                                                      |
| |  limit: 5                                                                        |
| |                                                                                  |
| 01:TOP-N [LIMIT=10]                                                                |
| |  order by: acctset_code ASC                                                      |
| |  row-size=22B cardinality=10                                                     |
| |                                                                                  |
| 00:SCAN HDFS [testdb.testtable]                               |
|    partitions=138/138 files=140 size=808.62MB                                      |
|    predicates: acctset_code = '00001'                                              |
|    row-size=22B cardinality=16                                                     |
+------------------------------------------------------------------------------------+

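Note: the plan reports a per-host memory estimate of 256MB, but Impala's estimate is not necessarily accurate (the table is also missing statistics, as the warning shows).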

Experiment 1

| Queue name | Max Memory | Minimum Query Memory Limit | Maximum Query Memory Limit |
| :--- | :--- | :--- | :--- |
| root.hqueue | 500M | 260M | 270M |

Result:

[ip:21000] testdb> select tally_id, acctset_code  from testtable where acctset_code='00001'order by acctset_code limit 5 offset 5;
Query: select tally_id, acctset_code  from testtable where acctset_code='00001'order by acctset_code limit 5 offset 5
Query submitted at: 2020-06-23 10:54:55 (Coordinator: http://ip:25000)
Query progress can be monitored at: http://ip:25000/query_plan?query_id=f54d764cf100d474:a89eec5c00000000
ERROR: Rejected query from pool root.hqueue: request memory needed 780.00 MB is greater than pool max mem resources 500.00 MB.

Guess: 260M (Minimum Query Memory Limit) * 3 = 780M > 500M, where 3 is presumably the number of nodes.

Experiment 2

| Queue name | Max Memory | Minimum Query Memory Limit | Maximum Query Memory Limit |
| :--- | :--- | :--- | :--- |
| root.hqueue | 500M | 250M | 270M |

Result:

[ip:21000] testdb> select tally_id, acctset_code  from testtable where acctset_code='00001'order by acctset_code limit 5 offset 5;
Query: select tally_id, acctset_code  from testtable where acctset_code='00001'order by acctset_code limit 5 offset 5
Query submitted at: 2020-06-23 10:58:28 (Coordinator: http://ip:25000)
Query progress can be monitored at: http://ip:25000/query_plan?query_id=39423b17b20dc603:66c4de7400000000
ERROR: Rejected query from pool root.hqueue: request memory needed 768.23 MB is greater than pool max mem resources 500.00 MB.

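Guess: 256M (the per-node estimate from the query plan) * 3 = 768M > 500M. The reported 768.23 MB suggests the unrounded estimate is slightly above 256M. Taking Experiments 1 and 2 together, Impala appears to apply Max(estimate, Minimum Query Memory Limit) when checking whether a query would exceed the pool memory: Max(256M, 260M) = 260M in Experiment 1 and Max(256M, 250M) = 256M in Experiment 2.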
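The Max step guessed above can be checked with a few lines of Python. This is only a sketch of the conjecture, not Impala's actual admission code; the function name `per_node_request_mb` is made up for illustration, and the node count of 3 is inferred from the error messages rather than stated anywhere in the output.

```python
def per_node_request_mb(estimate_mb, min_query_mem_limit_mb):
    """Conjectured per-node request: the larger of the planner estimate
    and the pool's Minimum Query Memory Limit."""
    return max(estimate_mb, min_query_mem_limit_mb)


NODES = 3          # assumption: inferred from 780 MB / 260 MB in Experiment 1
ESTIMATE_MB = 256  # rounded per-host estimate from the EXPLAIN output
POOL_MAX_MB = 500  # Max Memory of root.hqueue

for name, min_limit_mb in [("Experiment 1", 260), ("Experiment 2", 250)]:
    total_mb = per_node_request_mb(ESTIMATE_MB, min_limit_mb) * NODES
    verdict = "rejected" if total_mb > POOL_MAX_MB else "admitted"
    print(f"{name}: {total_mb} MB requested, pool max {POOL_MAX_MB} MB, {verdict}")
```

Both totals (780 MB and 768 MB) match the rejection messages, up to the rounding of the estimate in Experiment 2.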

Experiment 3

| Queue name | Max Memory | Minimum Query Memory Limit | Maximum Query Memory Limit |
| :--- | :--- | :--- | :--- |
| root.hqueue | 500M | 250M | 252M |

Result:

[ip:21000] testdb> select tally_id, acctset_code  from testtable where acctset_code='00001'order by acctset_code limit 5 offset 5;
Query: select tally_id, acctset_code  from testtable where acctset_code='00001'order by acctset_code limit 5 offset 5
Query submitted at: 2020-06-23 11:09:42 (Coordinator: http://ip:25000)
Query progress can be monitored at: http://ip:25000/query_plan?query_id=e24e74d387c201b5:9e72143600000000
ERROR: Rejected query from pool root.hqueue: request memory needed 756.00 MB is greater than pool max mem resources 500.00 MB

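Guess: 252M * 3 = 756M > 500M. Combined with Experiment 2, Impala appears to also apply a Min with Maximum Query Memory Limit when checking whether a query would exceed the pool memory, i.e. Min(Max(estimate, Minimum Query Memory Limit), Maximum Query Memory Limit). In this case that is Min(Max(256M, 250M), 252M) = 252M.

Experiment 4

mem_limit: sets the memory a query may use on each node.

| Queue name | Max Memory | Minimum Query Memory Limit | Maximum Query Memory Limit |
| :--- | :--- | :--- | :--- |
| root.hqueue | 500M | 100M | 200M |
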
[ip:21000] testdb> set mem_limit=170M;
MEM_LIMIT set to 170M
[ip:21000] testdb> select tally_id, acctset_code  from testtable where acctset_code='00001'order by acctset_code limit 5 offset 5;
Query: select tally_id, acctset_code  from testtable where acctset_code='00001'order by acctset_code limit 5 offset 5
Query submitted at: 2020-06-23 13:53:31 (Coordinator: http://ip:25000)
Query progress can be monitored at: http://ip:25000/query_plan?query_id=ba4fa4a44d2dac9d:b24a60d600000000
ERROR: Rejected query from pool root.hqueue: request memory needed 510.00 MB is greater than pool max mem resources 500.00 MB.
[ip:21000] testdb> set mem_limit=210M;
MEM_LIMIT set to 210M
[ip:21000] testdb> select tally_id, acctset_code  from testtable where acctset_code='00001'order by acctset_code limit 5 offset 5;
Query: select tally_id, acctset_code  from testtable where acctset_code='00001'order by acctset_code limit 5 offset 5
Query submitted at: 2020-06-23 13:54:07 (Coordinator: http://ip:25000)
Query progress can be monitored at: http://ip:25000/query_plan?query_id=ca49acba3c002727:2d69557a00000000
ERROR: Rejected query from pool root.hqueue: request memory needed 600.00 MB is greater than pool max mem resources 500.00 MB

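Analysis: with mem_limit=170M, Min(Max(170M, 100M), 200M) * 3 = 510M > 500M; with mem_limit=210M, Min(Max(210M, 100M), 200M) * 3 = 600M > 500M. The guess is that when mem_limit is set, Impala uses it in place of its own memory estimate, clamps it with Minimum Query Memory Limit and Maximum Query Memory Limit, and compares the result against Max Memory to decide whether to reject the query.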
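The two mem_limit cases can be reproduced with a small clamp function. This is a sketch of the guessed rule only; `clamp_per_node_mb` is a hypothetical helper name, and the node count of 3 is an assumption inferred from the error messages.

```python
def clamp_per_node_mb(requested_mb, min_limit_mb, max_limit_mb):
    """Clamp the requested per-node memory into the pool's
    [Minimum, Maximum] Query Memory Limit range."""
    return min(max(requested_mb, min_limit_mb), max_limit_mb)


NODES = 3          # assumption: inferred from the error messages
POOL_MAX_MB = 500  # Max Memory of root.hqueue
MIN_LIMIT_MB, MAX_LIMIT_MB = 100, 200  # pool settings in Experiment 4

for mem_limit_mb in (170, 210):
    per_node_mb = clamp_per_node_mb(mem_limit_mb, MIN_LIMIT_MB, MAX_LIMIT_MB)
    total_mb = per_node_mb * NODES
    verdict = "rejected" if total_mb > POOL_MAX_MB else "admitted"
    print(f"mem_limit={mem_limit_mb}M: {per_node_mb}M per node, "
          f"{total_mb}M total, {verdict}")
```

Note that mem_limit=210M is capped at 200M by the pool maximum, which is why the error reports 600 MB rather than 630 MB.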

Experiment 5

| Queue name | Max Memory | Minimum Query Memory Limit | Maximum Query Memory Limit |
| :--- | :--- | :--- | :--- |
| root.hqueue | 500M | 39M | 39M |

[ip:21000] testdb> select tally_id, acctset_code  from testtable where acctset_code='00001'order by acctset_code limit 5 offset 5;
Query: select tally_id, acctset_code  from testtable where acctset_code='00001'order by acctset_code limit 5 offset 5
Query submitted at: 2020-06-23 15:26:42 (Coordinator: http://ip:25000)
Query progress can be monitored at: http://ip:25000/query_plan?query_id=234ca270d3731d06:9980e6fd00000000
ERROR: Rejected query from pool root.hqueue: minimum memory reservation is greater than memory available to the query for buffer reservations. Memory reservation needed given the current plan: 8.00 MB. Adjust either the mem_limit or the pool config (max-query-mem-limit, min-query-mem-limit) for the query to allow the query memory limit to be at least 40.00 MB. Note that changing the mem_limit may also change the plan. See the query profile for more information about the per-node memory requirements.

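With the following configuration, the query was submitted and executed successfully:

| Queue name | Max Memory | Minimum Query Memory Limit | Maximum Query Memory Limit |
| :--- | :--- | :--- | :--- |
| root.hqueue | 500M | 40M | 40M |

Analysis: max-query-mem-limit and min-query-mem-limit must not be set too low. In this test environment, for this query (8.00 MB minimum reservation), the per-node query memory limit had to be at least 40M.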

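Conclusions

  1. When the query sets mem_limit, it is rejected for insufficient memory if:

    Min(Max(mem_limit, Minimum Query Memory Limit), Maximum Query Memory Limit) * number of nodes > Max Memory

  2. When mem_limit is not set, the query is rejected for insufficient memory if the condition below holds. The estimate can be obtained with EXPLAIN, although Impala's estimate is often inaccurate.

    Min(Max(estimate, Minimum Query Memory Limit), Maximum Query Memory Limit) * number of nodes > Max Memory

  3. max-query-mem-limit and min-query-mem-limit must not be set too low. In the test environment, a single node needed at least 40M.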
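The three conclusions can be put together into one small model and replayed against all five experiments. This is a sketch of the rule as guessed in this post, not Impala's real admission controller: the function name `admission_decision`, the node count of 3, and the 40M floor (observed only for this query in this test environment) are all assumptions.

```python
NODES = 3                 # assumption: inferred from the error messages
ESTIMATE_MB = 256         # rounded per-host estimate from EXPLAIN
POOL_MAX_MB = 500         # Max Memory of root.hqueue
MIN_USABLE_LIMIT_MB = 40  # assumption: floor observed in Experiment 5 only


def admission_decision(min_limit_mb, max_limit_mb, mem_limit_mb=None):
    """Return a verdict string under the rule guessed in this post."""
    # Conclusions 1 and 2: mem_limit, if set, replaces the estimate.
    requested_mb = mem_limit_mb if mem_limit_mb is not None else ESTIMATE_MB
    per_node_mb = min(max(requested_mb, min_limit_mb), max_limit_mb)
    # Conclusion 3: the per-node limit must cover the minimum reservation.
    if per_node_mb < MIN_USABLE_LIMIT_MB:
        return "rejected: per-node limit too small for the reservation"
    if per_node_mb * NODES > POOL_MAX_MB:
        return f"rejected: {per_node_mb * NODES} MB > pool max {POOL_MAX_MB} MB"
    return "admitted"


# (label, Minimum Query Memory Limit, Maximum Query Memory Limit, mem_limit)
experiments = [
    ("Experiment 1", 260, 270, None),
    ("Experiment 2", 250, 270, None),
    ("Experiment 3", 250, 252, None),
    ("Experiment 4, mem_limit=170M", 100, 200, 170),
    ("Experiment 4, mem_limit=210M", 100, 200, 210),
    ("Experiment 5, 39M", 39, 39, None),
    ("Experiment 5, 40M", 40, 40, None),
]

for label, min_mb, max_mb, mem_limit in experiments:
    print(f"{label}: {admission_decision(min_mb, max_mb, mem_limit)}")
```

Only the last configuration is admitted in this model, which agrees with the observed results.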

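Suggestions

  1. Configure Maximum Query Memory Limit * number of nodes <= Max Memory; queries should then not be rejected for exceeding the pool's Max Memory.
  2. If the pool has neither Minimum Query Memory Limit nor Maximum Query Memory Limit configured, then by the conclusions above Impala decides whether to reject the query based on whether estimate * number of nodes exceeds Max Memory. Because Impala's per-node memory estimate can be very inaccurate, in this case you can set a reasonable value by hand with set mem_limit = XXM; Impala will then use mem_limit * number of nodes to check against Max Memory.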
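For the second suggestion, the largest mem_limit that still fits the pool follows directly from the guessed rule. The helper below is a hypothetical illustration under that rule, assuming the pool has no Minimum or Maximum Query Memory Limit configured and the cluster has 3 nodes.

```python
def largest_fitting_mem_limit_mb(pool_max_mb, nodes):
    """Largest whole-MB per-node mem_limit such that
    mem_limit * nodes does not exceed the pool's Max Memory."""
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    return pool_max_mb // nodes


POOL_MAX_MB = 500  # Max Memory of root.hqueue
NODES = 3          # assumption: inferred from the error messages

limit_mb = largest_fitting_mem_limit_mb(POOL_MAX_MB, NODES)
print(f"set mem_limit={limit_mb}M;  -- {limit_mb * NODES} MB total, "
      f"within {POOL_MAX_MB} MB")
```

This is consistent with Experiment 4, where mem_limit=170M (510 MB in total) was already just over the 500 MB pool limit. Whether a limit that fits the pool is also large enough for the query to run is a separate question.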

