Hive Configuration Properties


1. Query and DDL Execution

  

 
Property Name    Default Value    Added In    Description
mapred.reduce.tasks
-1 Hive 0.1.0

The default number of reduce tasks per job. Typically set to a prime close to the number of available hosts. Ignored when mapred.job.tracker is "local". Hadoop sets this to 1 by default, whereas Hive uses -1 as its default value. By setting this property to -1, Hive will automatically figure out what the number of reducers should be.


 
hive.exec.reducers.bytes.per.reducer
 
 1,000,000,000 prior to Hive 0.14.0; 256 MB (256,000,000) in Hive 0.14.0 and later  Hive 0.2.0; default changed in 0.14.0 with HIVE-7158 (and HIVE-7917)

 Size per reducer. The default in Hive 0.14.0 and earlier is 1 GB, that is, if the input size is 10 GB then 10 reducers will be used. In Hive 0.14.0 and later the default is 256 MB, that is, if the input size is 1 GB then 4 reducers will be used.


hive.exec.reducers.max
 999 prior to Hive 0.14.0; 1009 in Hive 0.14.0 and later  Hive 0.2.0; default changed in 0.14.0 with HIVE-7158 (and HIVE-7917)

 Maximum number of reducers that will be used. If the one specified in the configuration property mapred.reduce.tasks is negative, Hive will use this as the maximum number of reducers when automatically determining the number of reducers.

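Taken together, these three properties determine the reducer count for a job. Below is a minimal sketch of the arithmetic, assuming the Hive 0.14+ defaults; the employees table is hypothetical.

-- Automatic reducer estimation (Hive 0.14+ defaults restated):
SET mapred.reduce.tasks=-1;                          -- -1 lets Hive estimate
SET hive.exec.reducers.bytes.per.reducer=256000000;  -- 256 MB of input per reducer
SET hive.exec.reducers.max=1009;                     -- hard upper bound
-- Estimated reducers = min(ceil(input_bytes / bytes.per.reducer), reducers.max);
-- e.g. 1 GB of input gives ceil(1000000000 / 256000000) = 4 reducers.

-- Or pin the count explicitly for the session:
SET mapred.reduce.tasks=12;
SELECT dept, COUNT(*) FROM employees GROUP BY dept;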

hive.jar.path
 (empty)  Hive 0.2.0 or earlier

 The location of hive_cli.jar that is used when submitting jobs in a separate jvm.


hive.aux.jars.path
 (empty)  Hive 0.2.0 or earlier

 The location of the plugin jars that contain implementations of user defined functions (UDFs) and SerDes.


hive.reloadable.aux.jars.path
 (empty)  Hive 0.14.0 with HIVE-7553

 The locations of the plugin jars, which can be comma-separated folders or jars. They can be renewed (added, removed, or updated) by executing the Beeline reload command without having to restart HiveServer2. These jars can be used just like the auxiliary classes in hive.aux.jars.path for creating UDFs or SerDes.

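A minimal sketch of the workflow, assuming a made-up folder path and UDF class:

-- hive-site.xml (illustrative value):
--   hive.reloadable.aux.jars.path = /opt/hive/reloadable-jars
-- After copying a new UDF jar into that folder, refresh from Beeline
-- without restarting HiveServer2:
reload;
CREATE TEMPORARY FUNCTION my_udf AS 'com.example.MyUdf';  -- hypothetical class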

hive.exec.scratchdir
 /tmp/${user.name} in Hive 0.2.0 through 0.8.0; /tmp/hive-${user.name} in Hive 0.8.1 through 0.14.0; or /tmp/hive in Hive 0.14.0 and later  Hive 0.2.0; default changed in 0.8.1 and in 0.14.0 with HIVE-6847 and HIVE-8143

 This directory is used by Hive to store the plans for different map/reduce stages for the query as well as to store the intermediate outputs of these stages.


hive.scratch.dir.permission
 700  Hive 0.12.0 with HIVE-4487

 The permission for the user-specific scratch directories that get created in the root scratch directory. (See hive.exec.scratchdir.)


hive.exec.local.scratchdir
 /tmp/${user.name}   Hive 0.10.0 with HIVE-1577

 Scratch space for Hive jobs when Hive runs in local mode.  Also see hive.exec.scratchdir.


hive.hadoop.supports.splittable.combineinputformat
 false  Hive 0.6.0 with HIVE-1280

 Whether to combine small input files so that fewer mappers are spawned.


hive.map.aggr
 true in Hive 0.3 and later; false in Hive 0.2  Hive 0.2.0

 Whether to use map-side aggregation in Hive Group By queries.


hive.groupby.skewindata
 false  Hive 0.3.0

Whether there is skew in data to optimize group by queries. 


hive.groupby.mapaggr.checkinterval
 100000  Hive 0.3.0

 Number of rows after which the size check of the grouping keys/aggregation classes is performed.


hive.new.job.grouping.set.cardinality
 30  Hive 0.11.0 with HIVE-3552

 Whether a new map-reduce job should be launched for grouping sets/rollups/cubes.


hive.mapred.local.mem
 0  Hive 0.3.0

 For local mode, memory of the mappers/reducers.


hive.map.aggr.hash.force.flush.memory.threshold
 0.9  Hive 0.7.0 with HIVE-1830

 The maximum memory to be used by map-side group aggregation hash table. If the memory usage is higher than this number, force to flush data.


hive.map.aggr.hash.percentmemory
 0.5   Hive 0.2.0

 Portion of total memory to be used by map-side group aggregation hash table.


hive.map.aggr.hash.min.reduction
 0.5  Hive 0.4.0

 Hash aggregation will be turned off if the ratio between hash table size and input rows is bigger than this number. Set to 1 to make sure hash aggregation is never turned off.

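A sketch of how these map-side aggregation knobs combine; the values restate the defaults above, and src is a hypothetical table:

SET hive.map.aggr=true;                                   -- aggregate on the map side
SET hive.map.aggr.hash.percentmemory=0.5;                 -- hash table may use half the heap
SET hive.map.aggr.hash.force.flush.memory.threshold=0.9;  -- flush once 90% of that is used
SET hive.map.aggr.hash.min.reduction=0.5;                 -- turn hashing off unless it halves the rows
SELECT key, SUM(value) FROM src GROUP BY key;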

hive.optimize.groupby
  true  Hive 0.5.0

 Whether to enable the bucketed group by from bucketed partitions/tables.


hive.optimize.countdistinct
 true  Hive 3.0.0 with HIVE-16654

 Whether to rewrite count distinct into 2 stages, i.e., the first stage uses multiple reducers with the count distinct key and the second stage uses a single reducer without key.


hive.optimize.remove.sq_count_check
 false  Hive 3.0.0 with HIVE-16793

 Whether to remove an extra join with sq_count_check UDF for scalar subqueries with constant group by keys. 


hive.multigroupby.singlereducer
true  Hive 0.9.0 with HIVE-2621

 Whether to optimize multi group by query to generate a single M/R  job plan. If the multi group by query has common group by keys, it will be optimized to generate a single M/R job.


 

hive.optimize.index.filter
  false  Hive 0.8.0 with HIVE-1644  

Whether to enable automatic use of indexes.

Note:  See Indexing for more configuration properties related to Hive indexes.


hive.optimize.ppd
 true  Hive 0.4.0 with HIVE-279, default changed to true in Hive 0.4.0 with HIVE-626  

Whether to enable predicate pushdown (PPD). 

Note: Turn on hive.optimize.index.filter as well to use file format specific indexes with PPD.


hive.optimize.ppd.storage
 true  Hive 0.7.0

 Whether to push predicates down into storage handlers. Ignored when hive.optimize.ppd is false.


hive.ppd.remove.duplicatefilters
 true  Hive 0.8.0

 During query optimization, filters may be pushed down in the operator tree. If this config is true, only pushed down filters remain in the operator tree, and the original filter is removed. If this config is false, the original filter is also left in the operator tree at the original place.


hive.ppd.recognizetransivity
 true  Hive 0.8.1

 Whether to transitively replicate predicate filters over equijoin conditions.


hive.join.emit.interval
 1000  Hive 0.2.0

 How many rows in the right-most join operand Hive should buffer before emitting the join result.


hive.join.cache.size
  25000  Hive 0.5.0

 How many rows in the joining tables (except the streaming table)
should be cached in memory.


hive.mapjoin.bucket.cache.size
 100  Hive 0.5.0 (replaced by hive.smbjoin.cache.rows in Hive 0.12.0)

 How many values in each key in the map-joined table should be cached in memory.


hive.mapjoin.followby.map.aggr.hash.percentmemory
 0.3  Hive 0.7.0 with HIVE-1830

 Portion of total memory to be used by map-side group aggregation hash table, when this group by is followed by map join.


hive.smalltable.filesize or hive.mapjoin.smalltable.filesize
 25000000  Hive 0.7.0 with HIVE-1642 (hive.smalltable.filesize replaced by hive.mapjoin.smalltable.filesize in Hive 0.8.1)

 The threshold (in bytes) for the input file size of the small tables; if the file size is smaller than this threshold, it will try to convert the common join into map join.


hive.mapjoin.check.memory.rows
 100000  Hive 0.7.0 with HIVE-1808 and HIVE-1642

 The number means after how many rows processed it needs to check the memory usage.


hive.ignore.mapjoin.hint
 true  Hive 0.11.0 with HIVE-4042

 Whether Hive ignores the mapjoin hint.


hive.smbjoin.cache.rows
 10000  Hive 0.12.0 with HIVE-4440 (replaces hive.mapjoin.bucket.cache.size)

How many rows with the same key value should be cached in memory per sort-merge-bucket joined table.


hive.mapjoin.optimized.hashtable
 true  Hive 0.14.0 with HIVE-6430 

 Whether Hive should use a memory-optimized hash table for MapJoin. Only works on Tez and Spark, because memory-optimized hash table cannot be serialized. (Spark is supported starting from Hive 1.3.0, with HIVE-11180.)


hive.hashtable.initialCapacity
 100000  Hive 0.7.0 with HIVE-1642

 Initial capacity of mapjoin hashtable if statistics are absent, or if hive.hashtable.key.count.adjustment is set to 0.


hive.hashtable.loadfactor
 0.75  Hive 0.7.0 with HIVE-1642

 In the process of Mapjoin, the key/value will be held in the hashtable. This value means the load factor for the in-memory hashtable.


hive.hashtable.key.count.adjustment
 1.0  Hive 0.14.0 with HIVE-7616

 Adjustment to mapjoin hashtable size derived from table and column statistics; the estimate of the number of keys is divided by this value. If the value is 0, statistics are not used and hive.hashtable.initialCapacity is used instead.


hive.debug.localtask
 false  Hive 0.7.0 with HIVE-1642  
hive.optimize.skewjoin
 false  Hive 0.6.0

 Whether to enable skew join optimization.  (Also see hive.optimize.skewjoin.compiletime.)


hive.skewjoin.key
 100000  Hive 0.6.0

 Determine if we get a skew key in join. If we see more than the specified number of rows with the same key in join operator, we think the key as a skew join key.


hive.skewjoin.mapjoin.map.tasks
 10000  Hive 0.6.0

 Determine the number of map task used in the follow up map join job for a skew join. It should be used together with hive.skewjoin.mapjoin.min.split to perform a fine grained control.


hive.skewjoin.mapjoin.min.split
 33554432  Hive 0.6.0

 Determine the number of map task at most used in the follow up map join job for a skew join by specifying the minimum split size. It should be used together with hive.skewjoin.mapjoin.map.tasks to perform a fine grained control.

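A sketch enabling runtime skew-join handling; the values restate the defaults described above:

SET hive.optimize.skewjoin=true;               -- default is false
SET hive.skewjoin.key=100000;                  -- rows per key before it counts as skewed
SET hive.skewjoin.mapjoin.map.tasks=10000;     -- map tasks for the follow-up map join
SET hive.skewjoin.mapjoin.min.split=33554432;  -- 32 MB minimum split size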

hive.optimize.skewjoin.compiletime
 false  Hive 0.10.0

 Whether to create a separate plan for skewed keys for the tables in the join. This is based on the skewed keys stored in the metadata. At compile time, the plan is broken into different joins: one for the skewed keys, and the other for the remaining keys. And then, a union is performed for the two joins generated above. So unless the same skewed key is present in both the joined tables, the join for the skewed key will be performed as a map-side join.


hive.optimize.union.remove
 false  Hive 0.10.0 with HIVE-3276

 Whether to remove the union and push the operators between union and the filesink above union. This avoids an extra scan of the output by union. This is independently useful for union queries, and especially useful when hive.optimize.skewjoin.compiletime is set to true, since an extra union is inserted.


hive.mapred.supports.subdirectories
 false  Hive 0.10.0 with HIVE-3276

 Whether the version of Hadoop which is running supports sub-directories for tables/partitions. Many Hive optimizations can be applied if the Hadoop version supports sub-directories for tables/partitions. This support was added by MAPREDUCE-1501.


hive.mapred.mode
 Hive 0.x: nonstrict; Hive 1.x: nonstrict; Hive 2.x: strict (HIVE-12413)  Hive 0.3.0

 The mode in which the Hive operations are being performed. In strict mode, some risky queries are not allowed to run. For example, full table scans are prevented (see HIVE-10454) and ORDER BY requires a LIMIT clause.

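For example, under strict mode (sales is a hypothetical table):

SET hive.mapred.mode=strict;
-- Rejected: ORDER BY without LIMIT would push the whole result through one reducer.
-- SELECT * FROM sales ORDER BY amount;
-- Accepted: the LIMIT bounds the single reducer's work.
SELECT * FROM sales ORDER BY amount LIMIT 100;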

hive.exec.script.maxerrsize
 100000  Hive 0.2.0

 Maximum number of bytes a script is allowed to emit to standard error (per map-reduce task). This prevents runaway scripts from filling log partitions to capacity.


hive.script.auto.progress
 false  Hive 0.4.0

 Whether Hive Transform/Map/Reduce Clause should automatically send progress information to TaskTracker to avoid the task getting killed because of inactivity. Hive sends progress information when the script is outputting to stderr. This option removes the need of periodically producing stderr messages, but users should be cautious because this may prevent infinite loops in the scripts to be killed by TaskTracker.


hive.exec.script.allow.partial.consumption
 false  Hive 0.5.0

 When enabled, this option allows a user script to exit successfully without consuming all the data from the standard input.


hive.script.operator.id.env.var
 HIVE_SCRIPT_OPERATOR_ID  Hive 0.5.0

 Name of the environment variable that holds the unique script operator ID in the user's transform function (the custom mapper/reducer that the user has specified in the query).


hive.script.operator.env.blacklist
 hive.txn.valid.txns,hive.script.operator.env.blacklist  Hive 0.14.0 with HIVE-8341

 By default all values in the HiveConf object are converted to environment variables of the same name as the key (with '.' (dot) converted to '_' (underscore)) and set as part of the script operator's environment.  However, some values can grow large or are not amenable to translation to environment variables.  This value gives a comma separated list of configuration values that will not be set in the environment when calling a script operator.  By default the valid transaction list is excluded, as it can grow large and is sometimes compressed, which does not translate well to an environment variable.


hive.exec.compress.output
 false  Hive 0.2.0

 This controls whether the final outputs of a query (to a local/hdfs file or a Hive table) are compressed. The compression codec and other options are determined from Hadoop configuration variables mapred.output.compress*.


hive.exec.compress.intermediate
 false  Hive 0.2.0

 This controls whether intermediate files produced by Hive between multiple map-reduce jobs are compressed. The compression codec and other options are determined from Hadoop configuration variables mapred.output.compress*.

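A sketch combining both switches with the Hadoop-side codec settings they delegate to; SnappyCodec is just one common choice:

SET hive.exec.compress.intermediate=true;   -- compress data between MR stages
SET hive.exec.compress.output=true;         -- compress the final query output
SET mapred.output.compress=true;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;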

hive.exec.parallel
 false  Hive 0.5.0

 Whether to execute jobs in parallel.  Applies to MapReduce jobs that can run in parallel, for example jobs processing different source tables before a join.  As of Hive 0.14, also applies to move tasks that can run in parallel, for example moving files to insert targets during multi-insert.


hive.exec.parallel.thread.number
 8  Hive 0.6.0

 How many jobs at most can be executed in parallel.


hive.exec.rowoffset
 false  Hive 0.8.0

 Whether to provide the row offset virtual column.


hive.counters.group.name
 HIVE  Hive 0.13.0 with HIVE-4518

 The name of the counter group for counters used during query execution. The counter group is used for internal Hive variables (CREATED_FILE, FATAL_ERROR, and so on).
hive.exec.pre.hooks
 (empty)  Hive 0.4.0

 Comma-separated list of pre-execution hooks to be invoked for each statement. A pre-execution hook is specified as the name of a Java class which implements the org.apache.hadoop.hive.ql.hooks.ExecuteWithHookContext interface.


hive.exec.post.hooks
 (empty)  Hive 0.5.0

 Comma-separated list of post-execution hooks to be invoked for each statement. A post-execution hook is specified as the name of a Java class which implements the org.apache.hadoop.hive.ql.hooks.ExecuteWithHookContext interface.


hive.exec.failure.hooks
 (empty)  Hive 0.8.0

 Comma-separated list of on-failure hooks to be invoked for each statement. An on-failure hook is specified as the name of Java class which implements the org.apache.hadoop.hive.ql.hooks.ExecuteWithHookContext interface.


hive.merge.mapfiles
 true  Hive 0.4.0

 Merge small files at the end of a map-only job.


hive.merge.mapredfiles
 false  Hive 0.4.0

 Merge small files at the end of a map-reduce job.


hive.merge.size.per.task
 256000000  Hive 0.4.0

 Size of merged files at the end of the job.


hive.merge.smallfiles
.avgsize
 16000000   Hive 0.5.0

 When the average output file size of a job is less than this number, Hive will start an additional map-reduce job to merge the output files into bigger files. This is only done for map-only jobs if hive.merge.mapfiles is true, and for map-reduce jobs if hive.merge.mapredfiles is true.

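A sketch of the merge settings as a group; the values restate the defaults except where noted:

SET hive.merge.mapfiles=true;                 -- merge after map-only jobs (default)
SET hive.merge.mapredfiles=true;              -- also merge after map-reduce jobs (default is false)
SET hive.merge.size.per.task=256000000;       -- target size of merged files
SET hive.merge.smallfiles.avgsize=16000000;   -- merge triggers below this average size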

hive.heartbeat.interval
 1000  Hive 0.4.0

 Send a heartbeat after this interval – used by mapjoin and filter operators.


hive.auto.convert.join
 true in Hive 0.11.0 and later (HIVE-3297)  Hive 0.7.0 with HIVE-1642

 Whether Hive enables the optimization about converting common join into mapjoin based on the input file size. (Note that hive-default.xml.template incorrectly gives the default as false in Hive 0.11.0 through 0.13.1.)


hive.auto.convert.join.noconditionaltask
 true  Hive 0.11.0 with HIVE-3784 (default changed to true with HIVE-4146)

 Whether Hive enables the optimization about converting common join into mapjoin based on the input file size. If this parameter is on, and the sum of size for n-1 of the tables/partitions for an n-way join is smaller than the size specified by hive.auto.convert.join.noconditionaltask.size, the join is directly converted to a mapjoin (there is no conditional task).


hive.auto.convert.join.noconditionaltask.size
 10000000  Hive 0.11.0 with HIVE-3784

 If hive.auto.convert.join.noconditionaltask is off, this parameter does not take effect. However, if it is on, and the sum of size for n-1 of the tables/partitions for an n-way join is smaller than this size, the join is directly converted to a mapjoin (there is no conditional task). The default is 10MB.

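A sketch of the three auto map-join properties together, assuming hypothetical orders and customers tables where customers is small:

SET hive.auto.convert.join=true;
SET hive.auto.convert.join.noconditionaltask=true;
SET hive.auto.convert.join.noconditionaltask.size=10000000;  -- 10 MB threshold
-- If customers fits under the threshold, the common join becomes a map join
-- with no conditional task and no MAPJOIN hint:
SELECT o.id, c.name
FROM orders o JOIN customers c ON o.cust_id = c.id;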

hive.auto.convert.join.use.nonstaged
 false  Hive 0.13.0; default changed to false with HIVE-6749, also in 0.13.0

 For conditional joins, if input stream from a small alias can be directly applied to the join operator without filtering or projection, the alias need not be pre-staged in the distributed cache via a mapred local task. Currently, this is not working with vectorization or Tez execution engine.


hive.merge.nway.joins
 true  Hive 2.2.0 with HIVE-15655

 For multiple joins on the same condition, merge joins together into a single join operator. This is useful in the case of large shuffle joins to avoid a reshuffle phase. Disabling this in Tez will often provide a faster join algorithm in case of left outer joins or a general Snowflake schema.


hive.udtf.auto.progress
 false  Hive 0.5.0

 Whether Hive should automatically send progress information to TaskTracker when using UDTF's to prevent the task getting killed because of inactivity. Users should be cautious because this may prevent TaskTracker from killing tasks with infinite loops.


hive.exec.counters.pull.interval
 1000  Hive 0.6.0

 The interval with which to poll the JobTracker for the counters of the running job. The smaller it is, the more load there will be on the JobTracker; the higher it is, the less granular the captured counters will be.


hive.optimize.bucketingsorting
 true  Hive 0.11.0 with HIVE-4240  

If hive.enforce.bucketing or hive.enforce.sorting is true, don't create a reducer for enforcing bucketing/sorting for queries of the form:

insert overwrite table T2 select * from T1;


hive.optimize.reducededuplication
 true  Hive 0.6.0

 Remove extra map-reduce jobs if the data is already clustered by the same key which needs to be used again. This should always be set to true. Since it is a new feature, it has been made configurable.


hive.optimize.correlation
  false  Hive 0.12.0 with HIVE-2206

 Exploit intra-query correlations. For details see the Correlation Optimizer design document.


hive.optimize.limittranspose
 false  Hive 2.0.0 with HIVE-11684, modified by HIVE-11775

 Whether to push a limit through left/right outer join or union. If the value is true and the size of the outer input is reduced enough (as specified in hive.optimize.limittranspose.reductionpercentage and hive.optimize.limittranspose.reductiontuples), the limit is pushed to the outer input or union; to remain semantically correct, the limit is kept on top of the join or the union too.


hive.optimize.limittranspose.reductionpercentage
 1.0  Hive 2.0.0 with HIVE-11684, modified by HIVE-11775

 When hive.optimize.limittranspose is true, this variable specifies the minimal percentage (fractional) reduction of the size of the outer input of the join or input of the union that the optimizer should get in order to apply the rule.


hive.optimize.limittranspose.reductiontuples
 0  Hive 2.0.0 with HIVE-11684, modified by HIVE-11775

 When hive.optimize.limittranspose is true, this variable specifies the minimal reduction in the number of tuples of the outer input of the join or input of the union that the optimizer should get in order to apply the rule.


hive.optimize.filter.stats.reduction
  false  Hive 2.1.0 with HIVE-13269

 Whether to simplify comparison expressions in filter operators using column stats.


hive.optimize.sort.dynamic.partition
 false in Hive 0.14.0 and later (HIVE-8151)  Hive 0.13.0 with HIVE-6455

 When enabled, dynamic partitioning column will be globally sorted. This way we can keep only one record writer open for each partition value in the reducer thereby reducing the memory pressure on reducers.


hive.cbo.enable
 true in Hive 1.1.0 and later (HIVE-8395)  Hive 0.14.0 with HIVE-5775 and HIVE-7946

 When true, the cost based optimizer, which uses the Calcite framework, will be enabled.


hive.cbo.returnpath.hiveop
 false  Hive 1.2.0 with HIVE-9581 and HIVE-9795 

 When true, this optimization to CBO Logical plan will add rule to introduce not null filtering on join keys.  Controls Calcite plan to Hive operator conversion.  Overrides hive.optimize.remove.identity.project when set to false.


hive.cbo.cnf.maxnodes
 -1   Hive 2.1.1 with HIVE-14021

 When converting to conjunctive normal form (CNF), fail if the expression exceeds the specified threshold; the threshold is expressed in terms of the number of nodes (leaves and interior nodes). The default, -1, does not set up a threshold.


hive.optimize.null.scan
 true  Hive 0.14.0 with HIVE-7385

 When true, this optimization will try to not scan any rows from tables which can be determined at query compile time to not generate any rows (e.g., where 1 = 2, where false, limit 0 etc.).


hive.exec.dynamic.partition
 true in Hive 0.9.0 and later  Hive 0.6.0

 Whether or not to allow dynamic partitions in DML/DDL.


hive.exec.dynamic.partition.mode
 strict  Hive 0.6.0

 In strict mode, the user must specify at least one static partition in case the user accidentally overwrites all partitions. In nonstrict mode all partitions are allowed to be dynamic.


hive.exec.max.dynamic.partitions
 1000  Hive 0.6.0

 Maximum number of dynamic partitions allowed to be created in total.


hive.exec.max.dynamic.partitions.pernode
 100  Hive 0.6.0

 Maximum number of dynamic partitions allowed to be created in each mapper/reducer node.


hive.exec.max.created.files
 100000  Hive 0.7.0

 Maximum number of HDFS files created by all mappers/reducers in a MapReduce job.


hive.exec.default.partition.name
  __HIVE_DEFAULT_PARTITION__  Hive 0.6.0

 The default partition name in case the dynamic partition column value is null/empty string or any other values that cannot be escaped. This value must not contain any special character used in HDFS URI (e.g., ':', '%', '/' etc). The user has to be aware that the dynamic partition value should not contain this value to avoid confusions.

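A sketch of a dynamic-partition insert tying these properties together; page_views and staging_page_views are hypothetical, with page_views partitioned by (dt, country):

SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;   -- allow every partition column to be dynamic
SET hive.exec.max.dynamic.partitions=1000;
SET hive.exec.max.dynamic.partitions.pernode=100;
INSERT OVERWRITE TABLE page_views PARTITION (dt, country)
SELECT uid, url, dt, country FROM staging_page_views;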

hive.fetch.output.serde
 org.apache.hadoop.hive.serde2.DelimitedJSONSerDe  Hive 0.7.0

 The SerDe used by FetchTask to serialize the fetch output.


hive.exec.mode.local.auto
  false  Hive 0.7.0 with HIVE-1408

 Lets Hive determine whether to run in local mode automatically.


hive.exec.mode.local.auto.inputbytes.max
 134217728  Hive 0.7.0 with HIVE-1408

 When hive.exec.mode.local.auto is true, input bytes should be less than this for local mode.


hive.exec.mode.local.auto.input.files.max
 4  Hive 0.9.0 with HIVE-2651

 When hive.exec.mode.local.auto is true, the number of tasks should be less than this for local mode.


hive.exec.drop.ignorenonexistent
 true  Hive 0.7.0 with HIVE-1856 and HIVE-1858

 Do not report an error if DROP TABLE/VIEW/PARTITION/INDEX/TEMPORARY FUNCTION specifies a non-existent table/view. Also applies to permanent functions as of Hive 0.13.0.


hive.exec.show.job.failure.debug.info
 true  Hive 0.7.0

 If a job fails, whether to provide a link in the CLI to the task with the most failures, along with debugging hints if applicable.


hive.auto.progress.timeout
 0  Hive 0.7.0

 How long to run autoprogressor for the script/UDTF operators (in seconds). Set to 0 for forever.


hive.table.parameters.default
 (empty)  Hive 0.7.0

 Default property values for newly created tables.


hive.variable.substitute
 true  Hive 0.7.0  This enables substitution using syntax like ${var}, ${system:var} and ${env:var}.
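A short sketch of substitution in practice; the table name is hypothetical:

SET hive.variable.substitute=true;
SET hivevar:region=us_west;
SELECT * FROM sales_${hivevar:region} LIMIT 10;   -- expands to sales_us_west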
hive.error.on.empty.partition
  false  Hive 0.7.0

 Whether to throw an exception if dynamic partition insert generates empty results.


hive.exim.uri.scheme.whitelist
 hdfs,pfile,file in Hive 2.2.0 and later  default changed in Hive 2.2.0 with HIVE-15151

 A comma separated list of acceptable URI schemes for import and export.


hive.limit.row.max.size
 100000  Hive 0.8.0

 When trying a smaller subset of data for simple LIMIT, how much size we need to guarantee each row to have at least.


hive.limit.optimize.limit.file
 10  Hive 0.8.0

 When trying a smaller subset of data for simple LIMIT, maximum number of files we can sample.


hive.limit.optimize.enable
 false  Hive 0.8.0

 Whether to enable to optimization to trying a smaller subset of data for simple LIMIT first.


hive.limit.optimize.fetch.max
 50000  Hive 0.8.0

 Maximum number of rows allowed for a smaller subset of data for simple LIMIT, if it is a fetch query. Insert queries are not restricted by this limit.


hive.rework.mapredwork
 false  Hive 0.8.0

 Should rework the mapred work or not. This is first introduced by SymlinkTextInputFormat to replace symlink files with real paths at compile time.


hive.sample.seednumber
 0  Hive 0.8.0

 A number used for percentage sampling. By changing this number, the user will change the subsets of data sampled.


hive.autogen.columnalias.prefix.label
  _c  Hive 0.8.0

 String used as a prefix when auto generating column alias. By default the prefix label will be appended with a column position number to form the column alias. Auto generation would happen if an aggregate function is used in a select clause without an explicit alias.


hive.autogen.columnalias.prefix.includefuncname
 false  Hive 0.8.0

 Whether to include function name in the column alias auto generated by Hive.


hive.exec.perf.logger
 org.apache.hadoop.hive.ql.log.PerfLogger  Hive 0.8.0

 The class responsible for logging client side performance metrics. Must be a subclass of org.apache.hadoop.hive.ql.log.PerfLogger.


hive.start.cleanup.scratchdir
 false  Hive 1.3.0 with HIVE-10415

 To clean up the Hive scratch directory while starting the Hive server (or HiveServer2). This is not an option for a multi-user environment since it will accidentally remove the scratch directory in use.


hive.scratchdir.lock
 false  Hive 1.3.0 and 2.1.0 (but not 2.0.x) with HIVE-13429  

When true, holds a lock file in the scratch directory. If a Hive process dies and accidentally leaves a dangling scratchdir behind, the cleardanglingscratchdir tool will remove it. When false, does not create a lock file and therefore the cleardanglingscratchdir tool cannot remove any dangling scratch directories.


hive.output.file.extension
 (empty)  Hive 0.8.1

 String used as a file extension for output files. If not set, defaults to the codec extension for text files (e.g. ".gz"), or no extension otherwise.


hive.insert.into.multilevel.dirs
 false  Hive 0.8.1

 Whether to insert into multilevel nested directories like "insert directory '/HIVEFT25686/chinna/' from table".


hive.conf.validation
 true   Hive 0.10.0 with HIVE-2848

 Enables type checking for registered Hive configurations.


hive.fetch.task.conversion
  more in Hive 0.14.0 and later   Hive 0.14.0 with HIVE-7397

 Some select queries can be converted to a single FETCH task, minimizing latency. Currently the query should be single sourced not having any subquery and should not have any aggregations or distincts (which incur RS – ReduceSinkOperator, requiring a MapReduce task), lateral views and joins.


Supported values are none, minimal and more.

0. none:  Disable hive.fetch.task.conversion (value added in Hive 0.14.0 with HIVE-8389)
1. minimal:  SELECT *, FILTER on partition columns (WHERE and HAVING clauses), LIMIT only
2. more:  SELECT, FILTER, LIMIT only (including TABLESAMPLE, virtual columns)

"more" can take any kind of expressions in the SELECT clause, including UDFs.
(UDTFs and lateral views are not yet supported – see HIVE-5718.)
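For illustration, assuming a hypothetical table page_views partitioned by dt:

SET hive.fetch.task.conversion=more;
-- Served by a plain FETCH task; no MapReduce job is launched:
SELECT url, upper(country) FROM page_views WHERE dt='2018-01-01' LIMIT 10;
-- Still launches a job: the aggregation requires a ReduceSinkOperator.
SELECT country, COUNT(*) FROM page_views GROUP BY country;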

hive.map.groupby.sorted
 Hive 2.0 and later: true (HIVE-12325)  Hive 0.10.0 with HIVE-3432

 If the bucketing/sorting properties of the table exactly match the grouping key, whether to perform the group by in the mapper by using BucketizedHiveInputFormat. The only downside to this is that it limits the number of mappers to the number of files.


hive.groupby.orderby.position.alias
 false  Hive 0.11.0 with HIVE-581

 Whether to enable using Column Position Alias in GROUP BY and ORDER BY clauses of queries (deprecated as of Hive 2.2.0; use hive.groupby.position.alias and hive.orderby.position.alias instead).


hive.groupby.position.alias
 false  Hive 2.2.0 with HIVE-15797

 Whether to enable using Column Position Alias in GROUP BY.


hive.orderby.position.alias
 true  Hive 2.2.0 with HIVE-15797

 Whether to enable using Column Position Alias in ORDER BY.

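For example, with both switches on (employees is a hypothetical table):

SET hive.groupby.position.alias=true;
SET hive.orderby.position.alias=true;
-- 1 refers to dept and 2 to COUNT(*) by position:
SELECT dept, COUNT(*) FROM employees GROUP BY 1 ORDER BY 2 DESC;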

hive.fetch.task.aggr
 false  Hive 0.12.0 with HIVE-4002

 Aggregation queries with no group-by clause (for example, select count(*) from src) execute final aggregations in a single reduce task. If this parameter is set to true, Hive delegates the final aggregation stage to a fetch task, possibly decreasing the query time.


hive.fetch.task.conversion.threshold
  -1 in Hive 0.13.0 and 0.13.1, 1073741824 (1 GB) in Hive 0.14.0 and later    Hive 0.13.0

 Input threshold (in bytes) for applying hive.fetch.task.conversion. If target table is native, input length is calculated by summation of file lengths. If it's not native, the storage handler for the table can optionally implement the org.apache.hadoop.hive.ql.metadata.InputEstimator interface. A negative threshold means hive.fetch.task.conversion is applied without any input length threshold.


hive.limit.pushdown.memory.usage
  -1  Hive 0.12.0 with HIVE-3562

 The maximum memory to be used for hash in RS operator for top K selection. The default value "-1" means no limit.


hive.cache.expr.evaluation
 true  Hive 0.12.0 with HIVE-4209

 If true, the evaluation result of a deterministic expression referenced twice or more will be cached. For example, in a filter condition like "... where key + 10 > 10 or key + 10 = 0" the expression "key + 10" will be evaluated/cached once and reused for the following expression ("key + 10 = 0"). Currently, this is applied only to expressions in select or filter operators.


hive.resultset.use.unique.column.names
  true  Hive 0.13.0 with HIVE-6687

 Make column names unique in the result set by qualifying column names with table alias if needed. Table alias will be added to column names for queries of type "select *" or if query explicitly uses table alias "select r1.x..".


hive.support.quoted.identifiers
 column  Hive 0.13.0 with HIVE-6013

 Whether to use quoted identifiers.  Value can be "none" or "column".


hive.plan.serialization.format
 kryo  Hive 0.13.0 with HIVE-1511

 Query plan format serialization between client and task nodes. Two supported values are kryo and javaXML. Kryo is the default.


hive.exec.check.crossproducts
 true   Hive 0.13.0 with HIVE-6643

 Check if a query plan contains a cross product. If there is one, output a warning to the session's console.


hive.display.partition.cols.separately
 true  Hive 0.13.0 with HIVE-6689

 In older Hive versions (0.10 and earlier) no distinction was made between partition columns or non-partition columns while displaying columns in DESCRIBE TABLE. From version 0.12 onwards, they are displayed separately. This flag will let you get the old behavior, if desired. See test-case in patch for HIVE-6689.


hive.optimize.sampling.orderby
  false  Hive 0.12.0 with HIVE-1402

 Uses sampling on order-by clause for parallel execution.


hive.optimize.sampling.orderby.number
 1000  Hive 0.12.0 with HIVE-1402

 With hive.optimize.sampling.orderby=true, total number of samples to be obtained to calculate partition keys.


hive.optimize.sampling.orderby.percent
 0.1  Hive 0.12.0 with HIVE-1402

 With hive.optimize.sampling.orderby=true, probability with which a row will be chosen.


hive.compat
 0.12  Hive 0.13.0 with HIVE-6012

 Enable (configurable) deprecated behaviors of arithmetic operations by setting the desired level of backward compatibility. The default value gives backward-compatible return types for numeric operations. Other supported release numbers give newer behavior for numeric operations, for example 0.13 gives the more SQL compliant return types introduced in HIVE-5356.


hive.optimize.constant.propagation
 true  Hive 0.14.0 with HIVE-5771

 Whether to enable the constant propagation optimizer.


hive.entity.capture.transform
 false  Hive 1.1.0 with HIVE-8938

 Enable capturing compiler read entity of transform URI which can be introspected in the semantic and exec hooks.


hive.support.sql11.reserved.keywords
 true  Hive 1.2.0 with HIVE-6617

 Whether to enable support for SQL2011 reserved keywords. When enabled, will support (part of) SQL2011 reserved keywords.


hive.log.explain.output
 false  Hive 1.1.0 with HIVE-8600

 When enabled, will log EXPLAIN EXTENDED output for the query at log4j INFO level and in WebUI / Drilldown / Query Plan.


hive.explain.user
 false  Hive 1.2.0 with HIVE-9780

 Whether to show explain result at user level. When enabled, will log EXPLAIN output for the query at user level. (Tez only.  For Spark, see hive.spark.explain.user.)


hive.typecheck.on.insert
 true  Hive 0.12.0 with HIVE-5297 for insert partition

 Whether to check, convert, and normalize partition value specified in partition specification to conform to the partition column type.


hive.exec.temporary.table.storage
 default   Hive 1.1.0 with HIVE-7313

 Define the storage policy for temporary tables. Choices between memory, ssd and default. See HDFS Storage Types and Storage Policies.


hive.optimize.distinct.rewrite
 true  Hive 1.2.0 with HIVE-10568

 When applicable, this optimization rewrites distinct aggregates from a single-stage to multi-stage aggregation. This may not be optimal in all cases. Ideally, whether to trigger it or not should be a cost-based decision. Until Hive formalizes the cost model for this, this is config driven.


hive.optimize.point.lookup
 true  Hive 2.0.0 with HIVE-11461

 Whether to transform OR clauses in Filter operators into IN clauses.


hive.optimize.point.lookup.min
 31  Hive 2.0.0 with HIVE-11573

 Minimum number of OR clauses needed to transform into IN clauses.

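A sketch of the rewrite these two properties control:

SET hive.optimize.point.lookup=true;
SET hive.optimize.point.lookup.min=31;   -- default threshold
-- With at least 31 equality branches on one column, a filter such as
--   WHERE id = 1 OR id = 2 OR ... OR id = 31
-- is rewritten by the optimizer to
--   WHERE id IN (1, 2, ..., 31)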

hive.allow.udf.load.on.demand
 false  Hive 2.1.0 with HIVE-13596

 Whether to enable loading UDFs from the metastore on demand; this is mostly relevant for HS2 and was the default behavior before Hive 1.2.


hive.async.log.enabled
 true  Hive 2.1.0 with HIVE-13027

 Whether to enable Log4j2's asynchronous logging. Asynchronous logging can give significant performance improvement as logging will be handled in a separate thread that uses the LMAX disruptor queue for buffering log messages.


hive.msck.repair.batch.size
 0  Hive 2.2.0 with HIVE-12077

 To run the MSCK REPAIR TABLE command batch-wise. If there is a large number of untracked partitions, by configuring a value to the property it will execute in batches internally. The default value of the property is zero, which means it will execute all the partitions at once.

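For example (page_views is a hypothetical partitioned table):

SET hive.msck.repair.batch.size=1000;   -- register untracked partitions 1000 at a time
MSCK REPAIR TABLE page_views;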

hive.exec.copyfile.maxnumfiles
 1   Hive 2.3.0 with HIVE-14864

 Maximum number of files Hive uses to do sequential HDFS copies between directories. Distributed copies (distcp) will be used instead for larger numbers of files so that copies can be done faster.


hive.exec.copyfile.maxsize
  32 megabytes  Hive 1.1.0 with HIVE-8750

 Maximum file size (in bytes) that Hive uses to do single HDFS copies between directories. Distributed copies (distcp) will be used instead for bigger files so that copies can be done faster.


hive.exec.stagingdir
 hive-staging  Hive 1.1.0 with HIVE-8750

 Directory name that will be created inside table locations in order to support HDFS encryption. This replaces hive.exec.scratchdir for query results, with the exception of read-only tables. In all cases hive.exec.scratchdir is still used for other temporary files, such as job plans.


hive.query.lifetime.hooks
 (empty)  Hive 2.3.0 with HIVE-14340

 A comma separated list of hooks which implement QueryLifeTimeHook. These will be triggered before/after query compilation and before/after query execution, in the order specified. As of Hive 3.0.0 (HIVE-16363), this config can be used to specify implementations of QueryLifeTimeHookWithParseHooks. If they are specified then they will be invoked in the same places as QueryLifeTimeHooks and will be invoked during pre and post query parsing.


hive.remove.orderby.in.subquery
 true  Hive 3.0.0 with HIVE-6348

 If set to true, order/sort by without limit in subqueries and views will be removed.


 

2. SerDes and I/O

  2.1 SerDes

 
Property Name    Default Value    Added In    Description
hive.script.serde
 org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe  Hive 0.4.0

The default SerDe for transmitting input data to and reading output data from the user scripts.


hive.script.recordreader
 org.apache.hadoop.hive.ql.exec.TextRecordReader  Hive 0.4.0

 The default record reader for reading data from the user scripts.


hive.script.recordwriter
 org.apache.hadoop.hive.ql.exec.TextRecordWriter  Hive 0.5.0

 The default record writer for writing data to the user scripts.


hive.default.serde
 org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe  Hive 0.14 with HIVE-5976

The default SerDe Hive will use for storage formats that do not specify a SerDe.  Storage formats that currently do not specify a SerDe include 'TextFile, RcFile'.  

See Registration of Native SerDes for more information for storage formats and SerDes.


hive.lazysimple.extended_boolean_literal
 false   Hive 0.14 with HIVE-3635  LazySimpleSerDe uses this property to determine if it treats 'T', 't', 'F', 'f', '1', and '0' as extended, legal boolean literals, in addition to 'TRUE' and 'FALSE'. The default is false, which means only 'TRUE' and 'FALSE' are treated as legal boolean literals.

 

  2.2 I/O

 
Property Name    Default Value    Added In    Description
       

