[Hive] - Hive Parameters Explained

Reposted article, originally published 2015-11-02 17:33.

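Hive parameters fall into three classes. The first is system variables, which hold system-level settings (in Hive's variable substitution these are the JVM system properties). The second is env variables, the environment variables of the current user. The third is Hive's own configuration variables, defined in hive-site.xml and by the current Hive session. This third class is in turn made up of Hadoop HDFS parameters (taken directly from Hadoop), MapReduce parameters, metastore storage parameters, metastore connection parameters, and Hive runtime parameters.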
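As a rough illustration of the five sub-groups named above, the following Python sketch sorts parameter names by their prefix. The prefix-to-group mapping is an assumption made for this example: the article names the groups but does not say which names belong to which, and the helper `classify` is hypothetical. The mapping is deliberately coarse, so a name such as hive.metastore.uris lands in the Hive runtime group here.

```python
# Hypothetical prefix-to-group mapping. The article lists the groups but
# not the naming rule, so these prefixes are an assumption for illustration.
PREFIX_GROUPS = [
    ("dfs.", "Hadoop HDFS"),
    ("mapred.", "MapReduce"),
    ("mapreduce.", "MapReduce"),
    ("datanucleus.", "metastore storage"),
    ("javax.jdo.", "metastore connection"),
    ("hive.", "Hive runtime"),
]


def classify(name: str) -> str:
    """Return the group whose prefix matches the name, or 'other'."""
    for prefix, group in PREFIX_GROUPS:
        if name.startswith(prefix):
            return group
    return "other"


if __name__ == "__main__":
    # Names taken from the parameter table in this article.
    for name in [
        "dfs.block.access.key.update.interval",
        "datanucleus.cache.level2",
        "javax.jdo.option.ConnectionURL",
        "hive.exec.parallel",
    ]:
        print(f"{name}: {classify(name)}")
```

The first matching prefix wins, so more specific prefixes would have to be listed before `hive.` if a finer split were wanted.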

Hive-0.13.1-cdh5.3.6 parameter reference
Parameter	Default	Meaning (use)
datanucleus.autoCreateSchema	true	Creates the necessary schema on startup if one doesn't exist. Set this to false after the schema has been created once.
datanucleus.autoStartMechanismMode	checked	Throws an exception if the metadata tables are incorrect. Allowed values: checked, unchecked.
datanucleus.cache.level2	false	Whether to use a level 2 cache. Turn this off if metadata is changed independently of the Hive metastore server.
datanucleus.cache.level2.type	SOFT	Type of the level 2 cache: SOFT = soft-reference-based cache, WEAK = weak-reference-based cache, none = no cache.
datanucleus.connectionPoolingType	BoneCP	Connection pool used for the metastore database connections.
datanucleus.fixedDatastore	false
datanucleus.identifierFactory	datanucleus1	Name of the identifier factory to use when generating table/column names etc.
datanucleus.plugin.pluginRegistryBundleCheck	LOG	Defines what happens when plugin bundles are found and are duplicated [EXCEPTION|LOG|NONE]
datanucleus.rdbms.useLegacyNativeValueStrategy	true
datanucleus.storeManagerType	rdbms	Metadata store type.
datanucleus.transactionIsolation	read-committed	Default transaction isolation level for identity generation.
datanucleus.validateColumns	false	Validates the existing schema against the code. Turn this on if you want to verify the existing schema.
datanucleus.validateConstraints	false	Whether to validate the constraints of existing tables.
datanucleus.validateTables	false	Whether to validate existing tables.
dfs.block.access.key.update.interval	600
hive.archive.enabled	false	Whether archiving operations are permitted.
hive.auto.convert.join	true	Whether Hive enables the optimization that converts a common join into a map join based on the input file size.
hive.auto.convert.join.noconditionaltask	true	Whether Hive enables the optimization that converts a common join into a map join based on the input file size. If this parameter is on and the sum of the sizes of n-1 of the tables/partitions in an n-way join is smaller than the specified size, the join is converted directly to a map join (there is no conditional task).
hive.auto.convert.join.noconditionaltask.size	10000000	If hive.auto.convert.join.noconditionaltask is off, this parameter has no effect. If it is on and the sum of the sizes of n-1 of the tables/partitions in an n-way join is smaller than this size, the join is converted directly to a map join (there is no conditional task). The default is 10 MB.
hive.auto.convert.join.use.nonstaged	false	For conditional joins, if the input stream from a small alias can be applied directly to the join operator without filtering or projection, the alias need not be pre-staged in the distributed cache via a mapred local task. Currently this does not work with vectorization or the Tez execution engine.
hive.auto.convert.sortmerge.join	false	Whether the join is automatically converted to a sort-merge join if the joined tables pass the criteria for a sort-merge join.
hive.auto.convert.sortmerge.join.bigtable.selection.policy	org.apache.hadoop.hive.ql.optimizer.AvgPartitionSizeBasedBigTableSelectorForAutoSMJ
hive.auto.convert.sortmerge.join.to.mapjoin	false	Whether to try converting a sort-merge join to a map join.
hive.auto.progress.timeout	0	How long to run the autoprogressor for the script/UDTF operators, in seconds. Set to 0 for forever.
hive.autogen.columnalias.prefix.includefuncname	false	Whether the column aliases Hive generates automatically include the function name. Off by default.
hive.autogen.columnalias.prefix.label	_c	Prefix of the column aliases Hive generates automatically.
hive.binary.record.max.length	1000	Maximum length of a Hive binary record.
hive.cache.expr.evaluation	true	If true, the evaluation result of a deterministic expression referenced twice or more is cached. For example, in a filter condition like ".. where key + 10 > 10 or key + 10 = 0", "key + 10" is evaluated and cached once, then reused for the following expression ("key + 10 = 0"). Currently this applies only to expressions in select or filter operators.
hive.cli.errors.ignore	false
hive.cli.pretty.output.num.cols	-1
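hive.cli.print.current.db	false	Whether to show the name of the current database. Off by default.
hive.cli.print.header	false	Whether to print the header of query results, such as the column names. Off by default.
hive.cli.prompt	hive	Prefix of the CLI prompt. The client must be restarted after changing it.
hive.cluster.delegation.token.store.class	org.apache.hadoop.hive.thrift.MemoryTokenStore	Class that stores the Hive cluster delegation tokens.
hive.cluster.delegation.token.store.zookeeper.znode	/hive/cluster/delegation	ZooKeeper znode under which the delegation tokens are stored.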
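hive.compactor.abortedtxn.threshold	1000	Number of aborted transactions involving a partition that triggers a compaction.
hive.compactor.check.interval	300	Interval between checks for whether a compaction is needed, in seconds.
hive.compactor.delta.num.threshold	10	Number of delta directories in a table or partition that triggers a minor compaction.
hive.compactor.delta.pct.threshold	0.1	Ratio of delta file size to base size that triggers a major compaction.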
hive.compactor.initiator.on	false
hive.compactor.worker.threads	0
hive.compactor.worker.timeout	86400	In seconds.
hive.compat	0.12	Compatibility version.
hive.compute.query.using.stats	false
hive.compute.splits.in.am	true
hive.conf.restricted.list	hive.security.authenticator.manager,hive.security.authorization.manager
hive.conf.validation	true
hive.convert.join.bucket.mapjoin.tez	false
hive.counters.group.name	HIVE
hive.debug.localtask	false
hive.decode.partition.name	false
hive.default.fileformat	TextFile	Default file format. The default is TextFile.
hive.default.rcfile.serde	org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe	SerDe class used for RCFile.
hive.default.serde	org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe	Default SerDe class.
hive.display.partition.cols.separately	true	Whether partition columns are displayed separately from the other columns.
hive.downloaded.resources.dir	/tmp/${hive.session.id}_resources	Directory where Hive stores downloaded resources.
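hive.enforce.bucketing	false	Whether bucketing is enforced when inserting into a bucketed table.
hive.enforce.bucketmapjoin	false	Whether a bucketed map join is enforced: if one was requested and cannot be performed, the query fails.
hive.enforce.sorting	false	Whether sorting is enforced when inserting into a table.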
hive.enforce.sortmergebucketmapjoin	false
hive.entity.capture.transform	false
hive.entity.separator	@	Separator used to construct names of tables and partitions. For example, dbname@tablename@partitionname
hive.error.on.empty.partition	false	Whether to throw an exception if a dynamic partition insert generates empty results.
hive.exec.check.crossproducts	true	Whether to check queries for cross products (Cartesian products).
hive.exec.compress.intermediate	false	Whether intermediate results are compressed. The compression settings come from the Hadoop configuration mapred.output.compress*.
hive.exec.compress.output	false	Whether the final output is compressed.
hive.exec.concatenate.check.index	true
hive.exec.copyfile.maxsize	33554432
hive.exec.counters.pull.interval	1000
hive.exec.default.partition.name	__HIVE_DEFAULT_PARTITION__
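hive.exec.drop.ignorenonexistent	true	Whether a drop ignores the error raised when the object does not exist. On by default; if turned off, the drop reports an error.
hive.exec.dynamic.partition	true	Whether dynamic partitions are allowed. If so, partition values need not be specified when writing data.
hive.exec.dynamic.partition.mode	strict	Dynamic partition mode. strict requires at least one static partition value; nonstrict allows every partition to be dynamic.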
hive.exec.infer.bucket.sort	false
hive.exec.infer.bucket.sort.num.buckets.power.two	false
hive.exec.job.debug.capture.stacktraces	true
hive.exec.job.debug.timeout	30000
hive.exec.local.scratchdir	/tmp/hadoop
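hive.exec.max.created.files	100000	Maximum number of HDFS files an MR job may create.
hive.exec.max.dynamic.partitions	1000	Maximum total number of dynamic partitions that may be created.
hive.exec.max.dynamic.partitions.pernode	100	Maximum number of dynamic partitions that may be created on each MR node.
hive.exec.mode.local.auto	false	Whether Hive is allowed to run in local mode.
hive.exec.mode.local.auto.input.files.max	4	Maximum number of input files for local mode.
hive.exec.mode.local.auto.inputbytes.max	134217728	Maximum input size in bytes for local mode.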
hive.exec.orc.default.block.padding	true
hive.exec.orc.default.buffer.size	262144
hive.exec.orc.default.compress	ZLIB
hive.exec.orc.default.row.index.stride	10000
hive.exec.orc.default.stripe.size	268435456
hive.exec.orc.dictionary.key.size.threshold	0.8
hive.exec.orc.memory.pool	0.5
hive.exec.orc.skip.corrupt.data	false
hive.exec.orc.zerocopy	false
hive.exec.parallel	false	Whether parallel execution is allowed. Off by default.
hive.exec.parallel.thread.number	8	Number of threads for parallel execution. The default is 8.
hive.exec.perf.logger	org.apache.hadoop.hive.ql.log.PerfLogger
hive.exec.rcfile.use.explicit.header	true
hive.exec.rcfile.use.sync.cache	true
hive.exec.reducers.bytes.per.reducer	1000000000	Size of the data handled per reducer. The default is 1G, i.e. if the input size is 10G, 10 reducers are used.
hive.exec.reducers.max	999	Maximum number of reducers allowed. Takes effect when mapred.reduce.tasks is set to a negative value.
hive.exec.rowoffset	false
hive.exec.scratchdir	/etc/hive-hadoop
hive.exec.script.allow.partial.consumption	false
hive.exec.script.maxerrsize	100000
hive.exec.script.trust	false
hive.exec.show.job.failure.debug.info	true
hive.exec.stagingdir	.hive-staging
hive.exec.submitviachild	false
hive.exec.tasklog.debug.timeou	20000
hive.execution.engine	mr	Execution engine: mr or Tez (Hadoop 2).
hive.exim.uri.scheme.whitelist	hdfs,pfile
hive.explain.dependency.append.tasktype	false
hive.fetch.output.serde	org.apache.hadoop.hive.serde2.DelimitedJSONSerDe
hive.fetch.task.aggr	false
hive.fetch.task.conversion	minimal
hive.fetch.task.conversion.threshold	-1
hive.file.max.footer	100
hive.fileformat.check	true
hive.groupby.mapaggr.checkinterval	100000
hive.groupby.orderby.position.alias	false
hive.groupby.skewindata	false
hive.hadoop.supports.splittable.combineinputformat	false
hive.hashtable.initialCapacity	100000
hive.hashtable.loadfactor	0.75
hive.hbase.generatehfiles	false
hive.hbase.snapshot.restoredir	/tmp
hive.hbase.wal.enabled	true
hive.heartbeat.interval	1000
hive.hmshandler.force.reload.conf	false
hive.hmshandler.retry.attempts	1
hive.hmshandler.retry.interval	1000
hive.hwi.listen.host	0.0.0.0
hive.hwi.listen.port	9999
hive.hwi.war.file	lib/hive-hwi-${version}.war
hive.ignore.mapjoin.hint	true
hive.in.test	false
hive.index.compact.binary.search	true
hive.index.compact.file.ignore.hdfs	false
hive.index.compact.query.max.entries	10000000
hive.index.compact.query.max.size	10737418240
hive.input.format	org.apache.hadoop.hive.ql.io.CombineHiveInputFormat
hive.insert.into.external.tables	true
hive.insert.into.multilevel.dirs	false
hive.jobname.length	50
hive.join.cache.size	25000
hive.join.emit.interval	1000
hive.lazysimple.extended_boolean_literal	false
hive.limit.optimize.enable	false
hive.limit.optimize.fetch.max	50000
hive.limit.optimize.limit.file	10
hive.limit.pushdown.memory.usage	-1.0
hive.limit.query.max.table.partition	-1
hive.limit.row.max.size	100000
hive.localize.resource.num.wait.attempts	5
hive.localize.resource.wait.interval	5000
hive.lock.manager	org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager
hive.mapred.partitioner	org.apache.hadoop.hive.ql.io.DefaultHivePartitioner
hive.mapred.reduce.tasks.speculative.execution	true
hive.mapred.supports.subdirectories	false
hive.metastore.uris	thrift://hh:9083
hive.metastore.warehouse.dir	/user/hive/warehouse
hive.multi.insert.move.tasks.share.dependencies	false
hive.multigroupby.singlereducer	true
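hive.zookeeper.clean.extra.nodes	false	Whether to clean up extra nodes at the end of the session.
hive.zookeeper.client.port	2181	Client port.
hive.zookeeper.quorum		Addresses of the ZooKeeper servers.
hive.zookeeper.session.timeout	600000	ZooKeeper client session timeout, in milliseconds.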
hive.zookeeper.namespace	hive_zookeeper_namespace
javax.jdo.PersistenceManagerFactoryClass	org.datanucleus.api.jdo.JDOPersistenceManagerFactory
javax.jdo.option.ConnectionDriverName	Changed to: com.mysql.jdbc.Driver
javax.jdo.option.ConnectionPassword	Changed to: hive
javax.jdo.option.ConnectionURL	xxx
javax.jdo.option.ConnectionUserName	xxx
javax.jdo.option.DetachAllOnCommit	true
javax.jdo.option.Multithreaded	true
javax.jdo.option.NonTransactionalRead	true
