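Cause: Hive already held data, but the MySQL database backing the metastore later failed and the Hive metadata was lost. The table data on HDFS was not lost. After the tables were recreated, Hive showed no partitions and returned no data, so the partition metadata had to be repaired.
Solution: this can be fixed with the msck repair table xxxxx command: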
msck repair table table_name;
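About the msck command: MSCK REPAIR TABLE is mainly used when data written into the directories of a Hive partitioned table with hdfs dfs -put or the HDFS API cannot be queried from Hive.
Hive has a service called the metastore, which stores metadata such as database names, table names, and table partitions. Partitions that were not created through Hive statements such as INSERT are usually not registered in the metastore. When there are many such partitions, adding them one at a time with ALTER TABLE table_name ADD PARTITION is tedious. This is where MSCK REPAIR TABLE comes in: run it once, and Hive scans the table's files on HDFS and writes any partition information that is missing from the metastore into the metastore.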
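Conceptually, the repair is a set difference: the partition directories found under the table location, minus the partitions already registered in the metastore. The shell sketch below illustrates only that idea on two plain text lists. It is not how Hive implements MSCK, and the function name untracked_partitions and the sample file names are made up for illustration.

```shell
#!/bin/sh
# Conceptual sketch only, not Hive's implementation of MSCK.
# untracked_partitions HDFS_LIST METASTORE_LIST
#   HDFS_LIST:      file with one partition spec per line, taken from the
#                   directory names under the table location
#   METASTORE_LIST: file with one partition spec per line, as SHOW PARTITIONS
#                   would list them
# Prints the specs that exist on HDFS but are missing from the metastore.
untracked_partitions() {
    # -v invert the match, -x match whole lines, -F fixed strings,
    # -f read the patterns (the registered partitions) from a file
    grep -vxF -f "$2" "$1"
}

# Sample data mirroring the example in this post (hypothetical file names).
tmp=$(mktemp -d)
printf 'par=partition_1\npar=partition_2\n' > "$tmp/hdfs.txt"
printf 'par=partition_1\n' > "$tmp/metastore.txt"

untracked_partitions "$tmp/hdfs.txt" "$tmp/metastore.txt"   # prints par=partition_2
rm -rf "$tmp"
```

MSCK REPAIR TABLE does the discovery and the registration in one step, so a script like this is only useful for seeing what a repair would pick up.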
Example
First create a partitioned table, insert one row into one of its partitions, and then list the partitions:
CREATE TABLE repair_test (col_a STRING) PARTITIONED BY (par STRING);
INSERT INTO TABLE repair_test PARTITION(par="partition_1") VALUES ("test");
SHOW PARTITIONS repair_test;
The partition listing looks like this:
0: jdbc:hive2://localhost:10000> show partitions repair_test;
INFO : Compiling command(queryId=hive_20180810175151_5260f52e-10bb-4589-ad48-31ba72a81c21): show partitions repair_test
INFO : Semantic Analysis Completed
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:partition, type:string, comment:from deserializer)], properties:null)
INFO : Completed compiling command(queryId=hive_20180810175151_5260f52e-10bb-4589-ad48-31ba72a81c21); Time taken: 0.029 seconds
INFO : Executing command(queryId=hive_20180810175151_5260f52e-10bb-4589-ad48-31ba72a81c21): show partitions repair_test
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing command(queryId=hive_20180810175151_5260f52e-10bb-4589-ad48-31ba72a81c21); Time taken: 0.017 seconds
INFO : OK
+------------------+--+
|    partition     |
+------------------+--+
| par=partition_1  |
+------------------+--+
1 row selected (0.073 seconds)
0: jdbc:hive2://localhost:10000>
Next, create some data by hand with the hdfs put command:
[ericsson@h3cnamenode1 pcc]$ echo "123123" > test.txt
[ericsson@h3cnamenode1 pcc]$ hdfs dfs -mkdir -p /user/hive/warehouse/test.db/repair_test/par=partition_2/
[ericsson@h3cnamenode1 pcc]$ hdfs dfs -put -f test.txt /user/hive/warehouse/test.db/repair_test/par=partition_2/
[ericsson@h3cnamenode1 pcc]$ hdfs dfs -ls -R /user/hive/warehouse/test.db/repair_test
drwxrwxrwt - ericsson hive 0 2018-08-10 17:46 /user/hive/warehouse/test.db/repair_test/par=partition_1
drwxrwxrwt - ericsson hive 0 2018-08-10 17:46 /user/hive/warehouse/test.db/repair_test/par=partition_1/.hive-staging_hive_2018-08-10_17-45-59_029_1594310228554990949-1
drwxrwxrwt - ericsson hive 0 2018-08-10 17:46 /user/hive/warehouse/test.db/repair_test/par=partition_1/.hive-staging_hive_2018-08-10_17-45-59_029_1594310228554990949-1/-ext-10000
-rwxrwxrwt 3 ericsson hive 5 2018-08-10 17:46 /user/hive/warehouse/test.db/repair_test/par=partition_1/000000_0
drwxr-xr-x - ericsson hive 0 2018-08-10 17:57 /user/hive/warehouse/test.db/repair_test/par=partition_2
-rw-r--r-- 3 ericsson hive 7 2018-08-10 17:57 /user/hive/warehouse/test.db/repair_test/par=partition_2/test.txt
[ericsson@h3cnamenode1 pcc]$
Querying the partition information now shows that partition_2 has not been added to Hive:
0: jdbc:hive2://localhost:10000> show partitions repair_test;
INFO : Compiling command(queryId=hive_20180810175959_e7cefe8c-57b5-486c-8e03-b1201dac4d79): show partitions repair_test
INFO : Semantic Analysis Completed
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:partition, type:string, comment:from deserializer)], properties:null)
INFO : Completed compiling command(queryId=hive_20180810175959_e7cefe8c-57b5-486c-8e03-b1201dac4d79); Time taken: 0.029 seconds
INFO : Executing command(queryId=hive_20180810175959_e7cefe8c-57b5-486c-8e03-b1201dac4d79): show partitions repair_test
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing command(queryId=hive_20180810175959_e7cefe8c-57b5-486c-8e03-b1201dac4d79); Time taken: 0.02 seconds
INFO : OK
+------------------+--+
|    partition     |
+------------------+--+
| par=partition_1  |
+------------------+--+
1 row selected (0.079 seconds)
0: jdbc:hive2://localhost:10000>
After running MSCK REPAIR TABLE, query the partition information again. The partition uploaded with put can now be queried:
0: jdbc:hive2://localhost:10000> MSCK REPAIR TABLE repair_test;
INFO : Compiling command(queryId=hive_20180810180000_7099daf2-6fde-44dd-8938-d2a02589358f): MSCK REPAIR TABLE repair_test
INFO : Semantic Analysis Completed
INFO : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
INFO : Completed compiling command(queryId=hive_20180810180000_7099daf2-6fde-44dd-8938-d2a02589358f); Time taken: 0.004 seconds
INFO : Executing command(queryId=hive_20180810180000_7099daf2-6fde-44dd-8938-d2a02589358f): MSCK REPAIR TABLE repair_test
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing command(queryId=hive_20180810180000_7099daf2-6fde-44dd-8938-d2a02589358f); Time taken: 0.138 seconds
INFO : OK
No rows affected (0.154 seconds)
0: jdbc:hive2://localhost:10000> show partitions repair_test;
INFO : Compiling command(queryId=hive_20180810180000_ff711820-6f41-4d5d-9fee-b6e1cdbe1e25): show partitions repair_test
INFO : Semantic Analysis Completed
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:partition, type:string, comment:from deserializer)], properties:null)
INFO : Completed compiling command(queryId=hive_20180810180000_ff711820-6f41-4d5d-9fee-b6e1cdbe1e25); Time taken: 0.045 seconds
INFO : Executing command(queryId=hive_20180810180000_ff711820-6f41-4d5d-9fee-b6e1cdbe1e25): show partitions repair_test
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing command(queryId=hive_20180810180000_ff711820-6f41-4d5d-9fee-b6e1cdbe1e25); Time taken: 0.016 seconds
INFO : OK
+------------------+--+
|    partition     |
+------------------+--+
| par=partition_1  |
| par=partition_2  |
+------------------+--+
2 rows selected (0.088 seconds)
0: jdbc:hive2://localhost:10000> select * from repair_test;
INFO : Compiling command(queryId=hive_20180810180101_1225075e-43c8-4a49-b8ef-a12f72544a38): select * from repair_test
INFO : Semantic Analysis Completed
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:repair_test.col_a, type:string, comment:null), FieldSchema(name:repair_test.par, type:string, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20180810180101_1225075e-43c8-4a49-b8ef-a12f72544a38); Time taken: 0.059 seconds
INFO : Executing command(queryId=hive_20180810180101_1225075e-43c8-4a49-b8ef-a12f72544a38): select * from repair_test
INFO : Completed executing command(queryId=hive_20180810180101_1225075e-43c8-4a49-b8ef-a12f72544a38); Time taken: 0.001 seconds
INFO : OK
+--------------------+------------------+--+
| repair_test.col_a  | repair_test.par  |
+--------------------+------------------+--+
| test               | partition_1      |
| 123123             | partition_2      |
+--------------------+------------------+--+
2 rows selected (0.121 seconds)
0: jdbc:hive2://localhost:10000>
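Follow-up
Something more interesting happened later. Many people assumed that alter table drop partition could only remove the data of a single partition, so they deleted the HDFS files of Hive partitioned tables with hdfs dfs -rmr instead. This caused a problem: the files were gone from HDFS, but the partition metadata in the Hive metastore was not. show partitions table_name still listed those partitions, so the stale metadata had to be cleaned up.
I then checked whether MSCK REPAIR TABLE could remove the metadata of table partitions that no longer exist on HDFS. It could not. I looked it up in Jira and found Fix Version/s: 3.0.0, 2.4.0, 3.1.0, so only those Hive versions support this. Our Hive version is 1.1.0-cdh5.11.0, so this approach was not available to us.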
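On a Hive version without that fix, one possible workaround is to compute the opposite difference (partitions registered in the metastore but missing on HDFS) and drop each of them with ALTER TABLE ... DROP IF EXISTS PARTITION. The sketch below only generates the statements from two text lists so they can be reviewed before being run. The function name stale_partition_drops and the file names are hypothetical, and it assumes a single-level string partition whose values contain no quote characters.

```shell
#!/bin/sh
# Hedged sketch: generate DROP PARTITION statements for stale partitions.
# stale_partition_drops TABLE METASTORE_LIST HDFS_LIST
#   METASTORE_LIST: file with one partition spec per line (e.g. par=partition_2),
#                   as SHOW PARTITIONS would list them
#   HDFS_LIST:      file with one partition spec per line, taken from the
#                   directories that still exist under the table location
# Assumes single-level string partitions and a plain table name.
stale_partition_drops() {
    table=$1
    # Keep the registered partitions that have no matching directory,
    # then turn each name=value spec into an ALTER TABLE statement.
    grep -vxF -f "$3" "$2" |
        sed "s/^\([^=]*\)=\(.*\)$/ALTER TABLE ${table} DROP IF EXISTS PARTITION (\1='\2');/"
}

# Sample data (hypothetical): partition_2 was removed with hdfs dfs -rmr.
tmp=$(mktemp -d)
printf 'par=partition_1\npar=partition_2\n' > "$tmp/metastore.txt"
printf 'par=partition_1\n' > "$tmp/hdfs.txt"

stale_partition_drops repair_test "$tmp/metastore.txt" "$tmp/hdfs.txt"
# prints: ALTER TABLE repair_test DROP IF EXISTS PARTITION (par='partition_2');
rm -rf "$tmp"
```

Generating the statements instead of executing them leaves room to check the list first, which matters because a wrong HDFS listing would drop partitions that still have data.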
The relevant section of the official documentation follows:
Recover Partitions (MSCK REPAIR TABLE)
Hive stores a list of partitions for each table in its metastore. If, however, new partitions are directly added to HDFS (say by using hadoop fs -put command) or removed from HDFS, the metastore (and hence Hive) will not be aware of these changes to partition information unless the user runs ALTER TABLE table_name ADD/DROP PARTITION commands on each of the newly added or removed partitions, respectively.
However, users can run a metastore check command with the repair table option:
MSCK [REPAIR] TABLE table_name [ADD/DROP/SYNC PARTITIONS];
which will update metadata about partitions to the Hive metastore for partitions for which such metadata doesn't already exist. The default option for the MSCK command is ADD PARTITIONS. With this option, it will add any partitions that exist on HDFS but not in metastore to the metastore. The DROP PARTITIONS option will remove the partition information from metastore, that is already removed from HDFS. The SYNC PARTITIONS option is equivalent to calling both ADD and DROP PARTITIONS. See HIVE-874 and HIVE-17824 for more details.
When there is a large number of untracked partitions, there is a provision to run MSCK REPAIR TABLE batch wise to avoid OOME (Out of Memory Error). By giving the configured batch size for the property hive.msck.repair.batch.size it can run in the batches internally. The default value of the property is zero, which means it will execute all the partitions at once.
The MSCK command without the REPAIR option can be used to find details about metadata mismatches in the metastore.
The equivalent command on Amazon Elastic MapReduce (EMR)'s version of Hive is:
ALTER TABLE table_name RECOVER PARTITIONS;
Starting with Hive 1.3, MSCK will throw exceptions if directories with disallowed characters in partition values are found on HDFS. Use hive.msck.path.validation setting on the client to alter this behavior; "skip" will simply skip the directories. "ignore" will try to create partitions anyway (old behavior). This may or may not work.
HIVE-17824 adds support to hive msck repair for cleaning up metastore partition information for partitions that no longer exist on HDFS.
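As a rough illustration of how the options quoted above fit together, the shell sketch below prints a small HiveQL script that sets hive.msck.repair.batch.size and runs MSCK REPAIR TABLE with one of the ADD, DROP, or SYNC options. The function name msck_script and the batch size 1000 are arbitrary choices for this example, and the DROP and SYNC options only work on a Hive version that includes HIVE-17824.

```shell
#!/bin/sh
# Hedged sketch: print a HiveQL script for a batched MSCK repair.
# msck_script TABLE MODE BATCH_SIZE
#   MODE is one of ADD, DROP, SYNC (the options quoted from the Hive docs).
msck_script() {
    case "$2" in
        ADD|DROP|SYNC) ;;
        *) echo "mode must be ADD, DROP or SYNC" >&2; return 1 ;;
    esac
    # Batch size 0 (the default) means all partitions are processed at once.
    printf 'SET hive.msck.repair.batch.size=%s;\n' "$3"
    printf 'MSCK REPAIR TABLE %s %s PARTITIONS;\n' "$1" "$2"
}

msck_script repair_test SYNC 1000
# prints:
#   SET hive.msck.repair.batch.size=1000;
#   MSCK REPAIR TABLE repair_test SYNC PARTITIONS;
```

The output could be saved to a file and passed to beeline with -f. Printing the script instead of running it keeps the sketch usable on a machine without a Hive client.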
Reposted from: https://www.jianshu.com/p/c1b0dc86f9b0