Hadoop Stuck in Safe Mode (Taking Hadoop Out of Safe Mode)


Reposted from: "hiveserver2 won't start; Hadoop is stuck in safe mode" - -小鱼- - 博客园 (cnblogs.com)

Taking Hadoop out of safe mode

Safe mode is controlled through hdfs dfsadmin:

  hdfs dfsadmin -safemode get     # check safe-mode status
  hdfs dfsadmin -safemode enter   # enter safe mode
  hdfs dfsadmin -safemode leave   # leave safe mode
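For scripting, a minimal sketch that leaves safe mode only when it is actually on (the exact "Safe mode is ON" wording is what current HDFS releases print; confirm against your version):

  # Query safe-mode status; prints "Safe mode is ON" or "Safe mode is OFF"
  status=$(hdfs dfsadmin -safemode get)

  # Leave safe mode only if it is currently on
  if echo "$status" | grep -q "ON"; then
      hdfs dfsadmin -safemode leave
  fi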

Step 1: Check the Hadoop file system with hadoop fsck

Command: hadoop fsck

hadoop fsck

Usage: DFSck <path> [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks]]]
        <path>          check whether the files under this path are intact
        -move           move corrupted files to /lost+found
        -delete         delete corrupted files
        -openforwrite   print files currently open for write
        -files          print the names of the files being checked
        -blocks         print the block report (use together with -files)
        -locations      print every block's location (use together with -files)
        -racks          print the network topology of block locations (use together with -files)
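On recent Hadoop releases, fsck also accepts -list-corruptfileblocks, which prints just the corrupt files instead of a full report and is handy before deciding between repair and deletion (check hdfs fsck's help output on your version to confirm the flag):

  # Print only the corrupt blocks and the files they belong to
  hdfs fsck / -list-corruptfileblocks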

Observe the cluster state. The key lines in the fsck summary are CORRUPT FILES and MISSING BLOCKS:

CORRUPT FILES:  2    # two files are corrupted
MISSING BLOCKS: 2    # two blocks are missing

Status: CORRUPT
 Number of data-nodes:  1
 Number of racks:               1
 Total dirs:                    72
 Total symlinks:                0

Replicated Blocks:
 Total size:    574348338 B
 Total files:   76
 Total blocks (validated):      74 (avg. block size 7761464 B)
  ********************************
  UNDER MIN REPL'D BLOCKS:      35 (47.2973 %)
  MINIMAL BLOCK REPLICATION:    1
  CORRUPT FILES:        35
  MISSING BLOCKS:       35
  MISSING SIZE:         327840971 B
  ********************************
 Minimally replicated blocks:   39 (52.7027 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       38 (51.351353 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     0.527027
 Missing blocks:                35
 Corrupt blocks:                0
 Missing replicas:              160 (38.46154 %)

Erasure Coded Block Groups:
 Total size:    0 B
 Total files:   0
 Total block groups (validated):        0
 Minimally erasure-coded block groups:  0
 Over-erasure-coded block groups:       0
 Under-erasure-coded block groups:      0
 Unsatisfactory placement block groups: 0
 Average block group size:      0.0
 Missing block groups:          0
 Corrupt block groups:          0
 Missing internal blocks:       0
FSCK ended at Wed Jan 12 20:13:16 CST 2022 in 28 milliseconds

Step 2: Print the status of every file in HDFS

Command: hadoop fsck / -files -blocks -locations -racks

In the output, healthy files end with OK; lines containing MISSING mark the lost files.

/tmp/hadoop-yarn/staging/history/done_intermediate/root/job_1641978125772_0001_conf.xml_tmp 0 bytes, replicated: replication=3, 0 block(s): OK

/tmp/hadoop-yarn/staging/root/.staging/job_1641780921957_0001/job.split 315 bytes, replicated: replication=10, 1 block(s): MISSING 1 blocks of total size 315 B
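Because the per-file listing can run to thousands of lines, here is a small filter sketch (the MISSING/CORRUPT patterns match the markers shown above; adjust them to your actual output):

  # Keep only the entries flagged MISSING or CORRUPT
  hadoop fsck / -files -blocks -locations | grep -E 'MISSING|CORRUPT'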

Using this path, you can locate the corresponding file in the HDFS web UI (NameNode file browser).

If permission errors get in the way when inspecting or cleaning these staging files, you can open up /tmp (use with care; mode 777 is world-writable):

hadoop fs -chmod -R 777 /tmp

Step 3: Repair the two lost/corrupted files

 

[root@node03 conf]# hdfs debug recoverLease -path /flink-checkpoint/626ea65de810a2ec3b1799b605a6a995/chk-175/19195239-a205-4462-921d-09e0483a4080 -retries 10

[root@node03 conf]# hdfs debug recoverLease -path /flink-checkpoint/626ea65de810a2ec3b1799b605a6a995/chk-175/_metadata -retries 10
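When many paths are affected, a sketch that feeds every corrupt file reported by fsck into recoverLease; it assumes -list-corruptfileblocks prints lines of the form "blk_N<TAB>/path", so verify the output format on your version first:

  # Extract each corrupt file's path and attempt lease recovery on it
  hdfs fsck / -list-corruptfileblocks \
    | awk '/^blk_/ {print $2}' \
    | while read -r f; do
        hdfs debug recoverLease -path "$f" -retries 10
      done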

 

..Status: HEALTHY    # cluster state: healthy

Now, when Hadoop is restarted it no longer stays stuck in safe mode, and hiveserver2 starts normally as well.

Step 4: Unexpected situations

If the repair fails, or it reports success but the cluster state still looks like this:

.............Status: CORRUPT                    # cluster state: not healthy
 Total size:    273821489 B
 Total dirs:    403
 Total files:   213
 Total symlinks:        0
 Total blocks (validated):      201 (avg. block size 1362295 B)
  ********************************
  UNDER MIN REPL'D BLOCKS:      2 (0.99502486 %)
  dfs.namenode.replication.min: 1
  CORRUPT FILES:        2       # two files are corrupted
  MISSING BLOCKS:       2       # two blocks are missing
  MISSING SIZE:         6174 B
  CORRUPT BLOCKS:       2
  ********************************
 Minimally replicated blocks:   199 (99.004974 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     2.8208954
 Corrupt blocks:                2
 Missing replicas:              0 (0.0 %)
 Number of data-nodes:          3
 Number of racks:               1
FSCK ended at Fri Aug 23 10:43:11 CST 2019 in 12 milliseconds

1. If the corrupted files are not important

First, back up the corrupted files you found.

Then run hadoop fsck / -delete to remove them:

[root@node03 export]# hadoop fsck / -delete 

If it does not succeed on the first try, run it a few more times. Only do this if the lost/corrupted files really are unimportant!
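A sketch of the back-up-then-delete flow, using the job.split path found in Step 2 (the local directory /data/hdfs-backup is an illustrative name, and copying a file with missing blocks may itself fail partway):

  # Copy the affected file to local disk before removing it from HDFS
  mkdir -p /data/hdfs-backup
  hdfs dfs -copyToLocal /tmp/hadoop-yarn/staging/root/.staging/job_1641780921957_0001/job.split /data/hdfs-backup/

  # Delete every corrupted file so the NameNode can leave safe mode
  hadoop fsck / -delete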

2. If the corrupted files are important and must not be lost

You can first force the NameNode out of safe mode:

[root@node03 export]# hdfs dfsadmin -safemode leave

This does not actually fix the problem; it only lets the cluster work for the time being!

Moreover, you will have to run this command after every cluster start until the underlying problem is fully resolved.
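Until the root cause is fixed, you could wrap the startup like this (a sketch: start-dfs.sh is the stock HDFS startup script, and the loop simply waits until the NameNode answers before forcing safe mode off):

  # Start HDFS, wait for the NameNode to respond, then force-leave safe mode
  start-dfs.sh
  until hdfs dfsadmin -safemode get >/dev/null 2>&1; do
      sleep 5
  done
  hdfs dfsadmin -safemode leave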

If your problem is not one of the above, see this post instead:
https://www.cnblogs.com/-xiaoyu-/p/12158984.html

 

