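Symptom: a routine inspection found that the 100 GB /u01/ogg filesystem was 97% full.
1. Immediately truncate several large rpt process report files, which frees part of the space.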
$ cd /u01/ogg/dirrpt
$ ls -lrt
$ > xx.rpt
$ more Rxx.rpt
Operating System Version:
Linux
Version #1 SMP Tue Feb 26 12:53:17 EST 2018, Release 2.6.32-696.el6.x86_64
Node: dsapdb21
Machine: x86_64
                     soft limit   hard limit
Address Space Size :  unlimited    unlimited
Heap Size          :  unlimited    unlimited
File Size          :  unlimited    unlimited
CPU Time           :  unlimited    unlimited
Process id: 656457
······
Truncating the rpt reports is an emergency measure only.
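The "> xx.rpt" redirection truncates the file in place instead of unlinking it, so the blocks are freed even if a process still has the report open. A minimal sketch of applying the same step to every oversized report is shown here. The function name truncate_large_reports and the size threshold are illustrative assumptions, not part of OGG, and the find -size +Nk syntax assumes GNU find as shipped on Linux.

```shell
# Hypothetical helper: truncate (not delete) every *.rpt file in DIR
# that is larger than MIN_MB megabytes.
# Usage: truncate_large_reports DIR MIN_MB
truncate_large_reports() {
    dir=$1
    min_mb=$2
    # -size +Nk matches files larger than N KB (size rounded up to 1 KB units)
    find "$dir" -type f -name '*.rpt' -size +"$((min_mb * 1024))"k |
    while IFS= read -r f; do
        echo "truncating $f"
        : > "$f"        # same effect as "> xx.rpt": length becomes 0, inode is kept
    done
}
```

For example, truncate_large_reports /u01/ogg/dirrpt 500 would empty only the reports above 500 MB. Checking the tail of each report first is advisable, since the contents are lost.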
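2. Find out which files are using the space.
Comparing the Used figure reported by df -h for /u01/ogg with the total reported by du -sm showed that about 50 GB was unaccounted for.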
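The df-versus-du comparison can be scripted so the gap is printed directly. This is a sketch under assumptions: the helper name gap_mb and the OGG_MOUNT variable are made up for illustration, and the mount point defaults to the /u01/ogg path from this case.

```shell
# Hypothetical helper: difference between two sizes given in MB.
# Usage: gap_mb DF_USED_MB DU_TOTAL_MB
gap_mb() {
    echo $(( $1 - $2 ))
}

mount_point="${OGG_MOUNT:-/u01/ogg}"   # assumed default; override via OGG_MOUNT

if [ -d "$mount_point" ]; then
    # df -Pk: POSIX single-line format in 1 KB blocks; column 3 is "Used"
    used_kb=$(df -Pk "$mount_point" | awk 'NR==2 {print $3}')
    # du -sk: total KB of files that are still reachable by name
    du_kb=$(du -sk "$mount_point" 2>/dev/null | awk '{print $1}')
    used_mb=$(( used_kb / 1024 ))
    du_mb=$(( du_kb / 1024 ))
    echo "df used: ${used_mb} MB, du total: ${du_mb} MB, unaccounted: $(gap_mb "$used_mb" "$du_mb") MB"
fi
```

A large positive gap usually points at files that were unlinked while a process still had them open, because du can only count files that still have a name.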
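Running
$ lsof|grep deleted >> lsof_deleted_20200508.log
turned up a very large number of entries: files that had been deleted but were still held open, so their space could not be released.
A quick look with more: the oracle and ocssd.bin entries can be ignored for now. The main problem is the replicat processes, which hold a very large number of deleted files open.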
ohasd.bin
gpnpd.bin
osysmond.
ocssd.bin
ocssd.bin
oracle /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_vmb0_310782.trc (deleted)
replicat
replicat
replicat
replicat
replicat
replicat 520060 oracle 81w REG 251,23553 9673428 37733 /u01/ogg/adapter/ndslogs/NdsJdbcTrace.log (deleted)
replicat 520060 oracle 85w REG 251,23553 5397502500 39821 /u01/ogg/adapter/ndslogs/sgcc.nds.jdbc.driver.NdsConnection@xxx.log (deleted)
replicat 650196 oracle 81w REG 251,23553 9673428 37733 /u01/ogg/adapter/ndslogs/NdsJdbcTrace.log (deleted)
replicat 650196 oracle 85w REG 251,23553 37155915949 34364 /u01/ogg/adapter/ndslogs/sgcc.nds.jdbc.driver.NdsConnection@xxx.log (deleted)
# ps -ef|grep 650196
oracle 650196 78557 3 May07 ? 01:03:07 /u01/ogg/replicat PARAMFILE /u01/ogg/dirprm/a.prm REPORTFILE a.rpt PROCESSID a USESUBDIRS
# ps -ef|grep 520060
oracle 520060 1 36 Apr30 ? 2-22:30:12 /u01/ogg/replicat PARAMFILE /u01/ogg/dirprm/b.prm REPORTFILE b.rpt PROCESSID b USESUBDIRS
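To rank the processes by how much deleted-but-open space they hold, the lsof lines can be summed per PID. The sketch below assumes the nine-column lsof layout seen in the listing above (COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME); newer lsof versions that print an extra TID column for threads would need the field numbers adjusted. The function name summarize_deleted is hypothetical.

```shell
# Hypothetical helper: read lsof output on stdin and print
# "TOTAL_BYTES PID COMMAND" for deleted regular files, largest first.
summarize_deleted() {
    awk '$NF == "(deleted)" && $5 == "REG" {
             # key on PID and command name; $7 is SIZE/OFF in bytes
             bytes[$2 " " $1] += $7
         }
         END {
             # %.0f avoids integer overflow for sizes above 2 GB
             for (k in bytes) printf "%.0f %s\n", bytes[k], k
         }' | sort -rn
}
```

Typical use would be lsof 2>/dev/null | summarize_deleted, or summarize_deleted < lsof_deleted_20200508.log against the saved file. Note that a file held by two processes, or through two descriptors, is counted once per entry, so the totals are an upper bound.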
$ogg >info all -- all processes show as running normally
$ogg>info * -- record each process's RBA as a backup before stopping
$ogg>stop R*
$ lsof|grep deleted
(no output) -- all the deleted entries are gone
GGSCI > start r* -- restart the ogg processes; problem solved
$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/asm/acfsvol-99999999 100G 25G 76G 25% /u01/ogg
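The kind of inspection check that first flagged this filesystem can be sketched as a filter over df output. The function name over_threshold and the limit value are illustrative assumptions; the column positions assume df -P (POSIX) layout and mount points without spaces.

```shell
# Hypothetical helper: read df -P style output on stdin and print
# "MOUNTPOINT PCT%" for every filesystem whose usage exceeds LIMIT percent.
# Usage: df -P | over_threshold 90
over_threshold() {
    awk -v limit="$1" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)            # strip the trailing percent sign
        if (pct + 0 > limit + 0) print $6 " " pct "%"
    }'
}
```

Run from cron with a mail or alerting wrapper, a check like this would report /u01/ogg at 97% and stay silent after the restart brought it back to 25%.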
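Environment: Redhat 6.9, ogg Version 12.2.0.1.160823. The replicat (REP) apply processes delete a large number of log files, but the space is not released because the ogg processes still hold those files open. Why the processes keep them open internally could not be determined at this time.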