1.簡單描述:
凌晨5點左右通過監控軟件發現嚴重事件:ARCH: Archival stopped, error occurred. Will continue retrying
2.報錯信息:
數據庫警告日志顯示:
[oracle@ods1 ~]$ adrci
ADRCI: Release 11.2.0.4.0 - Production on Thu Mar 30 09:00:10 2017
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
ADR base = "/u01/app/oracle"
adrci> show alert
Choose the alert log from the following homes to view:
1: diag/rdbms/ods/ods1
2: diag/asm/user_oracle/host_1115430328_80
3: diag/diagtool/user_oracle/host_1115430328_80
4: diag/asmtool/user_oracle/host_1115430328_80
5: diag/clients/user_oracle/host_1115430328_80
6: diag/clients/user_oracle/host_1115430328_11
Q: to quit
Please select option: q
adrci> exit
[oracle@ods1 ~]$
[oracle@ods1 ~]$ tail /u01/app/oracle/diag/rdbms/ods/ods1/trace/alertxxx.log
Thu Mar 30 05:15:34 2017
ARCH: Archival stopped, error occurred. Will continue retrying
ORACLE Instance ods1 - Archival Error
ORA-16014: log 1 sequence# 8465 not archived, no available destinations
ORA-00312: online log 1 thread 1: '+DATA/ods/redo01.log'
Thu Mar 30 05:16:19 2017
minact-scn: useg scan erroring out with error e:12751
Thu Mar 30 05:16:29 2017
DIA0 detected that there is an archiving problem on the cluster. Several
processes including LGWR and at least one ARC process are hung.
Instance termination is not allowed. External intervention
is required.
Thu Mar 30 05:18:35 2017
DIA0 detected that there is an archiving problem on the cluster. Several
processes including LGWR and at least one ARC process are hung.
Instance termination is not allowed. External intervention
is required.
Thu Mar 30 05:20:34 2017
Unable to create archive log file '+BAK/ods/archivelog/1_8465_908665003.dbf'
ARC1: Error 19504 Creating archive log file to '+BAK/ods/archivelog/1_8465_908665003.dbf'
ARCH: Archival stopped, error occurred. Will continue retrying
ORACLE Instance ods1 - Archival Error
ORA-16038: log 1 sequence# 8465 cannot be archived
ORA-19504: failed to create file ""
ORA-00312: online log 1 thread 1: '+DATA/ods/redo01.log'
ARCH: Archival stopped, error occurred. Will continue retrying
3.故障處理步驟
SQL> select dest_name,status ,target ,archiver,destination from v$archive_dest;
DEST_NAME STATUS TARGET ARCHIVER DESTINATION
------------------------------ --------- ------- ---------- ------------------------------
LOG_ARCHIVE_DEST_1 INACTIVE PRIMARY ARCH +BAK/ods/archivelog/
LOG_ARCHIVE_DEST_2 INACTIVE PRIMARY ARCH
LOG_ARCHIVE_DEST_3 VALID STANDBY LGWR odsdg
LOG_ARCHIVE_DEST_4 INACTIVE PRIMARY ARCH
LOG_ARCHIVE_DEST_5 INACTIVE PRIMARY ARCH
LOG_ARCHIVE_DEST_6 INACTIVE PRIMARY ARCH
LOG_ARCHIVE_DEST_7 INACTIVE PRIMARY ARCH
LOG_ARCHIVE_DEST_8 INACTIVE PRIMARY ARCH
正常。並得知歸檔存放在ASM +BAK目錄下。
檢查ASM: BAK只剩1G
ASMCMD> lsdg
State Type Rebal Sector Block AU Total_MB Free_MB Req_mir_free_MB Usable_file_MB Offline_disks Voting_files Name
MOUNTED EXTERN N 512 4096 1048576 1048574 914 0 914 0 N ACFS_VOL/
MOUNTED EXTERN N 512 4096 1048576 1048574 1444 0 276225 0 N BAK/
MOUNTED NORMAL N 512 4096 1048576 6138 5212 2046 1583 0 Y CRS/
MOUNTED EXTERN N 512 4096 1048576 7340018 2058832 0 2058832 0 N DATA/
ASMCMD>
檢查BAK內容發現是過快增長的歸檔日志占用,詢問原因是:大表全表update的定時調度造成redo切換頻繁,歸檔日志是平時的2倍左右。
主庫手工釋放:
delete noprompt archivelog all completed before 'sysdate - 1';
備庫手工釋放:
select 'delete noprompt archivelog until sequence '||max(sequence#)||' thread '||thread#||';' from v$archived_log where first_time <=systimestamp-1 and applied='YES' and deleted='NO' group by thread#;