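A customer reported that creating a standby database with the RMAN duplicate command failed on the standby side, which is a RAC environment.
The errors reported were ORA-19504, ORA-17502, ORA-15001 and ORA-27140.
The steps that were run: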
    [oracle@racddb001g ~]$ export ORACLE_SID=tmt011
    [oracle@racddb001g ~]$
    [oracle@racddb001g ~]$ sqlplus / as sysdba

    SQL*Plus: Release 12.2.0.1.0 Production on Fri April 4 02:26:18 2021

    Copyright (c) 1982, 2016, Oracle. All rights reserved.

    Connected to:
    Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

    SQL> shutdown immediate
    ORA-01507: database is not mounted

    The ORACLE instance has been shut down.
    SQL> startup nomount pfile='/media/dg/stby_inittmt01.ora'
    The ORACLE instance has started.

    Total System Global Area 1593835520 bytes
    Fixed Size                  8421136 bytes
    Variable Size             453985776 bytes
    Database Buffers          848860800 bytes
    Redo Buffers              294367808 bytes
    SQL> exit

    [oracle@racddb001g ~]$ export NLS_DATE_FORMAT='yyyy/mm/dd hh24:mi:ss'
    [oracle@racddb001g ~]$ rman target 'sys@tmt01H' auxiliary /

    RMAN> duplicate target database for standby dorecover nofilenamecheck;

    Duplicate Db started at 2021/04/04 02:28:29
    Channel: ORA_AUX_DISK_1 assigned
    Channel ORA_AUX_DISK_1: SID = 30 Instance = tmt011 Device Type = DISK
    The current log has been archived.

    Memory script content:
    {
       set until scn 5645034;
       restore clone standby controlfile;
    }
    Running a memory script

    Execution command: SET until clause

    restore is starting at 2021/04/02 02:28:40
    Use of channel ORA_AUX_DISK_1

    Channel ORA_AUX_DISK_1: Restoring control file
    ORA-19504: failed to create file "+DG001/tmt01d/CONTROLFILE/control01.ctl".
    ORA-17502: failed to create ksfdcre:3 file +DG001/tmt01d/CONTROLFILE/control01.ctl
    ORA-15001: diskgroup "DG001" does not exist or is not mounted
    ORA-27140: attach to post/wait facility failed
    Failover to previous backup

    Channel ORA_AUX_DISK_1: Restoring control file
    ORA-19504: failed to create file "+DG001/tmt01d/CONTROLFILE/control01.ctl".
    ORA-17502: failed to create ksfdcre:3 file +DG001/tmt01d/CONTROLFILE/control01.ctl
    ORA-15001: diskgroup "DG001" does not exist or is not mounted
    ORA-27140: attach to post/wait facility failed
    Failover to previous backup

    Channel ORA_AUX_DISK_1: Starting restore of datafile backup set
    Channel ORA_AUX_DISK_1: Restoring control file
    Channel ORA_AUX_DISK_1: reading from backup piece /my/oracle/dbhome_1/dbs/c-2060537070-20210326-01
    Channel ORA_AUX_DISK_1: ORA-19870: error restoring backup piece /my/oracle/dbhome_1/dbs/c-2060537070-20210326-01
    ORA-19504: failed to create file "+DG001/tmt01d/CONTROLFILE/control01.ctl".
    ORA-17502: failed to create ksfdcre:3 file +DG001/tmt01d/CONTROLFILE/control01.ctl
    Failover to previous backup

    Channel ORA_AUX_DISK_1: Starting restore of datafile backup set
    Channel ORA_AUX_DISK_1: Restoring control file
    Channel ORA_AUX_DISK_1: reading from backup piece /my/oracle/dbhome_1/dbs/c-2060827010-20210401-00
    Channel ORA_AUX_DISK_1: ORA-19870: error during restore of backup piece /my/oracle/dbhome_1/dbs/c-2060827010-20210401-00
    ORA-19504: failed to create file "+DG001/tmt01d/CONTROLFILE/control01.ctl".
    ORA-17502: failed to create ksfdcre:3 file +DG001/tmt01d/CONTROLFILE/control01.ctl
    Failover to previous backup

    Channel ORA_AUX_DISK_1: Starting restore of datafile backup set
    Channel ORA_AUX_DISK_1: Restoring control file
    Channel ORA_AUX_DISK_1: reading from backup piece /media/dg/backup_db_07stlrd3_1_1
    Channel ORA_AUX_DISK_1: ORA-19870: error restoring backup piece /media/dg/backup_db_07stlrd3_1_1
    ORA-19504: failed to create file "+DG001/tmt01d/CONTROLFILE/control01.ctl".
    ORA-17502: failed to create ksfdcre:3 file +DG001/tmt01d/CONTROLFILE/control01.ctl
    Failover to previous backup

    Channel ORA_AUX_DISK_1: Starting restore of datafile backup set
    Channel ORA_AUX_DISK_1: Restoring control file
    Channel ORA_AUX_DISK_1: reading from backup piece /media/dg/backup_db_07stlrd3_1_1
    Channel ORA_AUX_DISK_1: ORA-19870: error restoring backup piece /media/dg/backup_db_07stlrd3_1_1
    ORA-19504: failed to create file "+DG001/tmt01d/CONTROLFILE/control01.ctl".
    ORA-17502: failed to create ksfdcre:3 file +DG001/tmt01d/CONTROLFILE/control01.ctl
    Failover to previous backup

    Channel ORA_AUX_DISK_1: Starting restore of datafile backup set
    Channel ORA_AUX_DISK_1: Restoring control file
    Channel ORA_AUX_DISK_1: reading from backup piece /my/oracle/dbhome_1/dbs/c-2060537070-20210320-00
    Channel ORA_AUX_DISK_1: ORA-19870: error during restore of backup piece /my/oracle/dbhome_1/dbs/c-2060537070-20210320-00
    ORA-19505: failed to identify file "/my/oracle/dbhome_1/dbs/c-2060537070-20210320-00".
    ORA-27037: unable to get file status
    Failover to previous backup

    RMAN-00571: ===========================================================
    RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
    RMAN-00571: ===========================================================
    RMAN-03002: Duplicate Db command failed at 04/02/2021 02:45:39
    RMAN-05501: abort copy of target database
    RMAN-03015: error in stored script Memory Script
    RMAN-06026: missing target--stop restore
    RMAN-06024: cannot find backup or copy to restore control file

    RMAN>
The error messages show that the auxiliary instance created by the duplicate process failed to restore the control file into the disk group:

    Channel ORA_AUX_DISK_1: Restoring control file
    ORA-19504: failed to create file "+DG001/tmt01d/CONTROLFILE/control01.ctl".
    ORA-17502: failed to create ksfdcre:3 file +DG001/tmt01d/CONTROLFILE/control01.ctl
    ORA-15001: diskgroup "DG001" does not exist or is not mounted
    ORA-27140: attach to post/wait facility failed
    Failover to previous backup
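With output this long, it can help to list the distinct ORA- codes and how often each occurs before reading the log line by line. A minimal sketch, assuming the RMAN output has been saved to a file; the function name ora_codes and the file name in the usage line are made up for illustration:

```shell
#!/usr/bin/env bash
# Print every distinct ORA-nnnnn code found in a log, preceded by its count.
# Reads the file named by $1, or standard input when no argument is given.
ora_codes() {
    grep -oE 'ORA-[0-9]{5}' "${1:--}" | sort | uniq -c
}
```

For example, ora_codes duplicate.log prints one line per error code, sorted by code, so the codes that only appear in the first restore attempts (here ORA-15001 and ORA-27140) stand out.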
First, check whether the oracle user has read/write access to the disk group:

    SQL> select NAME,STATE from v$asm_diskgroup;

    NAME
    ------------------
    STATE
    ------------------
    DG001
    MOUNTED

    SQL> select name,PATH from v$asm_disk;

    NAME
    -------------------------------------
    PATH
    -------------------------------------
    DG001_0000
    /dev/mapper/ora01
Looking at the physical disk behind the disk group shows that it maps to the /dev/dm-11 device.

    $ ls -l /dev/mapper/ora01
    lrwxrwxrwx 1 root root 8 Mar 27 00:52 /dev/mapper/ora01 -> ../dm-11

The owner and group of that device are grid:asmadmin.

    $ ls -l /dev/dm-*
    brw-rw---- 1 grid asmadmin 253, 10 Mar 27 00:58 /dev/dm-10
    brw-rw---- 1 grid asmadmin 253, 11 Mar 27 00:58 /dev/dm-11
Check the groups of the grid user:

    # su - grid
    $ id
    uid=10000(grid) gid=11000(oinstall) groups=11000(oinstall),11002(asmadmin),11003(asmdba),11004(asmoper)

Then check the groups of the oracle user:

    # su - oracle
    $ id
    uid=10001(oracle) gid=11000(oinstall) groups=11000(oinstall),11001(dba),11003(asmdba),11005(racdba),11006(backupdba),11007(dgdba),11008(kmdba),11009(oper)
The oracle user is not in the asmadmin group at all. That could be the reason it cannot access the disk group.
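This group comparison can be scripted. A minimal sketch, assuming the text passed in has the usual uid=... gid=... groups=... shape that id prints; the function name id_lists_group is made up for illustration:

```shell
#!/usr/bin/env bash
# Succeed (exit status 0) when the `id` output in $1 lists group name $2.
# Only the part after "groups=" is inspected, so the user name in uid=...(name)
# is never mistaken for a group.
id_lists_group() {
    printf '%s\n' "${1#*groups=}" \
        | grep -oE '\([^)]+\)' \
        | tr -d '()' \
        | grep -qx -- "$2"
}
```

Called as id_lists_group "$(id oracle)" asmadmin, it returns non-zero for the oracle user shown above and zero for the same check against the grid user.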
We asked the customer to add the oracle user to asmadmin as well, but running duplicate again made no difference:

    usermod -a -G asmadmin oracle
So another cause has to be considered: whether the oracle executable has the permissions it needs to run as the grid user.

    <primary>
    # su - oracle
    $ ls -l $ORACLE_HOME/bin/oracle
    -rwsr-s--x 1 oracle asmadmin 408674152 Mar 19 00:22 /my/oracle/dbhome_1/bin/oracle
    [root@rachdb001g ~]# su - grid
    Last login: 2021/04/01 (Thu) 16:29:18 JST
    [grid@rachdb001g ~]$ ls -l $ORACLE_HOME/bin/oracle
    -rwsr-s--x 1 grid oinstall 373409344 Mar 18 03:05 /opt/oracle/grid/12.2.0/grid/bin/oracle
    [grid@rachdb001g ~]$

    <standby>
    # su - oracle
    $ ls -l $ORACLE_HOME/bin/oracle
    -rwsr-s--x 1 oracle asmadmin 408674152 Mar 24 22:19 /my/oracle/dbhome_1/bin/oracle
    # su - grid
    $ ls -l $ORACLE_HOME/bin/oracle
    -rwxr-x--x 1 grid oinstall 373409344 Mar 20 01:56 /opt/oracle/grid/12.2.0/grid/bin/oracle
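As the output shows, the permissions on $ORACLE_HOME/bin/oracle for the grid user differ between the primary and the standby.
On the primary it is -rwsr-s--x; on the standby it is -rwxr-x--x, with the setuid and setgid bits missing.
The ownership and mode on the standby need to be reset (as root):

    chown grid:oinstall $GI_HOME/bin/oracle
    chmod 6751 $GI_HOME/bin/oracle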
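In mode 6751 the leading 6 is setuid (4) plus setgid (2), and 751 is rwxr-x--x, which ls displays as -rwsr-s--x. Whether a binary actually carries both bits can be checked before and after the change. A minimal sketch; the function name has_suid_sgid is made up for illustration:

```shell
#!/usr/bin/env bash
# Succeed (exit status 0) when file $1 has both the setuid and the setgid bit set.
# `[ -u FILE ]` tests the setuid bit and `[ -g FILE ]` tests the setgid bit.
has_suid_sgid() {
    [ -u "$1" ] && [ -g "$1" ]
}
```

For example, has_suid_sgid "$GI_HOME/bin/oracle" || echo "setuid/setgid missing" would have flagged the standby binary shown above.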
After restarting the auxiliary instance on the standby side and running duplicate again, the command completed successfully.