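A customer reported that creating a standby database with the RMAN duplicate command failed on the standby side, which runs in a RAC environment.
The errors reported were ORA-19504, ORA-17502, ORA-15001, and ORA-27140.
The commands that were run and their output are shown below: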
[oracle@racddb001g ~]$ export ORACLE_SID=tmt011
[oracle@racddb001g ~]$
[oracle@racddb001g ~]$ sqlplus / as sysdba
SQL*Plus: Release 12.2.0.1.0 Production on Fri April 4 02:26:18 2021
Copyright (c) 1982, 2016, Oracle. All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production
SQL> shutdown immediate
ORA-01507: database is not mounted
The ORACLE instance has been shut down.
SQL> startup nomount pfile='/media/dg/stby_inittmt01.ora'
The ORACLE instance has started.
Total System Global Area 1593835520 bytes
Fixed Size 8421136 bytes
Variable Size 453985776 bytes
Database Buffers 848860800 bytes
Redo Buffers 294367808 bytes
SQL> exit
[oracle@racddb001g ~]$ export NLS_DATE_FORMAT='yyyy/mm/dd hh24:mi:ss'
[oracle@racddb001g ~]$ rman target 'sys@tmt01H' auxiliary /
RMAN> duplicate target database for standby dorecover nofilenamecheck;
Duplicate Db started at 2021/04/04 02:28:29
Channel: ORA_AUX_DISK_1 assigned
Channel ORA_AUX_DISK_1: SID = 30 Instance = tmt011 Device Type = DISK
The current log has been archived.
Memory script content:
{
set until scn 5645034;
restore clone standby controlfile;
}
Running a memory script
Execution command: SET until clause
restore is starting at 2021/04/02 02:28:40
Use of channel ORA_AUX_DISK_1
Channel ORA_AUX_DISK_1: Restoring control file
ORA-19504: failed to create file "+DG001/tmt01d/CONTROLFILE/control01.ctl".
ORA-17502: failed to create ksfdcre:3 file +DG001/tmt01d/CONTROLFILE/control01.ctl
ORA-15001: diskgroup "DG001" does not exist or is not mounted
ORA-27140: attach to post/wait facility failed
Failover to previous backup
Channel ORA_AUX_DISK_1: Restoring control file
ORA-19504: failed to create file "+DG001/tmt01d/CONTROLFILE/control01.ctl".
ORA-17502: failed to create ksfdcre: 3 file +DG001/tmt01d/CONTROLFILE/control01.ctl
ORA-15001: diskgroup "DG001" does not exist or is not mounted
ORA-27140: attach to post/wait facility failed
Failover to previous backup
Channel ORA_AUX_DISK_1: Starting restore of datafile backup set
Channel ORA_AUX_DISK_1: Restoring control file
Channel ORA_AUX_DISK_1: reading from backup piece /my/oracle/dbhome_1/dbs/c-2060537070-20210326-01
Channel ORA_AUX_DISK_1: ORA-19870: error restoring backup piece /my/oracle/dbhome_1/dbs/c-2060537070-20210326-01
ORA-19504: failed to create file "+DG001/tmt01d/CONTROLFILE/control01.ctl".
ORA-17502: failed to create ksfdcre: 3 file +DG001/tmt01d/CONTROLFILE/control01.ctl
Failover to previous backup
Channel ORA_AUX_DISK_1: Starting restore of datafile backup set
Channel ORA_AUX_DISK_1: Restoring control file
Channel ORA_AUX_DISK_1: reading from backup piece /my/oracle/dbhome_1/dbs/c-2060827010-20210401-00
Channel ORA_AUX_DISK_1: ORA-19870: error during restore of backup piece /my/oracle/dbhome_1/dbs/c-2060827010-20210401-00
ORA-19504: failed to create file "+DG001/tmt01d/CONTROLFILE/control01.ctl".
ORA-17502: failed to create ksfdcre: 3 file +DG001/tmt01d/CONTROLFILE/control01.ctl
Failover to previous backup
Channel ORA_AUX_DISK_1: Starting restore of datafile backup set
Channel ORA_AUX_DISK_1: Restoring control file
Channel ORA_AUX_DISK_1: reading from backup piece /media/dg/backup_db_07stlrd3_1_1
Channel ORA_AUX_DISK_1: ORA-19870: error restoring backup piece /media/dg/backup_db_07stlrd3_1_1
ORA-19504: failed to create file "+DG001/tmt01d/CONTROLFILE/control01.ctl".
ORA-17502: failed to create ksfdcre: 3 file +DG001/tmt01d/CONTROLFILE/control01.ctl
Failover to previous backup
Channel ORA_AUX_DISK_1: Starting restore of datafile backup set
Channel ORA_AUX_DISK_1: Restoring control file
Channel ORA_AUX_DISK_1: reading from backup piece /media/dg/backup_db_07stlrd3_1_1
Channel ORA_AUX_DISK_1: ORA-19870: error restoring backup piece /media/dg/backup_db_07stlrd3_1_1
ORA-19504: failed to create file "+DG001/tmt01d/CONTROLFILE/control01.ctl".
ORA-17502: failed to create ksfdcre: 3 file +DG001/tmt01d/CONTROLFILE/control01.ctl
Failover to previous backup
Channel ORA_AUX_DISK_1: Starting restore of datafile backup set
Channel ORA_AUX_DISK_1: Restoring control file
Channel ORA_AUX_DISK_1: reading from backup piece /my/oracle/dbhome_1/dbs/c-2060537070-20210320-00
Channel ORA_AUX_DISK_1: ORA-19870: error during restore of backup piece /my/oracle/dbhome_1/dbs/c-2060537070-20210320-00
ORA-19505: failed to identify file "/my/oracle/dbhome_1/dbs/c-2060537070-20210320-00".
ORA-27037: unable to get file status
Failover to previous backup
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 04/02/2021 02:45:39
RMAN-05501: aborting duplication of target database
RMAN-03015: error occurred in stored script Memory Script
RMAN-06026: some targets not found - aborting restore
RMAN-06024: no backup or copy of the control file found to restore
RMAN>
Judging from the errors above, the duplicate run failed while restoring the control file to the disk group for the auxiliary instance:
Channel ORA_AUX_DISK_1: Restoring control file
ORA-19504: failed to create file "+DG001/tmt01d/CONTROLFILE/control01.ctl".
ORA-17502: failed to create ksfdcre:3 file +DG001/tmt01d/CONTROLFILE/control01.ctl
ORA-15001: diskgroup "DG001" does not exist or is not mounted
ORA-27140: attach to post/wait facility failed
Failover to previous backup
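When the RMAN output is as repetitive as the log above, it can help to reduce it to the distinct ORA- codes before reading further. The following is a minimal sketch, not part of the original investigation: the helper name distinct_ora_codes is hypothetical, and it assumes a grep that supports the -o option (GNU or BSD grep).

```shell
#!/bin/sh
# Hypothetical helper: print each distinct ORA- code found on stdin, sorted.
# Assumes grep supports -o (GNU/BSD grep); this is not a POSIX option.
distinct_ora_codes() {
    grep -oE 'ORA-[0-9]+' | sort -u
}

# Sample input: the four error lines quoted above from the RMAN output.
distinct_ora_codes <<'EOF'
ORA-19504: failed to create file "+DG001/tmt01d/CONTROLFILE/control01.ctl".
ORA-17502: failed to create ksfdcre:3 file +DG001/tmt01d/CONTROLFILE/control01.ctl
ORA-15001: diskgroup "DG001" does not exist or is not mounted
ORA-27140: attach to post/wait facility failed
EOF
```

Against a saved RMAN log, the same function could be fed with a redirect from the log file instead of the here-document.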
First, check whether the oracle user has read/write access to the disk group:
SQL> select NAME,STATE from v$asm_diskgroup;
NAME
------------------
STATE
------------------
DG001
MOUNTED
SQL> select name,PATH from v$asm_disk;
NAME
-------------------------------------
PATH
-------------------------------------
DG001_0000
/dev/mapper/ora01
Looking at the physical disk behind the disk group shows that it maps to the /dev/dm-11 device.
$ ls -l /dev/mapper/ora01
lrwxrwxrwx 1 root root 8 Mar 27 00:52 /dev/mapper/ora01 -> ../dm-11
The owner and group of that device are grid:asmadmin.
$ ls -l /dev/dm-*
brw-rw---- 1 grid asmadmin 253, 10 Mar 27 00:58 /dev/dm-10
brw-rw---- 1 grid asmadmin 253, 11 Mar 27 00:58 /dev/dm-11
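The two ls steps above (follow the symlink, then read the owner and group of the target) can be combined into one call. This is a rough sketch under stated assumptions: device_owner is a hypothetical helper name, and it relies on GNU coreutils (readlink -f and stat -c). The demo uses a throwaway symlink so that it runs without any ASM devices.

```shell
#!/bin/sh
# Hypothetical helper: resolve a path through symlinks and print
# "<resolved target> <owner>:<group>". Assumes GNU readlink and stat.
device_owner() {
    target=$(readlink -f "$1") || return 1
    printf '%s %s\n' "$target" "$(stat -c '%U:%G' "$target")"
}

# Demo on a temporary symlink that mimics /dev/mapper/ora01 -> dm-11.
demo_dir=$(mktemp -d)
touch "$demo_dir/dm-11"
ln -s dm-11 "$demo_dir/ora01"
device_owner "$demo_dir/ora01"
rm -rf "$demo_dir"

# On the standby host the call would be: device_owner /dev/mapper/ora01
```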
Check the groups of the grid user:
# su - grid
$ id
uid=10000(grid) gid=11000(oinstall) groups=11000(oinstall),11002(asmadmin),11003(asmdba),11004(asmoper)
Then check the groups of the oracle user:
# su - oracle
$ id
uid=10001(oracle) gid=11000(oinstall) groups=11000(oinstall),11001(dba),11003(asmdba),11005(racdba),11006(backupdba),11007(dgdba),11008(kmdba),11009(oper)
The oracle user is not in the asmadmin group at all. That could be why it cannot access the disk group.
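The membership check done by eye above can be scripted. The sketch below is illustrative only: in_group_list is a hypothetical helper, and the group list is copied from the id output for the oracle user rather than read from a live system.

```shell
#!/bin/sh
# Hypothetical helper: succeed if group $1 appears in the
# space-separated list of group names in $2.
in_group_list() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Group names taken from the id output shown above for the oracle user.
oracle_groups="oinstall dba asmdba racdba backupdba dgdba kmdba oper"

if in_group_list asmadmin "$oracle_groups"; then
    echo "oracle is in asmadmin"
else
    echo "oracle is NOT in asmadmin"
fi

# On a live host the list would come from: id -nG oracle
```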
The customer was asked to add the oracle user to asmadmin as well, but running duplicate again made no difference:
usermod -a -G asmadmin oracle
Another possible cause now has to be considered: whether the oracle executable has the permissions (the setuid and setgid bits) that let it run with the grid user's identity.
<primary>
# su - oracle
$ ls -l $ORACLE_HOME/bin/oracle
-rwsr-s--x 1 oracle asmadmin 408674152 Mar 19 00:22 /my/oracle/dbhome_1/bin/oracle
[root@rachdb001g ~]# su - grid
Last login: 2021/04/01 (Thu) 16:29:18 JST
[grid@rachdb001g ~]$ ls -l $ORACLE_HOME/bin/oracle
-rwsr-s--x 1 grid oinstall 373409344 Mar 18 03:05 /opt/oracle/grid/12.2.0/grid/bin/oracle
[grid@rachdb001g ~]$
<standby>
# su - oracle
$ ls -l $ORACLE_HOME/bin/oracle
-rwsr-s--x 1 oracle asmadmin 408674152 Mar 24 22:19 /my/oracle/dbhome_1/bin/oracle
# su - grid
$ ls -l $ORACLE_HOME/bin/oracle
-rwxr-x--x 1 grid oinstall 373409344 Mar 20 01:56 /opt/oracle/grid/12.2.0/grid/bin/oracle
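The listings show that the permissions on the grid user's $ORACLE_HOME/bin/oracle differ between the primary and the standby.
On the primary they are -rwsr-s--x; on the standby they are -rwxr-x--x, so the setuid and setgid bits are missing there.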
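To compare the two modes without reading ls output by eye, the symbolic mode string can be printed directly. This is a small sketch assuming GNU coreutils stat (the -c '%A' format); mode_string and the two demo file names are hypothetical, and the demo files only stand in for the Grid home oracle binaries on each side.

```shell
#!/bin/sh
# Hypothetical helper: print the symbolic mode string of a file,
# as shown in the first column of ls -l. Assumes GNU stat.
mode_string() {
    stat -c '%A' "$1"
}

# Stand-in files carrying the two modes seen in the listings above.
demo_dir=$(mktemp -d)
touch "$demo_dir/primary_oracle" "$demo_dir/standby_oracle"
chmod 6751 "$demo_dir/primary_oracle"   # setuid + setgid, as on the primary
chmod 0751 "$demo_dir/standby_oracle"   # no special bits, as on the standby

for f in primary_oracle standby_oracle; do
    printf '%s %s\n' "$(mode_string "$demo_dir/$f")" "$f"
done
rm -rf "$demo_dir"
```

On a real host the helper would be called as the grid user with $ORACLE_HOME/bin/oracle on each node.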
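The ownership and mode need to be set on the standby, as root ($GI_HOME here is the Grid home, which is the grid user's $ORACLE_HOME, /opt/oracle/grid/12.2.0/grid in the listing above):
chown grid:oinstall $GI_HOME/bin/oracle
chmod 6751 $GI_HOME/bin/oracle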
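If the fix has to be repeated on several nodes, the chmod can be wrapped so that it only changes a file whose mode is actually wrong. The sketch below is one possible way to do that, not the procedure used in this case: ensure_mode is a hypothetical helper, it assumes GNU stat, and the demo runs on a temporary file rather than on the real binary. The chown step is left out because it needs root.

```shell
#!/bin/sh
# Hypothetical helper: set the octal mode of a file only when it
# differs from the wanted value. Assumes GNU stat (-c '%a').
ensure_mode() {
    file=$1
    want=$2
    have=$(stat -c '%a' "$file") || return 1
    if [ "$have" = "$want" ]; then
        echo "$file already has mode $want"
    else
        chmod "$want" "$file" && echo "$file: mode $have -> $want"
    fi
}

# Demo on a temporary file that starts with the standby's mode (751).
demo_file=$(mktemp)
chmod 0751 "$demo_file"
ensure_mode "$demo_file" 6751   # changes the mode
ensure_mode "$demo_file" 6751   # second call finds nothing to do
rm -f "$demo_file"

# On the standby, as root, after the chown shown above:
#   ensure_mode "$GI_HOME/bin/oracle" 6751
```

Running chown before chmod, as in the commands above, matters on Linux because changing the owner of an executable can clear its setuid and setgid bits.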
After the auxiliary instance on the standby side was restarted and duplicate was run again, the command completed successfully.
