Installing Oracle RAC 11.2.0.4 on CentOS 7.4: errors and fixes
$ cat /etc/redhat-release
CentOS Linux release 7.4.1708 (Core)
1 Binding the shared disks with udev
The /sbin/scsi_id command used on CentOS 6 no longer exists on CentOS 7; use /usr/lib/udev/scsi_id instead.
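A quick way to confirm which scsi_id binary a host actually has (a sketch; the missing path simply reports "No such file or directory"):
# /sbin/scsi_id is the CentOS 6 location, /usr/lib/udev/scsi_id the CentOS 7 one
ls -l /sbin/scsi_id /usr/lib/udev/scsi_id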
-- The shared disks are used without partitions
for i in b c d e f g;
do
echo "KERNEL==\"sd*\", SUBSYSTEM==\"block\", PROGRAM==\"/usr/lib/udev/scsi_id --whitelisted --replace-whitespace --device=/dev/\$name\", RESULT==\"`/usr/lib/udev/scsi_id --whitelisted --replace-whitespace --device=/dev/sd$i`\", NAME=\"asm-disk$i\", OWNER=\"grid\", GROUP=\"asmadmin\", MODE=\"0660\"" >> /etc/udev/rules.d/99-oracle-asmdevices.rules
done
[root@rac01 ~]# cat /etc/udev/rules.d/99-oracle-asmdevices.rules
KERNEL=="sd*", SUBSYSTEM=="block", PROGRAM=="/usr/lib/udev/scsi_id --whitelisted --replace-whitespace --device=/dev/$name", RESULT=="36000c29ea85262d4a23086fbce428b09", NAME="asm-diskb", OWNER="grid", GROUP="asmadmin", MODE="0660"
The rules differ between CentOS 6 and 7, otherwise udev reports errors: on 7 the rule must create a symlink with
SYMLINK+=\"asm-disk$i\"
instead of renaming the kernel device node with
NAME=\"asm-disk$i\"
(the BUS key from CentOS 6 recipes is no longer recognized either). A corrected version of the loop is shown after the error output below.
[root@rac01 ~]# ls -l /dev/asm*
Jul 31 16:31:04 rac01 systemd-udevd[664]: unknown key 'BUS' in /etc/udev/rules.d/99-oracle-asmdevices.rules:11
Jul 31 16:31:04 rac01 systemd-udevd[664]: invalid rule '/etc/udev/rules.d/99-oracle-asmdevices.rules:11'
Jul 31 16:31:04 rac01 systemd-udevd[664]: unknown key 'BUS' in /etc/udev/rules.d/99-oracle-asmdevices.rules:12
Jul 31 16:31:04 rac01 systemd-udevd[664]: invalid rule '/etc/udev/rules.d/99-oracle-asmdevices.rules:12'
Jul 31 16:44:37 rac01 systemd-udevd[7121]: NAME="asm-diskb" ignored, kernel device nodes can not be renamed; please fix it in /etc/udev/rules.d/99-oracle-asmdevices.rules:1
Jul 31 16:44:41 rac01 systemd-udevd[7133]: NAME="asm-diskc" ignored, kernel device nodes can not be renamed; please fix it in /etc/udev/rules.d/99-oracle-asmdevices.rules:2
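The corrected generation loop for CentOS 7 uses SYMLINK+= so udev creates /dev/asm-disk* symlinks instead of trying to rename the device nodes. A minimal sketch based on the loop above (disk letters b-g and the grid:asmadmin ownership are taken from this environment):
# Start from an empty rules file so the old NAME-based entries are not kept
> /etc/udev/rules.d/99-oracle-asmdevices.rules
for i in b c d e f g;
do
echo "KERNEL==\"sd*\", SUBSYSTEM==\"block\", PROGRAM==\"/usr/lib/udev/scsi_id --whitelisted --replace-whitespace --device=/dev/\$name\", RESULT==\"`/usr/lib/udev/scsi_id --whitelisted --replace-whitespace --device=/dev/sd$i`\", SYMLINK+=\"asm-disk$i\", OWNER=\"grid\", GROUP=\"asmadmin\", MODE=\"0660\"" >> /etc/udev/rules.d/99-oracle-asmdevices.rules
done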
Re-read the partition tables
/sbin/partprobe /dev/sdb
[root@rac01 ~]# /usr/lib/udev/scsi_id -g -u /dev/sdb
36000c29ea85262d4a23086fbce428b09
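Optionally, the WWID of every shared disk can be checked in one pass before regenerating the rules (a sketch; disk letters b-g as above):
for i in b c d e f g; do
    # print the device name followed by its WWID
    echo -n "sd$i: "
    /usr/lib/udev/scsi_id -g -u /dev/sd$i
done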
Reload and start udev
/usr/sbin/udevadm control --reload-rules
systemctl status systemd-udevd.service
systemctl enable systemd-udevd.service
[root@rac01 ~]# /sbin/udevadm trigger --type=devices --action=change
[root@rac01 ~]# ll /dev/asm-disk*
lrwxrwxrwx. 1 root root 3 Jul 31 16:57 /dev/asm-diskb -> sdb
lrwxrwxrwx. 1 root root 3 Jul 31 16:57 /dev/asm-diskc -> sdc
lrwxrwxrwx. 1 root root 3 Jul 31 16:57 /dev/asm-diskd -> sdd
lrwxrwxrwx. 1 root root 3 Jul 31 16:57 /dev/asm-diske -> sde
lrwxrwxrwx. 1 root root 3 Jul 31 16:57 /dev/asm-diskf -> sdf
lrwxrwxrwx. 1 root root 3 Jul 31 16:57 /dev/asm-diskg -> sdg
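If the symlinks do not appear after the trigger, udevadm test shows how udev evaluates the rules for a single disk and usually points at the offending key (a sketch; sdb used as an example):
/usr/sbin/udevadm test /sys/block/sdb 2>&1 | grep -i asm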
2 Errors from the root scripts during the Grid Infrastructure installation
-- Node 1
[root@rac01 ~]# /u01/app/oraInventory/orainstRoot.sh
Changing permissions of /u01/app/oraInventory.
Adding read,write permissions for group.
Removing read,write,execute permissions for world.
Changing groupname of /u01/app/oraInventory to oinstall.
The execution of the script is complete.
[root@rac01 ~]# /u01/app/11.2.0/grid/root.sh
Performing root user operation for Oracle 11g

The following environment variables are set as:
    ORACLE_OWNER= grid
    ORACLE_HOME=  /u01/app/11.2.0/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]:
   Copying dbhome to /usr/local/bin ...
   Copying oraenv to /usr/local/bin ...
   Copying coraenv to /usr/local/bin ...

Creating /etc/oratab file...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/app/11.2.0/grid/crs/install/crsconfig_params
Creating trace directory
User ignored Prerequisites during installation
Installing Trace File Analyzer
OLR initialization - successful
root wallet
root wallet cert
root cert export
peer wallet
profile reader wallet
pa wallet
peer wallet keys
pa wallet keys
peer cert request
pa cert request
peer cert
pa cert
peer root cert TP
profile reader root cert TP
pa root cert TP
peer pa cert TP
pa peer cert TP
profile reader pa cert TP
profile reader peer cert TP
peer user cert
pa user cert
Adding Clusterware entries to inittab
ohasd failed to start
Failed to start the Clusterware. Last 20 lines of the alert log follow:
2019-08-01 09:35:59.951: [client(14411)]CRS-2101:The OLR was formatted using version 3.
^CINT at /u01/app/11.2.0/grid/crs/install/crsconfig_lib.pm line 1446.
/u01/app/11.2.0/grid/perl/bin/perl -I/u01/app/11.2.0/grid/perl/lib -I/u01/app/11.2.0/grid/crs/install /u01/app/11.2.0/grid/crs/install/rootcrs.pl execution failed
Oracle root script execution aborted!
At first I thought this was a permission problem on the shared disks.
[root@rac01 ~]# ll /dev/asm-disk*
lrwxrwxrwx 1 root root 3 Aug 1 09:07 /dev/asm-diskb -> sdb
## Change the ownership
[root@rac01 ~]# chown grid:asmadmin /dev/asm-disk*
## This had no effect
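Since permissions were not the issue, the next thing worth checking is whether root.sh actually dropped the init script and whether any ohasd process is running at all (a sketch):
# Has root.sh created the init script yet?
ls -l /etc/init.d/init.ohasd
# Is anything ohasd-related running?
ps -ef | grep -E 'init\.ohasd|ohasd\.bin' | grep -v grep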
Reference: https://blog.csdn.net/DBAngelica/article/details/85002591
[root@rac01 ~]# touch /usr/lib/systemd/system/ohasd.service
[root@rac01 ~]# vim /usr/lib/systemd/system/ohasd.service
[Unit]
Description=Oracle High Availability Services
After=syslog.target

[Service]
ExecStart=/etc/init.d/init.ohasd run >/dev/null 2>&1 Type=simple
Restart=always

[Install]
WantedBy=multi-user.target

[root@rac01 ~]# systemctl daemon-reload
[root@rac01 ~]# systemctl enable ohasd.service
Created symlink from /etc/systemd/system/multi-user.target.wants/ohasd.service to /usr/lib/systemd/system/ohasd.service.
[root@rac01 ~]# systemctl start ohasd.service
[root@rac01 ~]# systemctl status ohasd.service
● ohasd.service - Oracle High Availability Services
   Loaded: loaded (/usr/lib/systemd/system/ohasd.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2019-08-01 10:53:38 CST; 6s ago
 Main PID: 18621 (init.ohasd)
   CGroup: /system.slice/ohasd.service
           └─18621 /bin/sh /etc/init.d/init.ohasd run >/dev/null 2>&1 Type=simple

Aug 01 10:53:38 rac01 systemd[1]: Started Oracle High Availability Services.
Aug 01 10:53:38 rac01 systemd[1]: Starting Oracle High Availability Services...

[root@rac01 ~]# /u01/app/11.2.0/grid/root.sh
CRS-4266: Voting file(s) successfully replaced
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   61e053dbaca94f40bfa468e31c9c927f (/dev/asm-diskb) [OCR]
 2. ONLINE   6b25d06268b84fe9bfc6125298d94018 (/dev/asm-diskd) [OCR]
 3. ONLINE   b1fd0f59a3474f92bf0b2d3344fe91cc (/dev/asm-diskc) [OCR]
Located 3 voting disk(s).
CRS-2672: Attempting to start 'ora.asm' on 'rac01'
CRS-2676: Start of 'ora.asm' on 'rac01' succeeded
CRS-2672: Attempting to start 'ora.OCR.dg' on 'rac01'
CRS-2676: Start of 'ora.OCR.dg' on 'rac01' succeeded
Configure Oracle Grid Infrastructure for a Cluster ... succeeded
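The same unit file is also needed on the other node before its root.sh run; one way to get it there (a sketch; host name rac02 as in this cluster):
# Copy the unit file to node 2, then register it with systemd there
scp /usr/lib/systemd/system/ohasd.service rac02:/usr/lib/systemd/system/ohasd.service
ssh rac02 'systemctl daemon-reload && systemctl enable ohasd.service'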
-- Running the root script on node 2 fails
Note: to avoid this error on the remaining nodes, wait during the root.sh run until /etc/init.d/init.ohasd has been created in /etc/init.d/, then run systemctl start ohasd.service to bring up the ohasd service. If /etc/init.d/init.ohasd does not exist yet, systemctl start ohasd.service fails:
[root@rac02 ~]# systemctl status ohasd.service
● ohasd.service - Oracle High Availability Services
   Loaded: loaded (/usr/lib/systemd/system/ohasd.service; enabled; vendor preset: disabled)
   Active: failed (Result: start-limit) since Thu 2019-08-01 11:03:58 CST; 3s ago
  Process: 22754 ExecStart=/etc/init.d/init.ohasd run >/dev/null 2>&1 Type=simple (code=exited, status=203/EXEC)
 Main PID: 22754 (code=exited, status=203/EXEC)

Aug 01 11:03:57 rac02 systemd[1]: Unit ohasd.service entered failed state.
Aug 01 11:03:57 rac02 systemd[1]: ohasd.service failed.
Aug 01 11:03:58 rac02 systemd[1]: ohasd.service holdoff time over, scheduling restart.
Aug 01 11:03:58 rac02 systemd[1]: start request repeated too quickly for ohasd.service
Aug 01 11:03:58 rac02 systemd[1]: Failed to start Oracle High Availability Services.
Aug 01 11:03:58 rac02 systemd[1]: Unit ohasd.service entered failed state.
Aug 01 11:03:58 rac02 systemd[1]: ohasd.service failed.
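The note above can be scripted: run something like the following in a second root shell on the node while root.sh is executing, so ohasd.service is started as soon as /etc/init.d/init.ohasd appears (a sketch; the 5-second poll interval is arbitrary):
# Wait for root.sh to drop the init script, then start the systemd service
until [ -f /etc/init.d/init.ohasd ]; do
    sleep 5
done
systemctl start ohasd.service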
Error logs
[root@rac02 ~]# ll /etc/init.d/init.ohasd
ls: cannot access /etc/init.d/init.ohasd: No such file or directory
[root@rac02 ~]# ll /etc/init.d/init.ohasd
-rwxr-xr-x 1 root root 8782 Aug 1 11:06 /etc/init.d/init.ohasd
[root@rac02 ~]# systemctl start ohasd.service
[root@rac02 ~]# systemctl status ohasd.service
● ohasd.service - Oracle High Availability Services
   Loaded: loaded (/usr/lib/systemd/system/ohasd.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2019-08-01 11:06:20 CST; 4s ago
 Main PID: 24186 (init.ohasd)
   CGroup: /system.slice/ohasd.service
           ├─24186 /bin/sh /etc/init.d/init.ohasd run >/dev/null 2>&1 Type=simple
           └─24211 /bin/sleep 10

Aug 01 11:06:20 rac02 systemd[1]: Started Oracle High Availability Services.
Aug 01 11:06:20 rac02 systemd[1]: Starting Oracle High Availability Services...

[root@rac01 rac01]# tail -n 100 -f /u01/app/11.2.0/grid/log/rac01/alertrac01.log
2019-08-01 14:16:30.453: [cssd(21789)]CRS-1601:CSSD Reconfiguration complete. Active nodes are rac01 rac02 .

[root@rac02 ~]# tail -n 100 -f /u01/app/11.2.0/grid/log/rac02/alertrac02.log
The execution of the script is complete.
2019-08-01 14:15:48.037: [ohasd(3604)]CRS-2112:The OLR service started on node rac02.
2019-08-01 14:15:48.059: [ohasd(3604)]CRS-1301:Oracle High Availability Service started on node rac02.
2019-08-01 14:15:48.060: [ohasd(3604)]CRS-8017:location: /etc/oracle/lastgasp has 2 reboot advisory log files, 0 were announced and 0 errors occurred
2019-08-01 14:15:48.545: [/u01/app/11.2.0/grid/bin/oraagent.bin(6497)]CRS-5011:Check of resource "+ASM" failed: details at "(:CLSN00006:)" in "/u01/app/11.2.0/grid/log/rac02/agent/ohasd/oraagent_grid/oraagent_grid.log"
2019-08-01 14:15:51.622: [/u01/app/11.2.0/grid/bin/orarootagent.bin(6501)]CRS-2302:Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
2019-08-01 14:15:53.823: [gpnpd(6592)]CRS-2328:GPNPD started on node rac02.
2019-08-01 14:15:56.234: [cssd(6658)]CRS-1713:CSSD daemon is started in clustered mode
2019-08-01 14:15:58.006: [ohasd(3604)]CRS-2767:Resource state recovery not attempted for 'ora.diskmon' as its target state is OFFLINE
2019-08-01 14:15:58.006: [ohasd(3604)]CRS-2769:Unable to failover resource 'ora.diskmon'.
2019-08-01 14:16:21.832: [cssd(6658)]CRS-1707:Lease acquisition for node rac02 number 2 completed
2019-08-01 14:16:23.138: [cssd(6658)]CRS-1605:CSSD voting file is online: /dev/asm-diskc; details in /u01/app/11.2.0/grid/log/rac02/cssd/ocssd.log.
2019-08-01 14:16:23.140: [cssd(6658)]CRS-1605:CSSD voting file is online: /dev/asm-diskd; details in /u01/app/11.2.0/grid/log/rac02/cssd/ocssd.log.
2019-08-01 14:16:23.146: [cssd(6658)]CRS-1605:CSSD voting file is online: /dev/asm-diskb; details in /u01/app/11.2.0/grid/log/rac02/cssd/ocssd.log.
2019-08-01 14:16:29.466: [cssd(6658)]CRS-1601:CSSD Reconfiguration complete. Active nodes are rac01 rac02 .
2019-08-01 14:16:31.434: [ctssd(7290)]CRS-2407:The new Cluster Time Synchronization Service reference node is host rac01.
2019-08-01 14:16:31.435: [ctssd(7290)]CRS-2401:The Cluster Time Synchronization Service started on host rac02.
2019-08-01 14:16:33.170: [ohasd(3604)]CRS-2767:Resource state recovery not attempted for 'ora.diskmon' as its target state is OFFLINE
2019-08-01 14:16:33.171: [ohasd(3604)]CRS-2769:Unable to failover resource 'ora.diskmon'.
2019-08-01 14:17:30.167: [/u01/app/11.2.0/grid/bin/orarootagent.bin(6603)]CRS-5818:Aborted command 'start' for resource 'ora.ctssd'. Details at (:CRSAGF00113:) {0:0:2} in /u01/app/11.2.0/grid/log/rac02/agent/ohasd/orarootagent_root/orarootagent_root.log.
2019-08-01 14:17:34.169: [ohasd(3604)]CRS-2757:Command 'Start' timed out waiting for response from the resource 'ora.ctssd'. Details at (:CRSPE00111:) {0:0:2} in /u01/app/11.2.0/grid/log/rac02/ohasd/ohasd.log.
2019-08-01 14:17:34.183: [ohasd(3604)]CRS-2807:Resource 'ora.asm' failed to start automatically.
2019-08-01 14:17:34.183: [ohasd(3604)]CRS-2807:Resource 'ora.crsd' failed to start automatically.
2019-08-01 14:17:34.183: [ohasd(3604)]CRS-2807:Resource 'ora.evmd' failed to start automatically.
2019-08-01 14:17:51.734: [/u01/app/11.2.0/grid/bin/oraagent.bin(6568)]CRS-5011:Check of resource "+ASM" failed: details at "(:CLSN00006:)" in "/u01/app/11.2.0/grid/log/rac02/agent/ohasd/oraagent_grid/oraagent_grid.log"
2019-08-01 14:19:04.174: [ohasd(3604)]CRS-2765:Resource 'ora.ctssd' has failed on server 'rac02'.
2019-08-01 14:19:06.776: [ctssd(8408)]CRS-2401:The Cluster Time Synchronization Service started on host rac02.
2019-08-01 14:19:06.776: [ctssd(8408)]CRS-2407:The new Cluster Time Synchronization Service reference node is host rac01.
2019-08-01 14:19:07.533: [/u01/app/11.2.0/grid/bin/oraagent.bin(6568)]CRS-5011:Check of resource "+ASM" failed: details at "(:CLSN00006:)" in "/u01/app/11.2.0/grid/log/rac02/agent/ohasd/oraagent_grid/oraagent_grid.log"
2019-08-01 14:19:13.266: [/u01/app/11.2.0/grid/bin/oraagent.bin(6568)]CRS-5011:Check of resource "+ASM" failed: details at "(:CLSN00006:)" in "/u01/app/11.2.0/grid/log/rac02/agent/ohasd/oraagent_grid/oraagent_grid.log"
2019-08-01 14:19:36.864: [crsd(8918)]CRS-1012:The OCR service started on node rac02.
/u01/app/11.2.0/grid/log/rac01/agent/ohasd/oraagent_grid/oraagent_grid.log
2019-08-01 14:40:53.671: [ora.gipcd][4109874944]{0:0:156} [check] clsdmc_respget return: status=0, ecode=0
2019-08-01 14:41:18.893: [ CRSCOMM][4152755968] IpcC: IPC client connection 18 to member 0 has been removed
2019-08-01 14:41:18.893: [CLSFRAME][4152755968] Removing IPC Member:{Relative|Node:0|Process:0|Type:2}
2019-08-01 14:41:18.893: [CLSFRAME][4152755968] Disconnected from OHASD:rac01 process: {Relative|Node:0|Process:0|Type:2}
2019-08-01 14:41:18.894: [ AGENT][4142249728]{0:13:10} {0:13:10} Created alert : (:CRSAGF00117:) : Disconnected from server, Agent is shutting down.
2019-08-01 14:41:18.894: [ AGFW][4142249728]{0:13:10} Agent is exiting with exit code: 1

/u01/app/11.2.0/grid/log/rac01/agent/ohasd/oracssdagent_root/oracssdagent_root.log
2019-08-01 15:02:53.928: [ USRTHRD][1509222144]{0:19:163} clsnomon_HangExit: no member
2019-08-01 15:02:58.928: [ USRTHRD][1509222144]{0:19:163} clsnomon_HangExit: no member
2019-08-01 15:03:03.928: [ USRTHRD][1509222144]{0:19:163} clsnomon_HangExit: no member
2019-08-01 15:03:08.929: [ USRTHRD][1509222144]{0:19:163} clsnomon_HangExit: no member
The same fix that worked on node 1 was applied on node 2, but root.sh still would not complete successfully.
This included manually running
# /bin/dd if=/var/tmp/.oracle/npohasd of=/dev/null bs=1024 count=1
and it still did not work...