Overview:
The test RAC has only two nodes. The overall procedure follows the Oracle official documentation:
https://docs.oracle.com/cd/E11882_01/rac.112/e41960/adddelunix.htm#RACAD7358
Step 3, removing the node from the cluster, follows the Oracle official documentation (Deleting a Cluster Node on Linux and UNIX Systems):
https://docs.oracle.com/cd/E11882_01/rac.112/e41959/adddelclusterware.htm#CWADD90992
Note:
There are two sets of experiments: one is a normal deletion (the ogg database); the other is an abnormal deletion (orcl), i.e. the extreme case in which the server of RAC node 2 is down and all of node rac2's cluster resources are stopped as well.
If rac2 is down, only the following steps are needed to remove node 2:
Step 1: 1.1 or 1.2
Step 2: 1.3 (post-deletion verification)
Step 3: 2.2 step 3)
Step 4: 3.2.2, 3.3, 3.4
If node 2's condition is somewhere in between, i.e. it still has active cluster resources, work through all the steps in this document manually, one by one (except 3.2.2).
The test environment is as follows:
Experiment          | Node names     | DB instance names | OS        | DB version
Normal deletion     | racdg1/racdg2  | ogg1/ogg2         | Linux 6.x | Oracle 11.2.0.4
Abnormal deletion   | rac1/rac2      | orcl1/orcl2       | Linux 6.x | Oracle 11.2.0.4
grid user: GRID_HOME (referred to as ORACLE_HOME in the grid user's environment) is /u01/app/11.2.0/grid
oracle user: ORACLE_HOME is /u01/app/oracle/product/11.2.0/dbhome_1
Outline of the procedure:
Delete the Oracle RAC instance
Remove the Oracle RAC software
Remove the node from the cluster
1. Delete the Oracle RAC instance
1.1 Deletion with the dbca GUI
Check the instance/thread status before the deletion:
Normal-database experiment:
Abnormal-database experiment:
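A minimal sketch of how to query the thread and instance status (run from any surviving instance as SYSDBA; the same v$thread query is used again in 1.3):
sqlplus / as sysdba
select thread#, status, instance from v$thread;
select inst_id, instance_name, status from gv$instance;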
As the oracle user, run:
dbca
For example, if node 2's server is broken, run dbca from node 1.
The following commands stop the instance and free its server (if node 2's server is already down, simply delete the instance directly):
$ srvctl stop instance -d db_unique_name -n node_name
$ srvctl relocate server -n node_name -g Free
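As a concrete illustration only — assuming the normal-experiment names (database ogg, node racdg2), which you should replace with your own — the templates above would be filled in roughly as:
srvctl stop instance -d ogg -n racdg2
srvctl relocate server -n racdg2 -g Free
Note that srvctl relocate server (moving the server to the Free pool) is mainly relevant for policy-managed configurations; for an administrator-managed database the second step does not normally apply.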
The general steps are as follows:
(continued ...)
1.2 Silent deletion with dbca
Official command template:
dbca -silent -deleteInstance [-nodeList node_name] -gdbName gdb_name -instanceName instance_name -sysDBAUserName sysdba -sysDBAPassword password
For example, to delete the node 2 instance:
Run on a healthy node:
Normal-experiment deletion:
The following error was reported:
A check showed that the SCAN listener was running on node 2.
Fix: I simply restarted both servers (crude but effective on VMs: shut both down, start node 1 first, then node 2). Alternatively, you can try stopping and starting the SCAN listener and the local listener on node 1 so that the SCAN listener ends up running on node 1.
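If a reboot is not desirable, an alternative sketch (not what was actually done in this test) is to relocate the SCAN listener to node 1 with srvctl, for example:
srvctl relocate scan_listener -i 1 -n racdg1
srvctl status scan_listener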
Then delete as follows:
dbca -silent -deleteInstance -nodeList racdg2 -gdbName ogg -instanceName ogg2 -sysDBAUserName sys -sysDBAPassword oracle
Abnormal-experiment deletion:
dbca -silent -deleteInstance -nodeList rac2 -gdbName orcl -instanceName orcl2 -sysDBAUserName sys -sysDBAPassword Oracle123
1.3 Post-deletion verification
Check the active instances:
Normal-database test:
select thread#,status,instance from v$thread;
Abnormal-database test:
select thread#,status,instance from v$thread;
If redo logs for node 2 (thread 2) still exist, use the following command:
ALTER DATABASE DISABLE THREAD 2;
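Before disabling the thread, it can be worth checking whether any thread 2 redo log groups actually remain (an optional sketch; dbca normally drops thread 2's redo logs and undo tablespace when it deletes the instance):
select group#, thread#, status from v$log where thread# = 2;
select thread#, status, enabled from v$thread;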
Verify the database information in the OCR:
srvctl config database -d db_unique_name
For example:
srvctl config database -d orcl
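As an extra check (illustrative, using the ogg database from the normal experiment), the configured and running instances can also be listed with:
srvctl config database -d ogg
srvctl status database -d ogg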
2. Remove the Oracle RAC software
2.1 Stop and disable the listener
The abnormal experiment does not need the following steps:
srvctl disable listener -l listener_name -n name_of_node_to_delete
srvctl stop listener -l listener_name -n name_of_node_to_delete
The actual commands run:
srvctl disable listener -l listener -n racdg2
srvctl stop listener -l listener -n racdg2
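Optionally (a sketch, not part of the original steps), confirm the listener state on the node being deleted:
srvctl status listener -l listener -n racdg2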
2.2 Update the cluster node list
1) On the node to be deleted, as the oracle user, run the following from $ORACLE_HOME/oui/bin (normal-deletion experiment):
Official template:
$./runInstaller -updateNodeList ORACLE_HOME=Oracle_home_location "CLUSTER_NODES={name_of_node_to_delete}" -local
For example:
$ORACLE_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME=$ORACLE_HOME "CLUSTER_NODES={racdg2}" -local
2) Remove the Oracle RAC software (oracle user, normal-deletion experiment):
For a shared home, detach the node instead of deinstalling it, by running the following command from the $ORACLE_HOME/oui/bin directory on each node to be deleted:
./runInstaller -detachHome ORACLE_HOME=$ORACLE_HOME
For a non-shared home, deinstall the Oracle home from the node being deleted by running:
$ORACLE_HOME/deinstall/deinstall -local
3) On all remaining nodes, as the oracle user, run the following from $ORACLE_HOME/oui/bin to update the inventory on those nodes, specifying a comma-separated list of the remaining node names (run this in both the normal and the abnormal experiment):
Official template:
$./runInstaller -updateNodeList ORACLE_HOME=$ORACLE_HOME "CLUSTER_NODES={remaining_node_list}"
Run on all remaining nodes:
In my case only one node remains; for example:
cd $ORACLE_HOME/oui/bin
$./runInstaller -updateNodeList ORACLE_HOME=$ORACLE_HOME "CLUSTER_NODES={rac1,rac3……}"
Normal deletion:
Abnormal deletion:
./runInstaller -updateNodeList ORACLE_HOME=$ORACLE_HOME "CLUSTER_NODES={rac1}"
3. Remove the node from the cluster
From the official documentation: https://docs.oracle.com/cd/E11882_01/rac.112/e41959/adddelclusterware.htm#CWADD90992
3.1 Check the node status
Run as root or grid:
olsnodes -s -t
Normal deletion:
Abnormal deletion:
If the node to be deleted is in the pinned state, run the following command manually as root.
The official documentation states:
Special note: a lot of material on the web gets this wrong. If the node is already Unpinned, there is no need to run the unpin command at all; do not blindly trust what you find online.
Neither the normal nor the abnormal experiment needed to run the following command.
crsctl unpin css -n <node1>
For example: crsctl unpin css -n rac2
/u01/app/11.2.0/grid/bin/crsctl unpin css -n rac2
3.2 Delete the node
Disable the Oracle Clusterware applications and daemons by running the rootcrs.pl script as root from the Grid_home/crs/install directory on the node to be deleted:
3.2.1 Normal-case deletion steps
3.2.1.1 Deconfigure Grid Infrastructure on the node to be deleted (normal-deletion experiment)
Run as root:
/u01/app/11.2.0/grid/crs/install/rootcrs.pl -deconfig -force
If you are deleting more than one node, run the above command on each node being deleted.
If you are deleting all nodes, run the following command on the last node:
/u01/app/11.2.0/grid/crs/install/rootcrs.pl -deconfig -force -lastnode
The -lastnode flag is used only when deleting all nodes.
3.2.1.2 Delete the node (required in both the normal and the abnormal experiment)
Note: delete the node by running the following statement as root from a node that is not being deleted:
crsctl delete node -n node_to_be_deleted
Run:
/u01/app/11.2.0/grid/bin/crsctl delete node -n racdg2
3.2.1.3 Update the cluster node list (normal-experiment operation)
As the Grid installation user, update the cluster node list.
From Grid_home/oui/bin:
$ ./runInstaller -updateNodeList ORACLE_HOME=Grid_home "CLUSTER_NODES={node_to_be_deleted}" CRS=TRUE -silent -local
The statement actually run:
/u01/app/11.2.0/grid/oui/bin/runInstaller -updateNodeList ORACLE_HOME=$ORACLE_HOME "CLUSTER_NODES={racdg2}" CRS=TRUE -silent -local
Continue with the follow-up operations.
Note:
The official documentation says:
Before continuing, check the inventory.xml file to make sure it has not been updated (the official wording is a little vague; in my test, node 1's inventory was not updated, while the file on the node being deleted was). If node 1's inventory were updated, the subsequent deinstall would remove the Grid home installation for the whole cluster.
more /u01/app/oraInventory/ContentsXML/inventory.xml
Node 1:
Node 2:
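For orientation only — an illustrative, hypothetical fragment rather than the actual file from this test (names, IDX values and node lists will differ) — the Grid home entry in inventory.xml typically looks roughly like this, and the check is that node 1's copy still lists node 1 (and, at this point, racdg2 as well):
<HOME NAME="Ora11g_gridinfrahome1" LOC="/u01/app/11.2.0/grid" TYPE="O" IDX="1" CRS="true">
   <NODE_LIST>
      <NODE NAME="racdg1"/>
      <NODE NAME="racdg2"/>
   </NODE_LIST>
</HOME>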
Deinstall the GI home
Shared home:
Run on the node being deleted:
$ Grid_home/perl/bin/perl Grid_home/crs/install/rootcrs.pl -deconfig
$ ./runInstaller -detachHome ORACLE_HOME=Grid_home -silent -local
Manually remove the following files:
rm -fr /etc/oraInst.loc
rm -fr /etc/oratab
rm -fr /etc/oracle/
rm -fr /opt/ORCLfmap/
rm -fr /u01/app/oraInventory/
Non-shared home (most installations use a non-shared home):
$ Grid_home/deinstall/deinstall -local
On any healthy node that is not being deleted, run the following command to update the CRS information.
From Grid_home/oui/bin:
$ ./runInstaller -updateNodeList ORACLE_HOME=Grid_home "CLUSTER_NODES={remaining_nodes_list}" CRS=TRUE -silent
The operation log is as follows:
[grid@racdg2 ~]$ /u01/app/11.2.0/grid/deinstall/deinstall -local
Checking for required files and bootstrapping ...
Please wait ...
Location of logs /tmp/deinstall2018-03-02_05-41-36AM/logs/
############ ORACLE DEINSTALL & DECONFIG TOOL START ############
######################### CHECK OPERATION START #########################
## [START] Install check configuration ##
Checking for existence of the Oracle home location /u01/app/11.2.0/grid
Oracle Home type selected for deinstall is: Oracle Grid Infrastructure for a Cluster
Oracle Base selected for deinstall is: /u01/app/grid
Checking for existence of central inventory location /u01/app/oraInventory
Checking for existence of the Oracle Grid Infrastructure home
The following nodes are part of this cluster: racdg2
Checking for sufficient temp space availability on node(s) : 'racdg2'
## [END] Install check configuration ##
Traces log file: /tmp/deinstall2018-03-02_05-41-36AM/logs//crsdc.log
Enter an address or the name of the virtual IP used on node "racdg2"[racdg2-vip]
> (press Enter to accept the default for this and the following prompts)
The following information can be collected by running "/sbin/ifconfig -a" on node "racdg2"
Enter the IP netmask of Virtual IP "172.16.10.223" on node "racdg2"[255.255.255.0]
>
Enter the network interface name on which the virtual IP address "172.16.10.223" is active
>
Enter an address or the name of the virtual IP[]
>
Network Configuration check config START
Network de-configuration trace file location: /tmp/deinstall2018-03-02_05-41-36AM/logs/netdc_check2018-03-02_05-43-22-AM.log
Specify all RAC listeners (do not include SCAN listener) that are to be de-configured [LISTENER,LISTENER_SCAN1]:
Network Configuration check config END
Asm Check Configuration START
ASM de-configuration trace file location: /tmp/deinstall2018-03-02_05-41-36AM/logs/asmcadc_check2018-03-02_05-43-34-AM.log
######################### CHECK OPERATION END #########################
####################### CHECK OPERATION SUMMARY #######################
Oracle Grid Infrastructure Home is:
The cluster node(s) on which the Oracle home deinstallation will be performed are:racdg2
Since -local option has been specified, the Oracle home will be deinstalled only on the local node, 'racdg2', and the global configuration will be removed.
Oracle Home selected for deinstall is: /u01/app/11.2.0/grid
Inventory Location where the Oracle home registered is: /u01/app/oraInventory
Following RAC listener(s) will be de-configured: LISTENER,LISTENER_SCAN1
Option -local will not modify any ASM configuration.
Do you want to continue (y - yes, n - no)? [n]:y    (continue; the ASM configuration is not modified)
A log of this session will be written to: '/tmp/deinstall2018-03-02_05-41-36AM/logs/deinstall_deconfig2018-03-02_05-41-43-AM.out'
Any error messages from this session will be written to: '/tmp/deinstall2018-03-02_05-41-36AM/logs/deinstall_deconfig2018-03-02_05-41-43-AM.err'
######################## CLEAN OPERATION START ########################
ASM de-configuration trace file location: /tmp/deinstall2018-03-02_05-41-36AM/logs/asmcadc_clean2018-03-02_05-44-28-AM.log
ASM Clean Configuration END
Network Configuration clean config START
Network de-configuration trace file location: /tmp/deinstall2018-03-02_05-41-36AM/logs/netdc_clean2018-03-02_05-44-28-AM.log
De-configuring RAC listener(s): LISTENER,LISTENER_SCAN1
De-configuring listener: LISTENER
Stopping listener on node "racdg2": LISTENER
Warning: Failed to stop listener. Listener may not be running.
Listener de-configured successfully.
De-configuring listener: LISTENER_SCAN1
Stopping listener on node "racdg2": LISTENER_SCAN1
Warning: Failed to stop listener. Listener may not be running.
Listener de-configured successfully.
De-configuring Naming Methods configuration file...
Naming Methods configuration file de-configured successfully.
De-configuring backup files...
Backup files de-configured successfully.
The network configuration has been cleaned up successfully.
Network Configuration clean config END
---------------------------------------->
The deconfig command below can be executed in parallel on all the remote nodes. Execute the command on the local node after the execution completes on all the remote nodes.
Run the following command as the root user or the administrator on node "racdg2".
/tmp/deinstall2018-03-02_05-41-36AM/perl/bin/perl -I/tmp/deinstall2018-03-02_05-41-36AM/perl/lib -I/tmp/deinstall2018-03-02_05-41-36AM/crs/install /tmp/deinstall2018-03-02_05-41-36AM/crs/install/rootcrs.pl -force -deconfig -paramfile "/tmp/deinstall2018-03-02_05-41-36AM/response/deinstall_Ora11g_gridinfrahome1.rsp"
Press Enter after you finish running the above commands
<----------------------------------------
Remove the directory: /tmp/deinstall2018-03-02_05-41-36AM on node:
Setting the force flag to false
Setting the force flag to cleanup the Oracle Base
Oracle Universal Installer clean START
Detach Oracle home '/u01/app/11.2.0/grid' from the central inventory on the local node : Done
Delete directory '/u01/app/11.2.0/grid' on the local node : Done
Delete directory '/u01/app/oraInventory' on the local node : Done
Delete directory '/u01/app/grid' on the local node : Done
Oracle Universal Installer cleanup was successful.
Oracle Universal Installer clean END
## [START] Oracle install clean ##
Clean install operation removing temporary directory '/tmp/deinstall2018-03-02_05-41-36AM' on node 'racdg2'
## [END] Oracle install clean ##
######################### CLEAN OPERATION END #########################
####################### CLEAN OPERATION SUMMARY #######################
Following RAC listener(s) were de-configured successfully: LISTENER,LISTENER_SCAN1
Oracle Clusterware is stopped and successfully de-configured on node "racdg2"
Oracle Clusterware is stopped and de-configured successfully.
Successfully detached Oracle home '/u01/app/11.2.0/grid' from the central inventory on the local node.
Successfully deleted directory '/u01/app/11.2.0/grid' on the local node.
Successfully deleted directory '/u01/app/oraInventory' on the local node.
Successfully deleted directory '/u01/app/grid' on the local node.
Oracle Universal Installer cleanup was successful.
Run 'rm -rf /etc/oraInst.loc' as root on node(s) 'racdg2' at the end of the session.
Run 'rm -rf /opt/ORCLfmap' as root on node(s) 'racdg2' at the end of the session.
Run 'rm -rf /etc/oratab' as root on node(s) 'racdg2' at the end of the session.
Oracle deinstall tool successfully cleaned up temporary directories.
#######################################################################
############# ORACLE DEINSTALL & DECONFIG TOOL END #############
Run the script statement shown above (the rootcrs.pl command, as root on racdg2):
After completion:
For example, run the following command as grid on all remaining nodes:
$ ./runInstaller -updateNodeList ORACLE_HOME=Grid_home "CLUSTER_NODES={rac1,rac3……}" CRS=TRUE -silent
/u01/app/11.2.0/grid/oui/bin/runInstaller -updateNodeList ORACLE_HOME=$ORACLE_HOME "CLUSTER_NODES={racdg1}" CRS=TRUE -silent
3.2.2 Abnormal-case deletion steps
Note: if the node is already down or unreachable, the deletion commands above cannot be run on it. On a healthy node, run the following command:
crsctl status res -t
Stop and remove the VIP resource.
Stop node 2's VIP (vip_name is the name defined in /etc/hosts, rac2-vip):
srvctl stop vip -i vip_name -f
As root:
/u01/app/11.2.0/grid/bin/srvctl stop vip -i rac2-vip -f
srvctl remove vip -i vip_name -f
/u01/app/11.2.0/grid/bin/srvctl remove vip -i rac2-vip -f
Check the VIPs:
/u01/app/11.2.0/grid/bin/crsctl status res -t
Only node 1's VIP remains.
On a healthy node, as root, run the command to delete node 2:
Delete the failed node:
crsctl delete node -n node_to_be_deleted
/u01/app/11.2.0/grid/bin/crsctl delete node -n rac2
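Optionally, confirm from node 1 that rac2 no longer appears in the node list:
/u01/app/11.2.0/grid/bin/olsnodes -s -t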
Note:
The official documentation says that before continuing you should check the inventory.xml file to make sure it has not been updated (the wording is vague; from my testing, the point is that node 1's information must not be updated). If it were updated, the subsequent deinstall would remove the Grid home installation for the whole cluster.
more /u01/app/oraInventory/ContentsXML/inventory.xml
Before any operation:
After the operation:
Deinstall the GI home (skipped; the server is unreachable)
Because the server is unreachable or will not start, there is nothing to deinstall; go straight to the next step.
Update the cluster node list
As the grid user, run the following command on all remaining, healthy nodes.
From Grid_home/oui/bin:
$ ./runInstaller -updateNodeList ORACLE_HOME=Grid_home "CLUSTER_NODES={remaining_nodes_list}" CRS=TRUE -silent
For example:
$ ./runInstaller -updateNodeList ORACLE_HOME=Grid_home "CLUSTER_NODES={rac1,rac3……}" CRS=TRUE -silent
The actual operation:
/u01/app/11.2.0/grid/oui/bin/runInstaller -updateNodeList ORACLE_HOME=$ORACLE_HOME "CLUSTER_NODES={rac1}" CRS=TRUE -silent
(In the grid user's environment, Grid_home is referred to as $ORACLE_HOME.)
3.3 CVU verification
$ cluvfy stage -post nodedel -n node_list [-verbose]
Normal-deletion verification:
Abnormal-deletion verification:
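As a concrete sketch (run as the grid user from a remaining node, using this test's node names), the two verifications would look like:
cluvfy stage -post nodedel -n racdg2 -verbose
cluvfy stage -post nodedel -n rac2 -verbose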
Afterwards you can verify on your own that the remaining cluster resources and instance status are normal.
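For example (illustrative commands using this test's database names), the remaining resources and instances can be checked with:
/u01/app/11.2.0/grid/bin/crsctl status res -t
srvctl status database -d ogg    # normal experiment
srvctl status database -d orcl   # abnormal experiment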
3.4 Follow-up notes
If the downed server is later repaired, its cluster resources (including the +ASM2 instance) can start normally, and you want to remove it completely, refer to the normal-case GI deinstall procedure above.
If the downed server is later repaired and you want to add the previously deleted instance back, add the VIP first and then add the instance (via the GUI or silently); note that the downed server never had its Oracle and GI software removed.