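1. What is ONS
ONS (Oracle Notification Service) is the foundation on which Oracle Clusterware implements the FAN event push model.
In the traditional model, clients have to poll the server periodically to determine its state, which is essentially a PULL model. Oracle 10g introduced a new PUSH mechanism, FAN (Fast Application Notification): when certain events occur on the server side, the server proactively notifies clients of the change, so clients learn about server-side changes as early as possible. This mechanism relies on ONS.
ONS is usually managed with the onsctl command. Before onsctl can be used, the ONS service has to be configured.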
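To make the PULL versus PUSH distinction concrete, here is a minimal sketch in Python. It is only a toy model of the two styles: the Server class, its method names and the event dictionary are invented for illustration and have nothing to do with the real ONS wire protocol or any Oracle API.

```python
from typing import Callable, Dict, List

Event = Dict[str, str]


class Server:
    """Toy stand-in for a notifying server (hypothetical, not ONS itself)."""

    def __init__(self) -> None:
        self.status = "UP"
        self._subscribers: List[Callable[[Event], None]] = []

    # PULL: the client has to ask, and only learns the state when it asks.
    def get_status(self) -> str:
        return self.status

    # PUSH: the client registers once and is told about every change.
    def subscribe(self, callback: Callable[[Event], None]) -> None:
        self._subscribers.append(callback)

    def set_status(self, status: str) -> None:
        self.status = status
        event = {"type": "status_change", "status": status}
        for callback in self._subscribers:
            callback(event)


server = Server()
received: List[Event] = []
server.subscribe(received.append)

server.set_status("DOWN")     # the subscriber is notified right away
polled = server.get_status()  # a polling client sees it only on its next poll
```

In the push style the delay between the change and the client knowing about it does not depend on a polling interval, which is the point of FAN.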
2. ONS configuration
Note that in a RAC environment the ONS under $CRS_HOME is used, not the one under $ORACLE_HOME.
The configuration file is $CRS_HOME/opmn/conf/ons.config.
[root@rac3 conf]# pwd
/opt/ora10g/product/10.2.0/crs_1/opmn/conf
[root@rac3 conf]# ls
ons.config
[root@rac3 conf]# cat ons.config
localport=6100
remoteport=6200
loglevel=3
useocr=on
The parameters in this file are as follows:
<1>localport: the local listening port. "Local" here means specifically the loopback address 127.0.0.1; this port is used to communicate with clients running on the same host.
<2>remoteport: the remote listening port, that is, the port on every local IP address other than 127.0.0.1; it is used to communicate with remote clients.
<3>loglevel: Oracle can trace the ONS process and write a log to a local file. This parameter sets the level of detail that the ONS process logs, from 1 to 9; the default is 3.
<4>logfile: used together with loglevel, this sets the location of the ONS log file; the default is $CRS_HOME/opmn/logs/opmn.log.
<5>nodes and useocr: together these two parameters determine which nodes' ONS daemons the local ONS daemon communicates with.
Of these parameters, localport and remoteport are mandatory. The netstat command shows how the two ports are used:
[root@rac3 bin]# netstat -ano|grep 6100
tcp   0   0 127.0.0.1:6100        0.0.0.0:*            LISTEN       off (0.00/0/0)
tcp   0   0 127.0.0.1:6100        127.0.0.1:32852      ESTABLISHED  off (0.00/0/0)
tcp   0   0 127.0.0.1:32840       127.0.0.1:6100       ESTABLISHED  keepalive (7063.32/0/0)
tcp   0   0 127.0.0.1:32852       127.0.0.1:6100       ESTABLISHED  keepalive (7188.42/0/0)
tcp   0   0 127.0.0.1:6100        127.0.0.1:32840      ESTABLISHED  off (0.00/0/0)
udp   0   0 192.168.2.103:61008   0.0.0.0:*                         off (0.00/0/0)
[root@rac3 bin]# netstat -ano|grep 6200
tcp   0   0 0.0.0.0:6200          0.0.0.0:*            LISTEN       off (0.00/0/0)
tcp   0   0 192.168.1.103:32836   192.168.1.104:6200   ESTABLISHED  off (0.00/0/0)
Comparing the two outputs, Oracle listens on port 6100 on 127.0.0.1 and on port 6200 on 0.0.0.0 (that is, on all local addresses), which matches the settings in /opt/ora10g/product/10.2.0/crs_1/opmn/conf/ons.config.
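The file format shown above is simple enough to check with a few lines of Python. The sketch below reads key=value lines and verifies that the two mandatory parameters are present. The function name parse_ons_config is made up for this example, and skipping blank lines and lines starting with '#' is an assumption of the sketch, not documented ONS behaviour.

```python
from typing import Dict

# The two parameters the text above describes as mandatory.
REQUIRED = ("localport", "remoteport")


def parse_ons_config(text: str) -> Dict[str, str]:
    """Parse key=value lines into a dict and check the required keys.

    Blank lines and '#' lines are skipped (an assumption of this sketch).
    """
    settings: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key=value line: {raw!r}")
        settings[key.strip().lower()] = value.strip()
    missing = [k for k in REQUIRED if k not in settings]
    if missing:
        raise ValueError("missing required parameter(s): " + ", ".join(missing))
    return settings


sample = """localport=6100
remoteport=6200
loglevel=3
useocr=on
"""
config = parse_ons_config(sample)
```

With the sample file from this section, config["localport"] is "6100" and config["remoteport"] is "6200", the same two ports that netstat shows in LISTEN state.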
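The useocr parameter also deserves attention. It takes the value ON or OFF. If useocr is ON, the information about the remote nodes that ONS communicates with is stored in the OCR; if it is OFF, that information is taken from the nodes parameter.
Format of the nodes parameter: hostname/ip:port[,hostname/ip:port]
For example: nodes=dbs:6200,dbp:6200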
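As a rough illustration of the nodes format and of the useocr switch, here is a small Python sketch. Both function names are hypothetical, and treating a missing useocr entry as OFF is an assumption made only so the example has a defined result.

```python
from typing import Dict, List, Tuple


def parse_nodes(value: str) -> List[Tuple[str, int]]:
    """Split 'host:port[,host:port]' into (host, port) pairs."""
    pairs: List[Tuple[str, int]] = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        host, sep, port = item.rpartition(":")
        if not sep or not host:
            raise ValueError(f"expected host:port, got {item!r}")
        pairs.append((host, int(port)))
    return pairs


def remote_node_source(settings: Dict[str, str]) -> str:
    """Return 'ocr' when useocr is on, otherwise 'nodes'.

    A missing useocr key is treated as off (assumption of this sketch).
    """
    return "ocr" if settings.get("useocr", "off").lower() == "on" else "nodes"


peers = parse_nodes("dbs:6200,dbp:6200")
```

For the example value above, peers is the list [("dbs", 6200), ("dbp", 6200)].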
When useocr is ON, this information is stored under the OCR key DATABASE.ONS_HOSTS.
We can export this key:
[root@rac3 bin]# ./ocrdump -xml /home/oracle/ons_info.xml -keyname DATABASE.ONS_HOSTS
[root@rac3 bin]# cat /home/oracle/ons_info.xml
<OCRDUMP>

<TIMESTAMP>01/28/2015 10:46:35</TIMESTAMP>
<COMMAND>./ocrdump.bin -xml /home/oracle/ons_info.xml -keyname DATABASE.ONS_HOSTS </COMMAND>

<KEY>
<NAME>DATABASE.ONS_HOSTS</NAME>
<VALUE_TYPE>UNDEF</VALUE_TYPE>
<VALUE><![CDATA[]]></VALUE>
<USER_PERMISSION>PROCR_ALL_ACCESS</USER_PERMISSION>
<GROUP_PERMISSION>PROCR_READ</GROUP_PERMISSION>
<OTHER_PERMISSION>PROCR_READ</OTHER_PERMISSION>
<USER_NAME>oracle</USER_NAME>
<GROUP_NAME>oinstall</GROUP_NAME>

  <KEY>
  <NAME>DATABASE.ONS_HOSTS.rac3</NAME>    --node
  <VALUE_TYPE>ORATEXT</VALUE_TYPE>
  <VALUE><![CDATA[rac3]]></VALUE>
  <USER_PERMISSION>PROCR_ALL_ACCESS</USER_PERMISSION>
  <GROUP_PERMISSION>PROCR_READ</GROUP_PERMISSION>
  <OTHER_PERMISSION>PROCR_READ</OTHER_PERMISSION>
  <USER_NAME>oracle</USER_NAME>
  <GROUP_NAME>oinstall</GROUP_NAME>

    <KEY>
    <NAME>DATABASE.ONS_HOSTS.rac3.PORT</NAME>    --port for this node
    <VALUE_TYPE>ORATEXT</VALUE_TYPE>
    <VALUE><![CDATA[6200]]></VALUE>
    <USER_PERMISSION>PROCR_ALL_ACCESS</USER_PERMISSION>
    <GROUP_PERMISSION>PROCR_READ</GROUP_PERMISSION>
    <OTHER_PERMISSION>PROCR_READ</OTHER_PERMISSION>
    <USER_NAME>oracle</USER_NAME>
    <GROUP_NAME>oinstall</GROUP_NAME>
    </KEY>

  </KEY>

  <KEY>
  <NAME>DATABASE.ONS_HOSTS.rac4</NAME>    --node
  <VALUE_TYPE>ORATEXT</VALUE_TYPE>
  <VALUE><![CDATA[rac4]]></VALUE>
  <USER_PERMISSION>PROCR_ALL_ACCESS</USER_PERMISSION>
  <GROUP_PERMISSION>PROCR_READ</GROUP_PERMISSION>
  <OTHER_PERMISSION>PROCR_READ</OTHER_PERMISSION>
  <USER_NAME>oracle</USER_NAME>
  <GROUP_NAME>oinstall</GROUP_NAME>

    <KEY>
    <NAME>DATABASE.ONS_HOSTS.rac4.PORT</NAME>    --port
    <VALUE_TYPE>ORATEXT</VALUE_TYPE>
    <VALUE><![CDATA[6200]]></VALUE>
    <USER_PERMISSION>PROCR_ALL_ACCESS</USER_PERMISSION>
    <GROUP_PERMISSION>PROCR_READ</GROUP_PERMISSION>
    <OTHER_PERMISSION>PROCR_READ</OTHER_PERMISSION>
    <USER_NAME>oracle</USER_NAME>
    <GROUP_NAME>oinstall</GROUP_NAME>
    </KEY>

  </KEY>

</KEY>

</OCRDUMP>
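Reading host and port pairs out of a dump like this by eye gets tedious, so here is a hedged Python sketch that extracts them with the standard library. It assumes the dump file is well-formed XML and does not contain the "--" annotations added in this article. The function name ons_hosts_from_dump is invented for the example, and the sample string is a shortened version of the dump above.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

PREFIX = "DATABASE.ONS_HOSTS."
SUFFIX = ".PORT"


def ons_hosts_from_dump(xml_text: str) -> Dict[str, List[int]]:
    """Map each host to the ports stored under DATABASE.ONS_HOSTS.<host>.PORT."""
    root = ET.fromstring(xml_text)
    hosts: Dict[str, List[int]] = {}
    for key in root.iter("KEY"):
        # findtext only looks at direct children, so nested keys do not interfere.
        name = key.findtext("NAME", default="")
        if name.startswith(PREFIX) and name.endswith(SUFFIX):
            host = name[len(PREFIX):-len(SUFFIX)]
            value = key.findtext("VALUE", default="")
            # The value is a space-separated list of ports, e.g. "6200 6300".
            hosts[host] = [int(p) for p in value.split()]
    return hosts


sample = """<OCRDUMP>
<KEY><NAME>DATABASE.ONS_HOSTS</NAME><VALUE><![CDATA[]]></VALUE>
  <KEY><NAME>DATABASE.ONS_HOSTS.rac3</NAME><VALUE><![CDATA[rac3]]></VALUE>
    <KEY><NAME>DATABASE.ONS_HOSTS.rac3.PORT</NAME><VALUE><![CDATA[6200]]></VALUE></KEY>
  </KEY>
  <KEY><NAME>DATABASE.ONS_HOSTS.rac4</NAME><VALUE><![CDATA[rac4]]></VALUE>
    <KEY><NAME>DATABASE.ONS_HOSTS.rac4.PORT</NAME><VALUE><![CDATA[6200]]></VALUE></KEY>
  </KEY>
</KEY>
</OCRDUMP>"""
hosts = ons_hosts_from_dump(sample)
```

For the shortened sample, hosts maps rac3 and rac4 each to the single port 6200.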
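3. Configuring ONS
When useocr=OFF, ONS can be configured by editing its configuration file directly. When the ONS node communication settings are stored in the OCR (useocr=ON), use the racgons command as root instead.
Note: racgons must be run as root. If it is run as the oracle user, it reports no error, but it does not change any configuration either.
---Add configuration: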
[root@rac3 bin]# ./racgons add_config rac3:6300 rac4:6300
[root@rac3 bin]# ./ocrdump -xml /home/oracle/ons_info2.xml -keyname DATABASE.ONS_HOSTS
[root@rac3 bin]# cat /home/oracle/ons_info2.xml
<OCRDUMP>

<TIMESTAMP>01/28/2015 10:56:30</TIMESTAMP>
<COMMAND>./ocrdump.bin -xml /home/oracle/ons_info2.xml -keyname DATABASE.ONS_HOSTS </COMMAND>

<KEY>
<NAME>DATABASE.ONS_HOSTS</NAME>
<VALUE_TYPE>UNDEF</VALUE_TYPE>
<VALUE><![CDATA[]]></VALUE>
<USER_PERMISSION>PROCR_ALL_ACCESS</USER_PERMISSION>
<GROUP_PERMISSION>PROCR_READ</GROUP_PERMISSION>
<OTHER_PERMISSION>PROCR_READ</OTHER_PERMISSION>
<USER_NAME>oracle</USER_NAME>
<GROUP_NAME>oinstall</GROUP_NAME>

  <KEY>
  <NAME>DATABASE.ONS_HOSTS.rac3</NAME>
  <VALUE_TYPE>ORATEXT</VALUE_TYPE>
  <VALUE><![CDATA[rac3]]></VALUE>
  <USER_PERMISSION>PROCR_ALL_ACCESS</USER_PERMISSION>
  <GROUP_PERMISSION>PROCR_READ</GROUP_PERMISSION>
  <OTHER_PERMISSION>PROCR_READ</OTHER_PERMISSION>
  <USER_NAME>oracle</USER_NAME>
  <GROUP_NAME>oinstall</GROUP_NAME>

    <KEY>
    <NAME>DATABASE.ONS_HOSTS.rac3.PORT</NAME>
    <VALUE_TYPE>ORATEXT</VALUE_TYPE>
    <VALUE><![CDATA[6200 6300]]></VALUE>    --port 6300 has been added
    <USER_PERMISSION>PROCR_ALL_ACCESS</USER_PERMISSION>
    <GROUP_PERMISSION>PROCR_READ</GROUP_PERMISSION>
    <OTHER_PERMISSION>PROCR_READ</OTHER_PERMISSION>
    <USER_NAME>oracle</USER_NAME>
    <GROUP_NAME>oinstall</GROUP_NAME>
    </KEY>

  </KEY>

  <KEY>
  <NAME>DATABASE.ONS_HOSTS.rac4</NAME>
  <VALUE_TYPE>ORATEXT</VALUE_TYPE>
  <VALUE><![CDATA[rac4]]></VALUE>
  <USER_PERMISSION>PROCR_ALL_ACCESS</USER_PERMISSION>
  <GROUP_PERMISSION>PROCR_READ</GROUP_PERMISSION>
  <OTHER_PERMISSION>PROCR_READ</OTHER_PERMISSION>
  <USER_NAME>oracle</USER_NAME>
  <GROUP_NAME>oinstall</GROUP_NAME>

    <KEY>
    <NAME>DATABASE.ONS_HOSTS.rac4.PORT</NAME>
    <VALUE_TYPE>ORATEXT</VALUE_TYPE>
    <VALUE><![CDATA[6200 6300]]></VALUE>    --port 6300 has been added
    <USER_PERMISSION>PROCR_ALL_ACCESS</USER_PERMISSION>
    <GROUP_PERMISSION>PROCR_READ</GROUP_PERMISSION>
    <OTHER_PERMISSION>PROCR_READ</OTHER_PERMISSION>
    <USER_NAME>oracle</USER_NAME>
    <GROUP_NAME>oinstall</GROUP_NAME>
    </KEY>

  </KEY>

</KEY>

</OCRDUMP>
----Remove configuration:
[root@rac3 bin]# ./racgons remove_config rac3:6300 rac4:6300
racgons: Existing key value on rac3 = 6200 6300.
racgons: rac3:6300 removed from OCR.
racgons: Existing key value on rac4 = 6200 6300.
racgons: rac4:6300 removed from OCR.
[root@rac3 bin]# ./ocrdump -xml /home/oracle/ons_info3.xml -keyname DATABASE.ONS_HOSTS
[root@rac3 bin]# cat /home/oracle/ons_info3.xml
<OCRDUMP>

<TIMESTAMP>01/28/2015 11:01:13</TIMESTAMP>
<COMMAND>./ocrdump.bin -xml /home/oracle/ons_info3.xml -keyname DATABASE.ONS_HOSTS </COMMAND>

<KEY>
<NAME>DATABASE.ONS_HOSTS</NAME>
<VALUE_TYPE>UNDEF</VALUE_TYPE>
<VALUE><![CDATA[]]></VALUE>
<USER_PERMISSION>PROCR_ALL_ACCESS</USER_PERMISSION>
<GROUP_PERMISSION>PROCR_READ</GROUP_PERMISSION>
<OTHER_PERMISSION>PROCR_READ</OTHER_PERMISSION>
<USER_NAME>oracle</USER_NAME>
<GROUP_NAME>oinstall</GROUP_NAME>

  <KEY>
  <NAME>DATABASE.ONS_HOSTS.rac3</NAME>
  <VALUE_TYPE>ORATEXT</VALUE_TYPE>
  <VALUE><![CDATA[rac3]]></VALUE>
  <USER_PERMISSION>PROCR_ALL_ACCESS</USER_PERMISSION>
  <GROUP_PERMISSION>PROCR_READ</GROUP_PERMISSION>
  <OTHER_PERMISSION>PROCR_READ</OTHER_PERMISSION>
  <USER_NAME>oracle</USER_NAME>
  <GROUP_NAME>oinstall</GROUP_NAME>

    <KEY>
    <NAME>DATABASE.ONS_HOSTS.rac3.PORT</NAME>
    <VALUE_TYPE>ORATEXT</VALUE_TYPE>
    <VALUE><![CDATA[6200 ]]></VALUE>    --port 6300 has been removed
    <USER_PERMISSION>PROCR_ALL_ACCESS</USER_PERMISSION>
    <GROUP_PERMISSION>PROCR_READ</GROUP_PERMISSION>
    <OTHER_PERMISSION>PROCR_READ</OTHER_PERMISSION>
    <USER_NAME>oracle</USER_NAME>
    <GROUP_NAME>oinstall</GROUP_NAME>
    </KEY>

  </KEY>

  <KEY>
  <NAME>DATABASE.ONS_HOSTS.rac4</NAME>
  <VALUE_TYPE>ORATEXT</VALUE_TYPE>
  <VALUE><![CDATA[rac4]]></VALUE>
  <USER_PERMISSION>PROCR_ALL_ACCESS</USER_PERMISSION>
  <GROUP_PERMISSION>PROCR_READ</GROUP_PERMISSION>
  <OTHER_PERMISSION>PROCR_READ</OTHER_PERMISSION>
  <USER_NAME>oracle</USER_NAME>
  <GROUP_NAME>oinstall</GROUP_NAME>

    <KEY>
    <NAME>DATABASE.ONS_HOSTS.rac4.PORT</NAME>
    <VALUE_TYPE>ORATEXT</VALUE_TYPE>
    <VALUE><![CDATA[6200 ]]></VALUE>    --port 6300 has been removed
    <USER_PERMISSION>PROCR_ALL_ACCESS</USER_PERMISSION>
    <GROUP_PERMISSION>PROCR_READ</GROUP_PERMISSION>
    <OTHER_PERMISSION>PROCR_READ</OTHER_PERMISSION>
    <USER_NAME>oracle</USER_NAME>
    <GROUP_NAME>oinstall</GROUP_NAME>
    </KEY>

  </KEY>

</KEY>

</OCRDUMP>
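The dumps before and after suggest that each PORT key holds a space-separated list of ports that add_config appends to and remove_config deletes from. The Python sketch below models only that observed behaviour with a plain dictionary. It is not how racgons is implemented, and ignoring a port that is already present is an assumption of the sketch.

```python
from typing import Dict, List


def add_config(ocr: Dict[str, List[int]], host: str, port: int) -> None:
    """Append the port to the host's list unless it is already there (assumed)."""
    ports = ocr.setdefault(host, [])
    if port not in ports:
        ports.append(port)


def remove_config(ocr: Dict[str, List[int]], host: str, port: int) -> bool:
    """Remove the port from the host's list; return False if it was not present."""
    ports = ocr.get(host, [])
    if port in ports:
        ports.remove(port)
        return True
    return False


# Start from the state shown in the first dump.
ocr = {"rac3": [6200], "rac4": [6200]}

for node in ("rac3", "rac4"):
    add_config(ocr, node, 6300)
after_add = {host: list(ports) for host, ports in ocr.items()}

for node in ("rac3", "rac4"):
    remove_config(ocr, node, 6300)
after_remove = {host: list(ports) for host, ports in ocr.items()}
```

Here after_add mirrors the "6200 6300" values in the second dump, and after_remove is back to a single 6200 per node, as in the third dump.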
4. The onsctl command
The onsctl command can start, stop and debug ONS, and reload its configuration file. Its usage is as follows:
[root@rac3 bin]# ./onsctl -help
usage: ./onsctl start|stop|ping|reconfig|debug

start    - Start opmn only.
stop     - Stop ons daemon
ping     - Test to see if ons daemon is running
debug    - Display debug information for the ons daemon
reconfig - Reload the ons configuration
help     - Print a short syntax description (this).
detailed - Print a verbose syntax description.
Note: a running ONS process does not necessarily mean that ONS is working properly; use the ping command to confirm.
<1>Check the process status at the OS level
[root@rac3 bin]# ps -ef|grep ons |grep -v grep
oracle   27813     1  0 10:31 ?        00:00:00 /opt/ora10g/product/10.2.0/crs_1/opmn/bin/ons -d
oracle   27814 27813  0 10:31 ?        00:00:00 /opt/ora10g/product/10.2.0/crs_1/opmn/bin/ons -d
The output shows that the ONS processes are running.
<2>Confirm the ONS service status
[root@rac3 bin]# ./onsctl ping
Number of onsconfiguration retrieved, numcfg = 2
onscfg[0] {node = rac3, port = 6200}
Adding remote host rac3:6200
onscfg[1] {node = rac4, port = 6200}
Adding remote host rac4:6200
ons is running ...
The output shows that the ONS service is responding normally.
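If this check is to be scripted, the status line at the end of the ping output is the part to look at. Below is a minimal Python sketch; the function names are made up, and it assumes the status line reads "ons is running ..." or "ons is not running ..." as in the transcripts in this article. The onsctl path passed to ping_ons is whatever applies on your system.

```python
import subprocess


def ons_is_running(ping_output: str) -> bool:
    """Return True if the output of 'onsctl ping' reports a running daemon."""
    lines = [line.strip() for line in ping_output.splitlines()]
    # "ons is not running ..." does not start with "ons is running".
    return any(line.startswith("ons is running") for line in lines)


def ping_ons(onsctl_path: str) -> bool:
    """Run '<onsctl_path> ping' and interpret its standard output."""
    result = subprocess.run(
        [onsctl_path, "ping"], capture_output=True, text=True, check=False
    )
    return ons_is_running(result.stdout)


running_sample = """Number of onsconfiguration retrieved, numcfg = 2
onscfg[0] {node = rac3, port = 6200}
Adding remote host rac3:6200
onscfg[1] {node = rac4, port = 6200}
Adding remote host rac4:6200
ons is running ...
"""
stopped_sample = running_sample.replace("ons is running", "ons is not running")
```

Keeping the text parsing separate from the subprocess call means the parsing can be checked against saved output on a machine without Oracle installed.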
<3>Stop the ons service
[root@rac3 bin]# ./onsctl stop
onsctl: shutting down ons daemon ...
Number of onsconfiguration retrieved, numcfg = 2
onscfg[0] {node = rac3, port = 6200}
Adding remote host rac3:6200
onscfg[1] {node = rac4, port = 6200}
Adding remote host rac4:6200
[root@rac3 bin]#
[root@rac3 bin]# ./onsctl ping
Number of onsconfiguration retrieved, numcfg = 2
onscfg[0] {node = rac3, port = 6200}
Adding remote host rac3:6200
onscfg[1] {node = rac4, port = 6200}
Adding remote host rac4:6200
ons is not running ...    ---this confirms that the stop succeeded
<4>Start the ons service
[root@rac3 bin]# ./onsctl start
Number of onsconfiguration retrieved, numcfg = 2
onscfg[0] {node = rac3, port = 6200}
Adding remote host rac3:6200
onscfg[1] {node = rac4, port = 6200}
Adding remote host rac4:6200
Number of onsconfiguration retrieved, numcfg = 2
onscfg[0] {node = rac3, port = 6200}
Adding remote host rac3:6200
onscfg[1] {node = rac4, port = 6200}
Adding remote host rac4:6200
onsctl: ons started    --started successfully
[root@rac3 bin]#
[root@rac3 bin]# ./onsctl ping
Number of onsconfiguration retrieved, numcfg = 2
onscfg[0] {node = rac3, port = 6200}
Adding remote host rac3:6200
onscfg[1] {node = rac4, port = 6200}
Adding remote host rac4:6200
ons is running ...    --this confirms that the start succeeded
<5>Use the debug option to view detailed information
[root@rac3 bin]# ./onsctl debug
Number of onsconfiguration retrieved, numcfg = 2
onscfg[0] {node = rac3, port = 6200}
Adding remote host rac3:6200
onscfg[1] {node = rac4, port = 6200}
Adding remote host rac4:6200
HTTP/1.1 200 OK
Content-Length: 1355
Content-Type: text/html
Response:

======== ONS ========

Listeners:

 NAME    BIND ADDRESS   PORT   FLAGS   SOCKET
------- --------------- ----- -------- ------
Local   127.000.000.001  6100 00000142      7
Remote  192.168.001.103  6200 00000101      8
Request     No listener

Server connections:    -----the most useful part of this command is that it lists all connections.

    ID           IP        PORT   FLAGS     SENDQ     WORKER   BUSY  SUBS
---------- --------------- ----- -------- ---------- -------- ------ -----
         1 192.168.001.104  6200 00010005          0               1     0

Client connections:

    ID           IP        PORT   FLAGS     SENDQ     WORKER   BUSY  SUBS
---------- --------------- ----- -------- ---------- -------- ------ -----

Pending connections:

    ID           IP        PORT   FLAGS     SENDQ     WORKER   BUSY  SUBS
---------- --------------- ----- -------- ---------- -------- ------ -----
         0 127.000.000.001  6100 00000812          0               1     0
         0 127.000.000.001  6100 00000812          0               1     0
         0 127.000.000.001  6100 00020812          0               1     0

Worker Ticket: 0/0, Idle: 360

   THREAD   FLAGS
  -------- --------
  f7f86ba0 00000012
  f6dd1ba0 00000012
  f63d0ba0 00000012

Resources:

  Notifications:
    Received: 0, in Receive Q: 0, Processed: 0, in Process Q: 0

  Pools:
    Message: 24/25 (1), Link: 25/25 (1), Subscription: 0/0 (0)
##===========================================================
Follow-up:
After the ONS configuration tests above, crs_stat -t showed that ons on one node of the cluster would not start:
[oracle@rac3 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....SM1.asm application    ONLINE    ONLINE    rac3
ora....C3.lsnr application    ONLINE    ONLINE    rac3
ora.rac3.gsd   application    ONLINE    ONLINE    rac3
ora.rac3.ons   application    ONLINE    OFFLINE
ora.rac3.vip   application    ONLINE    ONLINE    rac3
ora....SM2.asm application    ONLINE    ONLINE    rac4
ora....C4.lsnr application    ONLINE    ONLINE    rac4
ora.rac4.gsd   application    ONLINE    ONLINE    rac4
ora.rac4.ons   application    ONLINE    ONLINE    rac4
ora.rac4.vip   application    ONLINE    ONLINE    rac4
ora.racdb.db   application    ONLINE    ONLINE    rac4
ora....b1.inst application    ONLINE    ONLINE    rac3
ora....b2.inst application    ONLINE    ONLINE    rac4
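Spotting the one resource whose State differs from its Target is easy to automate. The following Python sketch parses text in the crs_stat -t layout shown above and lists the mismatches. The Resource tuple and both function names are invented for this example, and the parser assumes whitespace-separated columns with the Host column left empty for OFFLINE resources, as in this transcript.

```python
from typing import List, NamedTuple


class Resource(NamedTuple):
    name: str
    rtype: str
    target: str
    state: str
    host: str


def parse_crs_stat(text: str) -> List[Resource]:
    """Parse the rows of 'crs_stat -t' output into Resource tuples."""
    resources: List[Resource] = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with "ora." and have 4 fields (no host) or 5.
        if len(fields) in (4, 5) and fields[0].startswith("ora."):
            host = fields[4] if len(fields) == 5 else ""
            resources.append(Resource(fields[0], fields[1], fields[2], fields[3], host))
    return resources


def not_at_target(resources: List[Resource]) -> List[Resource]:
    """Return the resources whose State differs from their Target."""
    return [r for r in resources if r.state != r.target]


sample = """Name           Type           Target    State     Host
------------------------------------------------------------
ora.rac3.gsd   application    ONLINE    ONLINE    rac3
ora.rac3.ons   application    ONLINE    OFFLINE
ora.rac3.vip   application    ONLINE    ONLINE    rac3
ora.rac4.ons   application    ONLINE    ONLINE    rac4
"""
problems = not_at_target(parse_crs_stat(sample))
```

For this shortened sample, problems contains a single entry, ora.rac3.ons, with target ONLINE, state OFFLINE and an empty host.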
--Check the log
[oracle@rac3 racg]$ tail -f ora.rac3.ons.log
..........................................
RCV: Permission denied
Communication error with the OPMN server local port.
Check the OPMN log files
RCV: Permission denied
Communication error with the OPMN server loca
2015-01-28 13:34:25.867: [ RACG][2540408064] [29681][2540408064][ora.rac3.ons]: l port.
Check the OPMN log files
RCV: Permission denied    -----"Permission denied" is reported over and over
Communication error with the OPMN server local port.
Check the OPMN log files
Number of onsconfiguration retrieved, numcfg = 2
onscfg[0] {node = rac3, port = 6200}
Adding remote host rac3:6200
o
2015-01-28 13:34:25.867: [ RACG][2540408064] [29681][2540408064][ora.rac3.ons]: nscfg[1] {node = rac4, port = 6200}
Adding remote host rac4:6200
onsctl: ons failed to start    --so ons fails to start, yet onsctl ping shows ons as running
2015-01-28 13:34:26.077: [ RACG][2540408064] [29681][2540408064][ora.rac3.ons]: RCV: Permission denied
Communication error with the OPMN server local port.
Check the OPMN log files
--However, the ons service was confirmed to have been started
[root@rac3 bin]# ./onsctl ping
Number of onsconfiguration retrieved, numcfg = 2
onscfg[0] {node = rac3, port = 6200}
Adding remote host rac3:6200
onscfg[1] {node = rac4, port = 6
2015-01-28 13:34:26.077: [ RACG][2540408064] [29681][2540408064][ora.rac3.ons]: 200}
Adding remote host rac4:6200
ons is not running ...
Running ./onsctl stop and then ./onsctl start again also stops and starts ONS normally, but the log still shows only startup failures.
--When the resource is started on its own
[oracle@rac3 ~]$ crs_start ora.rac3.ons
Attempting to start `ora.rac1.ons` on member `rac3`
Start of `ora.rac3.ons` on member `rac3` failed.
rac4 : CRS-1019: Resource ora.rac3.ons (application) cannot run on rac4
Checking the permissions on the ons configuration turned up nothing wrong. After rebooting the virtual machines, ons started normally on both nodes and the problem was resolved.
My current suspicion is that either a permission problem went unnoticed, or the ons process had hung: a new process could be started, but the log kept showing the error messages.
(In general, temporarily stopping and starting the ons resource has little impact on the system, because this resource is mainly related to load balancing and failover.)
[oracle@rac3 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....SM1.asm application    ONLINE    ONLINE    rac3
ora....C3.lsnr application    ONLINE    ONLINE    rac3
ora.rac3.gsd   application    ONLINE    ONLINE    rac3
ora.rac3.ons   application    ONLINE    ONLINE    rac3
ora.rac3.vip   application    ONLINE    ONLINE    rac3
ora....SM2.asm application    ONLINE    ONLINE    rac4
ora....C4.lsnr application    ONLINE    ONLINE    rac4
ora.rac4.gsd   application    ONLINE    ONLINE    rac4
ora.rac4.ons   application    ONLINE    ONLINE    rac4
ora.rac4.vip   application    ONLINE    ONLINE    rac4
ora.racdb.db   application    ONLINE    ONLINE    rac4
ora....b1.inst application    ONLINE    ONLINE    rac3
ora....b2.inst application    ONLINE    ONLINE    rac4
A similar problem is discussed in this itpub thread: http://www.itpub.net/thread-1283253-1-1.html
ps -ef|grep ons
Acknowledgements: this document draws on Zhang Xiaoming's book <<大話Oracle RAC>>.