Pacemaker is part of the Red Hat High Availability Add-on. The easiest way to try it out on RHEL is to install it from the Scientific Linux or CentOS repositories.
Environment preparation
Two nodes
Note: changing the hostname on CentOS
Temporary change: hostname <new-hostname> (takes effect immediately, does not survive a reboot)
Permanent change: hostnamectl set-hostname <new-hostname> (takes effect immediately and persists across reboots)
node1 - 192.168.29.246
node2 - 192.168.29.247
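The steps below address the nodes by name, so node1 and node2 must resolve on both hosts. A minimal sketch of that setup, assuming the hostnames and IPs above (the /etc/hosts entries are my addition, not part of the original steps):

# Assumed: run on the first node (use node2 on the other host)
hostnamectl set-hostname node1
# Assumed: make both names resolvable, run on both nodes
cat >> /etc/hosts <<'EOF'
192.168.29.246  node1
192.168.29.247  node2
EOF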
System information
CentOS Linux release 7.8.2003 (Core)
Installation
On all nodes, use yum to install Pacemaker and the other packages we will need:
yum install pacemaker pcs resource-agents
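Optionally, confirm the packages landed before continuing (a quick check of my own, not part of the original steps):

rpm -q pacemaker pcs resource-agents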
Creating the cluster
On all nodes, start the pcs daemon and enable it at boot:
systemctl start pcsd.service
systemctl enable pcsd.service
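If you want to double-check that pcsd is up on each node (an optional verification, assuming systemd):

systemctl is-active pcsd.service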
Set up the authentication that pcs requires:
# Run on all nodes
echo 123456 | passwd --stdin hacluster
# Run on the primary node only
pcs cluster auth node1 node2 -u hacluster -p 123456 --force
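On success, pcs reports each node as authorized; the output should look roughly like this:

node1: Authorized
node2: Authorized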
Now create the cluster:
pcs cluster setup --force --name pacemaker1 node1 node2
The output looks like this:
[root@node1 ~]# pcs cluster setup --force --name pacemaker1 node1 node2
Destroying cluster on nodes: node1, node2...
node1: Stopping Cluster (pacemaker)...
node2: Stopping Cluster (pacemaker)...
node1: Successfully destroyed cluster
node2: Successfully destroyed cluster
Sending 'pacemaker_remote authkey' to 'node1', 'node2'
node1: successful distribution of the file 'pacemaker_remote authkey'
node2: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
node1: Succeeded
node2: Succeeded
Synchronizing pcsd certificates on nodes node1, node2...
node1: Success
node2: Success
Restarting pcsd on the nodes in order to reload the certificates...
node1: Success
node2: Success
Starting the cluster
Run on either node:
pcs cluster start --all
Startup output:
[root@node1 ~]# pcs cluster start --all
node1: Starting Cluster (corosync)...
node2: Starting Cluster (corosync)...
node1: Starting Cluster (pacemaker)...
node2: Starting Cluster (pacemaker)...
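To confirm that corosync has formed a two-node membership, an optional check:

pcs status corosync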
Cluster settings
Disable fencing (STONITH). This is acceptable for a test setup like this one; a production cluster should keep fencing enabled:
pcs property set stonith-enabled=false
Because there are only two nodes, quorum is not meaningful, so we tell the cluster to ignore it:
pcs property set no-quorum-policy=ignore
Force the cluster to move a service to the other node after a single failure:
pcs resource defaults migration-threshold=1
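You can confirm the three settings above took effect; pcs prints the configured cluster properties and resource defaults (exact output will vary):

pcs property list
pcs resource defaults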
Adding a resource
pcs resource create my_first_svc ocf:heartbeat:Dummy op monitor interval=60s
my_first_svc: the name of the resource
ocf:heartbeat:Dummy: the resource agent to use (Dummy is an agent that serves as a template and is useful for guides like this)
op monitor interval=60s: tells Pacemaker to check the health of this service every minute by calling the agent's monitor action
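To inspect the resource just created (pcs 0.9 syntax, as shipped with CentOS 7):

pcs resource show my_first_svc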
Check the cluster status
[root@node1 ~]# pcs status
Cluster name: pacemaker1
Stack: corosync
Current DC: node1 (version 1.1.21-4.el7-f14e36fd43) - partition with quorum
Last updated: Sat Jun 6 14:57:51 2020
Last change: Sat Jun 6 14:57:25 2020 by root via cibadmin on node1

2 nodes configured
1 resource configured

Online: [ node1 node2 ]

Full list of resources:

 my_first_svc  (ocf::heartbeat:Dummy):  Started node1

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
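Note the Daemon Status section: corosync and pacemaker are active/disabled, meaning the cluster is running now but will not start automatically at boot. If you want it to, enable it on all nodes:

pcs cluster enable --all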
[root@node1 ~]# crm_mon -1
Stack: corosync
Current DC: node1 (version 1.1.21-4.el7-f14e36fd43) - partition with quorum
Last updated: Sat Jun 6 14:58:46 2020
Last change: Sat Jun 6 14:57:25 2020 by root via cibadmin on node1

2 nodes configured
1 resource configured

Online: [ node1 node2 ]

Active resources:

 my_first_svc  (ocf::heartbeat:Dummy):  Started node1
Failover verification
Manually stop the service to simulate a failure:
crm_resource --resource my_first_svc --force-stop
Check the status again after one minute; the service has failed over to node2:
[root@node1 ~]# crm_mon -1
Stack: corosync
Current DC: node1 (version 1.1.21-4.el7-f14e36fd43) - partition with quorum
Last updated: Sat Jun 6 15:29:55 2020
Last change: Sat Jun 6 14:57:25 2020 by root via cibadmin on node1

2 nodes configured
1 resource configured

Online: [ node1 node2 ]

Active resources:

 my_first_svc  (ocf::heartbeat:Dummy):  Started node2

Failed Resource Actions:
* my_first_svc_monitor_60000 on node1 'not running' (7): call=7, status=complete, exitreason='No process state file found',
    last-rc-change='Sat Jun 6 15:29:26 2020', queued=0ms, exec=0ms
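Once the failover is verified, you can clear the recorded failure on node1 so it no longer appears in the status output:

pcs resource cleanup my_first_svc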