1. Environment
OS: CentOS release 6.4 (Final)
DB: PostgreSQL 9.3.6
pgpool server: pgpool 172.16.0.240
Primary database server: master 172.16.0.241
Standby database server: slave 172.16.0.242
The master and slave databases use streaming replication, which is already configured; the new pgpool instance is managed by the postgres user. The new architecture is master/slave mode on top of streaming replication. It supports streaming replication, load balancing and failover, but not replication mode or parallel query. The primary handles both reads and writes, while the standby is read-only.
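Before putting pgpool in front of the databases, it helps to confirm on the master that the standby is actually streaming. A minimal check, assuming a local superuser connection on the master:

[root@master ~]# psql -U postgres -c "select client_addr, state, sync_state from pg_stat_replication;"

client_addr should show 172.16.0.242 and state should be streaming; if the query returns no rows, fix replication before continuing.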
2. Installing pgpool
[root@pgpool opt]# yum install -y http://yum.postgresql.org/9.3/redhat/rhel-6-x86_64/pgdg-redhat93-9.3-1.noarch.rpm
[root@pgpool opt]# yum install postgresql93-devel
[root@pgpool opt]# wget http://www.pgpool.net/download.php?f=pgpool-II-3.4.2.tar.gz
[root@pgpool opt]# mv download.php\?f\=pgpool-II-3.4.2.tar.gz pgpool-II-3.4.2.tar.gz
[root@pgpool opt]# tar -zxvf pgpool-II-3.4.2.tar.gz
[root@pgpool opt]# cd pgpool-II-3.4.2
[root@pgpool pgpool-II-3.4.2]# ./configure --with-pgsql=/usr/pgsql-9.3
Without --with-pgsql=/usr/pgsql-9.3, configure may fail with the error "configure: error: libpq is not installed or libpq is old".
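If you are unsure which path to pass to --with-pgsql, pg_config from the postgresql93-devel package reports where the headers and libraries were installed; the paths below are the usual defaults for the PGDG 9.3 packages and are shown only as a hint:

[root@pgpool pgpool-II-3.4.2]# /usr/pgsql-9.3/bin/pg_config --includedir --libdir
/usr/pgsql-9.3/include
/usr/pgsql-9.3/lib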
[root@pgpool pgpool-II-3.4.2]# make
[root@pgpool pgpool-II-3.4.2]# make install
3. Configuring pgpool
Enable and edit the configuration file pgpool.conf.
[root@pgpool etc]# cp /usr/local/etc/pgpool.conf.sample-stream /usr/local/etc/pgpool.conf
Edit the following settings in pgpool.conf:
# - pgpool Communication Manager Connection Settings -
listen_addresses = '*'

# - Backend Connection Settings -
---Node 0 is the primary by default, the others are standbys; backend_weight controls how read queries are distributed between the two databases.
backend_hostname0 = '172.16.0.241'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/var/lib/pgsql/9.3/data'
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_hostname1 = '172.16.0.242'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/data1'
backend_flag1 = 'ALLOW_TO_FAILOVER'

# - Authentication -
---Enable pgpool's hba authentication.
enable_pool_hba = on

#------------------------------------------------------------------------------
# MASTER/SLAVE MODE
#------------------------------------------------------------------------------
sr_check_user = 'postgres'
sr_check_password = '123456'

#------------------------------------------------------------------------------
# HEALTH CHECK
#------------------------------------------------------------------------------
health_check_period = 1
health_check_user = 'postgres'
health_check_password = '123456'

#------------------------------------------------------------------------------
# FAILOVER AND FAILBACK
#------------------------------------------------------------------------------
---Used to promote the read-only standby to primary when the primary fails.
failover_command = '/usr/local/bin/failover_stream.sh %d %H /tmp/trigger_file0'
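As an illustration of how backend_weight works: pgpool normalizes the weights, so only their ratio matters for read distribution. A hypothetical setting (not the values used in this setup) that sends roughly two thirds of read queries to the standby would look like this:

backend_weight0 = 1    # master receives about 1/3 of read queries
backend_weight1 = 2    # slave receives about 2/3 of read queries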
Enable and edit the configuration file pcp.conf.
[root@pgpool etc]# cp /usr/local/etc/pcp.conf.sample /usr/local/etc/pcp.conf
This file controls the PCP management interface and can be left unconfigured for now.
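If you do want to use the PCP management commands later, pcp.conf takes one username:md5-of-password entry per line. A sketch using the same 123456 password as the rest of this setup (pg_md5 with a plain argument simply prints the MD5 of that string):

[root@pgpool etc]# pg_md5 123456
e10adc3949ba59abbe56e057f20f883e
---then add a line like this to /usr/local/etc/pcp.conf:
postgres:e10adc3949ba59abbe56e057f20f883e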
Enable and edit the configuration file pool_hba.conf.
[root@pgpool etc]# cp /usr/local/etc/pool_hba.conf.sample /usr/local/etc/pool_hba.conf
---Add this line:
host    all    all    172.16.0.0/24    md5
---Remove this line:
host    all    all    127.0.0.1/32     trust
The access policy for pgpool itself must use md5 authentication.
Create the pool_passwd file.
[root@pgpool etc]# pg_md5 -m -p -u postgres pool_passwd
password:
The password entered here is 123456; the pool_passwd file is generated automatically when the command runs. Clients connecting to pgpool from remote hosts authenticate with this password, and pgpool uses the same user and password to connect to the backend databases.
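For reference, pool_passwd is a plain text file, normally created in the same directory as pgpool.conf, with one username:md5hash entry per user (the hash is computed over the password concatenated with the username). The value below is a placeholder, not the real hash for 123456:

[root@pgpool etc]# cat /usr/local/etc/pool_passwd
postgres:md5xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx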
Add the failover script failover_stream.sh.
[root@pgpool bin]# vi /usr/local/bin/failover_stream.sh
#!/bin/sh
# Failover command for streaming replication.
# This script assumes that DB node 0 is primary, and 1 is standby.
#
# If standby goes down, do nothing. If primary goes down, create a
# trigger file so that standby takes over primary node.
#
# Arguments: $1: failed node id. $2: new master hostname. $3: path to
# trigger file.

failed_node=$1
new_master=$2
trigger_file=$3

# Do nothing if standby goes down.
if [ $failed_node = 1 ]; then
    exit 0;
fi

# Create the trigger file.
/usr/bin/ssh -T $new_master /bin/touch $trigger_file

exit 0;
[root@pgpool bin]# chmod 700 failover_stream.sh
The corresponding recovery.conf on the slave:
[root@slave data]# cat recovery.conf
standby_mode = 'on'
primary_conninfo = 'host=172.16.0.241 port=5432 user=rep_user'
trigger_file = '/tmp/trigger_file0'
The trigger_file parameter here must match the file path passed to failover_command in pgpool.conf.
4. Setting up SSH trust between the hosts
Edit the hosts file
Add the following entries to /etc/hosts on every host:
172.16.0.240 pgpool
172.16.0.241 master
172.16.0.242 slave
Generate a public/private key pair on the pgpool host
[root@pgpool bin]# su - postgres
-bash-4.1$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/var/lib/pgsql/.ssh/id_rsa):
Created directory '/var/lib/pgsql/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /var/lib/pgsql/.ssh/id_rsa.
Your public key has been saved in /var/lib/pgsql/.ssh/id_rsa.pub.
The key fingerprint is:
a0:93:d4:b5:ed:26:d0:94:a5:e7:99:95:6b:d6:18:af postgres@pgpool
The key's randomart image is:
+--[ RSA 2048]----+
|       oo.       |
|    . +.+ .      |
|   . + + o +     |
|  . o o + + *    |
|   + S * = o     |
|    . o o .      |
|     E           |
|                 |
|                 |
+-----------------+
Just press Enter at every prompt. Run the same command on the master and slave nodes as well.
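If you prefer to skip the prompts, ssh-keygen can also be run non-interactively; this optional shortcut produces the same kind of key with an empty passphrase:

-bash-4.1$ ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa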
Distribute the public keys
Run on the pgpool node:
-bash-4.1$ ssh-copy-id -i ~/.ssh/id_rsa.pub postgres@172.16.0.241
-bash-4.1$ ssh-copy-id -i ~/.ssh/id_rsa.pub postgres@172.16.0.242
Run on the master node:
-bash-4.1$ ssh-copy-id -i ~/.ssh/id_rsa.pub postgres@172.16.0.241
-bash-4.1$ ssh-copy-id -i ~/.ssh/id_rsa.pub postgres@172.16.0.242
Run on the slave node:
-bash-4.1$ ssh-copy-id -i ~/.ssh/id_rsa.pub postgres@172.16.0.241
-bash-4.1$ ssh-copy-id -i ~/.ssh/id_rsa.pub postgres@172.16.0.242
After the steps above, check on each machine that you can reach the other nodes with ssh username@nodename without being asked for a password. The first connection records the host in known_hosts; otherwise later ssh calls will stop and ask whether to trust the node.
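A quick way to verify this from the pgpool host, using the hostnames added to /etc/hosts earlier; each command should print the remote hostname without prompting for a password:

-bash-4.1$ ssh postgres@master hostname
master
-bash-4.1$ ssh postgres@slave hostname
slave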
5. Starting pgpool
To run the pgpool service as the postgres user, the ownership of the following directories must be changed, otherwise pgpool will fail to start.
[root@pgpool etc]# mkdir /var/run/pgpool
[root@pgpool run]# chown -R postgres:postgres /var/run/pgpool
[root@pgpool local]# chown -R postgres:postgres /usr/local/etc
[root@pgpool local]# chown -R postgres:postgres /usr/local/bin
Verify that the failover script runs correctly: execute it on the pgpool server and check whether the trigger_file0 file is created on the slave server.
Run the shell script on the pgpool host:
-bash-4.1$ /usr/local/bin/failover_stream.sh 0 slave /tmp/trigger_file0
Check on the slave host:
[root@slave ~]# ls /tmp
ssh-rangWC1783  trigger_file0  yum.log
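As the note at the end of this article warns, delete the trigger file right away after this test; if it is left in place, the standby will promote itself and the replication setup will be broken for the later steps:

[root@slave ~]# rm /tmp/trigger_file0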
Start pgpool:
-bash-4.1$ pgpool -n -d > /tmp/pgpool.log 2>&1 &
[1] 10474
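For reference (not part of the original steps), pgpool started this way can later be shut down with its own stop command; fast mode disconnects clients immediately:

-bash-4.1$ pgpool -m fast stop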
6. Testing pgpool
Connect to pgpool from any node (here, the slave node):
[root@slave ~]# psql -h 172.16.0.240 -p 9999 -d test -U postgres
Password for user postgres:
psql (9.3.6)
Type "help" for help.

test=# select * from t4;
 id
----
  1
(1 row)

test=# show pool_nodes;
 node_id |   hostname   | port | status | lb_weight |  role
---------+--------------+------+--------+-----------+---------
 0       | 172.16.0.241 | 5432 | 2      | 0.500000  | primary
 1       | 172.16.0.242 | 5432 | 2      | 0.500000  | standby
(2 rows)
At this point the database on 241 is the primary node and 242 is the standby.
Shut down the primary node; you can stop the service or simply kill the process to simulate a database crash.
[root@master data]# service postgresql-9.3 stop
Stopping postgresql-9.3 service:                           [  OK  ]

---After the master is shut down, the client connection through pgpool is interrupted and then re-established:
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
test=# show pool_nodes;
 node_id |   hostname   | port | status | lb_weight |  role
---------+--------------+------+--------+-----------+---------
 0       | 172.16.0.241 | 5432 | 3      | 0.500000  | standby
 1       | 172.16.0.242 | 5432 | 2      | 0.500000  | primary

test=# insert into t4 values (6);
INSERT 0 1
test=# select * from t4;
 id
----
  1
  6
(2 rows)

test=#
After pgpool performs the failover, node 242 becomes the primary and node 241 is down (status 3).
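To double-check the promotion directly on the new primary, ask PostgreSQL whether it is still in recovery; on 242 it should now return f. This verification step is an addition to the original write-up and assumes a local postgres connection on the slave host:

[root@slave ~]# psql -U postgres -c "select pg_is_in_recovery();"
 pg_is_in_recovery
-------------------
 f
(1 row)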
Meaning of the status column in the show pool_nodes output above:
Status is represented by a digit from 0 to 3.
0 - This state is only used during initialization; PCP never displays it.
1 - Node is up, no connections yet.
2 - Node is up, connections are pooled.
3 - Node is down.
The primary/standby switchover has completed.
7. Other notes
- If the failover script on pgpool was forgotten or did not run correctly, show pool_nodes will report both nodes as standby and the cluster as a whole becomes read-only. In that case, stop and restart the database service on each node, then restart pgpool.
- When testing the trigger file, remember to delete the generated trigger file promptly; otherwise it will break the master/standby setup and cause the later tests to fail.