Creating a Highly Available PostgreSQL Cluster with Patroni and HAProxy


OS: CentOS Linux release 7.6.1810 (Core)

node1: 192.168.216.130  master

node2: 192.168.216.132  slave

node3: 192.168.216.136  haproxy

This is only a test, so a single primary and a single replica are deployed, which is suitable for a test environment only. For production, at least one PostgreSQL primary with two replicas, three etcd nodes, and two HAProxy instances with keepalived are recommended.

Part 1: Install PostgreSQL on both nodes (PostgreSQL 9.5.19 is used as an example)

1. Add the RPM repository
yum install https://download.postgresql.org/pub/repos/yum/9.5/redhat/rhel-7-x86_64/pgdg-centos95-9.5-3.noarch.rpm
2. Install PostgreSQL 9.5
yum install postgresql95-server postgresql95-contrib
Note: for this experiment only steps 1 and 2 are needed; the initialization is done for us by Patroni. Steps 3-6 below are listed for reference only.
3. Initialize the database
/usr/pgsql-9.5/bin/postgresql95-setup initdb
4. Enable the service at boot
systemctl enable postgresql-9.5.service
5. Start the service
systemctl start postgresql-9.5.service
6. Check the version
psql --version

Part 2: Install the etcd service

1. Here etcd is installed only on node1 as a single node, for testing purposes; no distributed deployment is done. For a clustered setup, refer to a dedicated etcd cluster deployment guide.

yum install etcd -y
cp /etc/etcd/etcd.conf /etc/etcd/etcd.conf.bak
cd /etc/etcd/
[root@localhost etcd]# egrep ^[A-Z] ./etcd.conf
ETCD_DATA_DIR="/var/lib/etcd/node1.etcd"
ETCD_LISTEN_PEER_URLS="http://192.168.216.130:2380"
ETCD_LISTEN_CLIENT_URLS="http://192.168.216.130:2379,http://127.0.0.1:2379"
ETCD_NAME="node1"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.216.130:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://192.168.216.130:2379"
ETCD_INITIAL_CLUSTER="node1=http://192.168.216.130:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"

2. Save the file, then restart the etcd service

systemctl restart etcd
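To have etcd start automatically at boot as well (not mentioned in the original steps), it can additionally be enabled:

systemctl enable etcd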

3. Check that the etcd service is running properly
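This step has no command in the original; a quick check on node1, assuming the single-node etcd configured above (the etcd package on CentOS 7 ships the v2 etcdctl), might look like:

systemctl status etcd
etcdctl --endpoints=http://192.168.216.130:2379 cluster-health
etcdctl --endpoints=http://192.168.216.130:2379 member list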

Part 3: Install Patroni on both node1 and node2

1. Install the dependencies Patroni needs; here Patroni is installed via pip. (Note: if the system Python is 2.7, the Python 2 specific installer at https://bootstrap.pypa.io/pip/2.7/get-pip.py may be needed instead of the generic get-pip.py.)

yum install gcc
yum install python-devel.x86_64
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
python get-pip.py
pip install psycopg2-binary
pip install --upgrade setuptools
pip install patroni[etcd,consul]

2. Verify that Patroni was installed successfully
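The original does not show a command for this; assuming pip placed the entry points on the PATH, a quick check is:

patroni --version
patronictl version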

3. Configure Patroni; the following is done on node1

mkdir /data/patroni/conf -p
cd /data/patroni/conf
yum install git
git clone https://github.com/zalando/patroni.git
# git clone creates a directory named "patroni"; copy the sample config up into /data/patroni/conf
cd /data/patroni/conf/patroni
cp postgres0.yml ../

4. Edit the postgres0.yml file on node1

scope: batman
#namespace: /service/
name: postgresql0

restapi:
  listen: 192.168.216.130:8008
  connect_address: 192.168.216.130:8008
#  certfile: /etc/ssl/certs/ssl-cert-snakeoil.pem
#  keyfile: /etc/ssl/private/ssl-cert-snakeoil.key
#  authentication:
#    username: username
#    password: password

# ctl:
#   insecure: false # Allow connections to SSL sites without certs
#   certfile: /etc/ssl/certs/ssl-cert-snakeoil.pem
#   cacert: /etc/ssl/certs/ssl-cacert-snakeoil.pem

etcd:
  host: 192.168.216.130:2379

bootstrap:
  # this section will be written into Etcd:/<namespace>/<scope>/config after initializing new cluster
  # and all other cluster members will use it as a `global configuration`
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
#    master_start_timeout: 300
    synchronous_mode: false
    #standby_cluster:
      #host: 127.0.0.1
      #port: 1111
      #primary_slot_name: patroni
    postgresql:
      use_pg_rewind: true
      use_slots: true
      parameters:
         wal_level: logical
         hot_standby: "on"
         wal_keep_segments: 1000
         max_wal_senders: 10
         max_replication_slots: 10
         wal_log_hints: "on"
         archive_mode: "on"
         archive_timeout: 1800s
         archive_command: mkdir -p ../wal_archive && test ! -f ../wal_archive/%f && cp %p ../wal_archive/%f
      recovery_conf:
         restore_command: cp ../wal_archive/%f %p

  # some desired options for 'initdb'
  initdb:  # Note: It needs to be a list (some options need values, others are switches)
  - encoding: UTF8
  - data-checksums

  pg_hba:  # Add following lines to pg_hba.conf after running 'initdb'
  # For kerberos gss based connectivity (discard @.*$)
  #- host replication replicator 127.0.0.1/32 gss include_realm=0
  #- host all all 0.0.0.0/0 gss include_realm=0
  - host replication replicator 0.0.0.0/0 md5
  - host all admin 0.0.0.0/0 md5
  - host all all 0.0.0.0/0 md5

  # Additional script to be launched after initial cluster creation (will be passed the connection URL as parameter)
# post_init: /usr/local/bin/setup_cluster.sh

  # Some additional users users which needs to be created after initializing new cluster
  users:
    admin:
      password: postgres
      options:
        - createrole
        - createdb
    replicator:
      password: replicator
      options:
        - replication
postgresql:
  listen: 0.0.0.0:5432
  connect_address: 192.168.216.130:5432
  data_dir: /data/postgres
  bin_dir: /usr/pgsql-9.5/bin/
#  config_dir:
#  pgpass: /tmp/pgpass0
  authentication:
    replication:
      username: replicator
      password: replicator
    superuser:
      username: admin
      password: postgres
#    rewind:  # Has no effect on postgres 10 and lower
#      username: rewind_user
#      password: rewind_password
  # Server side kerberos spn
#  krbsrvname: postgres
  parameters:
    # Fully qualified kerberos ticket file for the running user
    # same as KRB5CCNAME used by the GSS
#   krb_server_keyfile: /var/spool/keytabs/postgres
    unix_socket_directories: '.'

#watchdog:
#  mode: automatic # Allowed values: off, automatic, required
#  device: /dev/watchdog
#  safety_margin: 5

tags:
    nofailover: false
    noloadbalance: false
    clonefrom: false
    nosync: false

5. Configure Patroni; the following is done on node2

mkdir /data/patroni/conf -p
cd /data/patroni/conf
yum install git
git clone https://github.com/zalando/patroni.git
# git clone creates a directory named "patroni"; copy the sample config up into /data/patroni/conf
cd /data/patroni/conf/patroni
cp postgres1.yml ../

6. Edit the postgres1.yml file on node2

scope: batman
#namespace: /service/
name: postgresql1

restapi:
  listen: 192.168.216.132:8008
  connect_address: 192.168.216.132:8008
#  certfile: /etc/ssl/certs/ssl-cert-snakeoil.pem
#  keyfile: /etc/ssl/private/ssl-cert-snakeoil.key
#  authentication:
#    username: username
#    password: password

# ctl:
#   insecure: false # Allow connections to SSL sites without certs
#   certfile: /etc/ssl/certs/ssl-cert-snakeoil.pem
#   cacert: /etc/ssl/certs/ssl-cacert-snakeoil.pem

etcd:
  host: 192.168.216.130:2379

bootstrap:
  # this section will be written into Etcd:/<namespace>/<scope>/config after initializing new cluster
  # and all other cluster members will use it as a `global configuration`
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
#    master_start_timeout: 300
    synchronous_mode: false
    #standby_cluster:
      #host: 127.0.0.1
      #port: 1111
      #primary_slot_name: patroni
    postgresql:
      use_pg_rewind: true
      use_slots: true
      parameters:
         wal_level: logical
         hot_standby: "on"
         wal_keep_segments: 1000
         max_wal_senders: 10
         max_replication_slots: 10
         wal_log_hints: "on"
         archive_mode: "on"
         archive_timeout: 1800s
         archive_command: mkdir -p ../wal_archive && test ! -f ../wal_archive/%f && cp %p ../wal_archive/%f
      recovery_conf:
         restore_command: cp ../wal_archive/%f %p

  # some desired options for 'initdb'
  initdb:  # Note: It needs to be a list (some options need values, others are switches)
  - encoding: UTF8
  - data-checksums

  pg_hba:  # Add following lines to pg_hba.conf after running 'initdb'
  # For kerberos gss based connectivity (discard @.*$)
  #- host replication replicator 127.0.0.1/32 gss include_realm=0
  #- host all all 0.0.0.0/0 gss include_realm=0
  - host replication replicator 0.0.0.0/0 md5
  - host all admin 0.0.0.0/0 md5
  - host all all 0.0.0.0/0 md5

  # Additional script to be launched after initial cluster creation (will be passed the connection URL as parameter)
# post_init: /usr/local/bin/setup_cluster.sh

  # Some additional users users which needs to be created after initializing new cluster
  users:
    admin:
      password: postgres
      options:
        - createrole
        - createdb
    replicator:
      password: replicator
      options:
        - replication
postgresql:
  listen: 0.0.0.0:5432
  connect_address: 192.168.216.132:5432
  data_dir: /data/postgres
  bin_dir: /usr/pgsql-9.5/bin/
#  config_dir:
#  pgpass: /tmp/pgpass0
  authentication:
    replication:
      username: replicator
      password: replicator
    superuser:
      username: admin
      password: postgres
#    rewind:  # Has no effect on postgres 10 and lower
#      username: rewind_user
#      password: rewind_password
  # Server side kerberos spn
#  krbsrvname: postgres
  parameters:
    # Fully qualified kerberos ticket file for the running user
    # same as KRB5CCNAME used by the GSS
#   krb_server_keyfile: /var/spool/keytabs/postgres
    unix_socket_directories: '.'

#watchdog:
#  mode: automatic # Allowed values: off, automatic, required
#  device: /dev/watchdog
#  safety_margin: 5

tags:
    nofailover: false
    noloadbalance: false
    clonefrom: false
    nosync: false

7. Note the value of data_dir in the YAML files above. The postgres user must have write permission on that directory; if it does not exist, create it. Run the following on both node1 and node2:

mkdir /data/postgres -p
chown -Rf postgres:postgres /data/postgres 
chmod 700 /data/postgres

8. On node1, switch to the postgres user and start the Patroni service. Patroni will initialize the database and create the configured roles for us automatically.

chown -Rf postgres:postgres /data/patroni/conf
su - postgres
Start the Patroni service:
patroni /data/patroni/conf/postgres0.yml

If the service starts correctly, Patroni prints log output on the console showing the database initialization and the node taking the leader role.

Because the service is not started in the background, clone a second terminal window, switch to the postgres user, and run psql -h 127.0.0.1 -U admin postgres to connect to the database and verify that Patroni is managing the PostgreSQL service, as sketched below.
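As a concrete sketch of that verification from the second terminal (the curl call against Patroni's REST API is an extra suggestion, not part of the original steps):

su - postgres
psql -h 127.0.0.1 -U admin postgres    # password "postgres", as defined in postgres0.yml
curl -s http://192.168.216.130:8008    # optional: Patroni's REST API reports the node's role and state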

9. On node2, switch to the postgres user and start the Patroni service; the steps are the same as on node1

chown -Rf postgres:postgres /data/patroni/conf
su - postgres
Start the Patroni service:
patroni /data/patroni/conf/postgres1.yml

If the service starts correctly, similar log output is printed and node2 joins the cluster as a replica.

 

10. Check the cluster status

patronictl -c /data/patroni/conf/postgres0.yml list

11. Manually switch the master over

patronictl -c /data/patroni/conf/postgres0.yml switchover

12. Patroni can be started in the background so that the service keeps running, or it can be managed as a systemd service so that it starts automatically at boot (a minimal unit-file sketch follows the commands below).

node1:

nohup patroni /data/patroni/conf/postgres0.yml > /data/patroni/patroni_log 2>&1 &

node2:

nohup patroni /data/patroni/conf/postgres1.yml > /data/patroni/patroni_log 2>&1 &
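A minimal systemd unit sketch for managing Patroni (this unit file, its path, and the ExecStart path are assumptions, not from the original article; check the real binary location with `which patroni`):

# /etc/systemd/system/patroni.service (on node1; point ExecStart at postgres1.yml on node2)
[Unit]
Description=Patroni PostgreSQL HA manager
After=network.target etcd.service

[Service]
Type=simple
User=postgres
Group=postgres
ExecStart=/usr/bin/patroni /data/patroni/conf/postgres0.yml
KillMode=process
TimeoutSec=30
Restart=no

[Install]
WantedBy=multi-user.target

Then reload systemd and enable the service:

systemctl daemon-reload
systemctl enable patroni
systemctl start patroni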

Part 4: Install HAProxy on node3

yum install -y haproxy
cp -r /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg_bak

Edit the haproxy.cfg configuration file

# vi /etc/haproxy/haproxy.cfg

#---------------------------------------------------------------------
# Global settings
global
    # log syntax: log <address> <facility> [<level>]
    # Global log configuration: send logs to the syslog service on 127.0.0.1,
    # device local0, at the info level
#   log         127.0.0.1 local0 info
    log         127.0.0.1 local1 notice
    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid

    # Maximum number of connections per haproxy process. Each connection has a
    # client side and a server side, so the maximum number of TCP sessions per
    # process will be twice this value.
    maxconn     4096

    # User and group to run as
    user        haproxy
    group       haproxy

    # Run as a daemon
    daemon

    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats

#---------------------------------------------------------------------
# Defaults section
defaults
    # mode syntax: mode {http|tcp|health}. http is layer-7 mode, tcp is layer-4 mode,
    # health is a health check that simply returns OK
    mode tcp
    # Log errors to the local3 device of the syslog service on 127.0.0.1
    log 127.0.0.1 local3 err

    # if you set mode to http, then you must change tcplog into httplog
    option     tcplog

    # Do not log null connections. A "null connection" is the periodic probe an upstream
    # load balancer or monitoring system makes to check whether the service is alive
    # (fetching a fixed page, checking whether a port is listening, and so on).
    # The documentation notes that if there is no other load balancer in front of this
    # service, it is better not to use this option, because malicious scans from the
    # Internet would otherwise not be logged.
    option     dontlognull

    # Number of retries when a connection to a backend server fails; once exceeded,
    # the server is marked as unavailable
    retries    3

    # When cookies are used, haproxy inserts the serverID of the chosen backend into the
    # cookie to keep session persistence. If that backend goes down, the client's cookie
    # is not refreshed; with this option the client's request is redispatched to another
    # backend server so the service keeps working.
    option redispatch

    # Maximum time to wait in the queue. When a server's maxconn is reached, connections
    # are left pending in a queue which may be server-specific or global to the backend.
    timeout queue           1m

    # Maximum time to wait for a connection attempt to a server to succeed (default unit: ms)
    timeout connect         10s

    # Client-side inactivity timeout. The inactivity timeout applies when the client is
    # expected to acknowledge or send data.
    timeout client          1m

    # Maximum inactivity time on the server side. The inactivity timeout applies when the
    # server is expected to acknowledge or send data.
    timeout server          1m
    timeout check           5s
    maxconn                 5120

#---------------------------------------------------------------------
# HAProxy web statistics page
listen status
    bind 0.0.0.0:1080
    mode http
    log global

    stats enable
    # Refresh interval of the statistics page
    stats refresh 30s
    stats uri /haproxy-stats
    # Realm shown when the statistics page asks for authentication
    stats realm Private\ lands
    # Username and password for the statistics page; to add more accounts, add another line
    stats auth admin:passw0rd
    # Hide the haproxy version on the statistics page
#    stats hide-version

#---------------------------------------------------------------------
listen master
    bind *:5000
    mode tcp
    option tcplog
    balance roundrobin
    option httpchk OPTIONS /master
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server node1 192.168.216.130:5432 maxconn 1000 check port 8008 inter 5000 rise 2 fall 2
    server node2 192.168.216.132:5432 maxconn 1000 check port 8008 inter 5000 rise 2 fall 2
listen replicas
    bind *:5001
    mode tcp
    option tcplog
    balance roundrobin
    option httpchk OPTIONS /replica
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server node1 192.168.216.130:5432 maxconn 1000 check port 8008 inter 5000 rise 2 fall 2
    server node2 192.168.216.132:5432 maxconn 1000 check port 8008 inter 5000 rise 2 fall 2
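Before starting HAProxy, the configuration can be syntax-checked (a suggested extra step, not in the original article):

haproxy -c -f /etc/haproxy/haproxy.cfg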

Start the haproxy service

systemctl start haproxy
systemctl status haproxy
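To start HAProxy automatically at boot (also not part of the original steps):

systemctl enable haproxy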

In a browser, open http://192.168.216.136:1080/haproxy-stats and log in with username admin and password passw0rd.

Here port 5000 provides the write (primary) service and port 5001 provides the read (replica) service. To write to the database, applications only need to be given 192.168.216.136:5000. A primary failure can be simulated by stopping the current master node to verify that an automatic failover takes place, as sketched below.
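A minimal sketch of that verification, assuming the admin account defined in the Patroni configuration above; the exact commands are not spelled out in the original article:

# Writes always go to the current primary via port 5000
psql -h 192.168.216.136 -p 5000 -U admin postgres
# Reads are balanced across the replicas via port 5001
psql -h 192.168.216.136 -p 5001 -U admin postgres

# Simulate a primary failure: stop Patroni on node1 (kill the nohup'd process, or
# "systemctl stop patroni" if the systemd unit sketched earlier is used), then confirm
# from node2 that it has been promoted:
patronictl -c /data/patroni/conf/postgres1.yml list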

References:

https://www.linode.com/docs/databases/postgresql/create-a-highly-available-postgresql-cluster-using-patroni-and-haproxy/#configure-etcd

https://www.opsdash.com/blog/postgres-getting-started-patroni.html

