Setting Up PostgreSQL 12 Synchronous Streaming Replication: Sync Mode Not Taking Effect, and the Old Primary Failing to Sync with the New Primary After Recovery


PostgreSQL 12 no longer puts the streaming replication settings in recovery.conf, but the basic configuration is much the same. With the New Year holiday here and no mood for real work, I decided to set it up and try it out.

 

Official documentation:

https://www.postgresql.org/docs/12/runtime-config-replication.html

 

Let's begin:

1) Download the source tarball:

https://www.postgresql.org/ftp/source/

2) Extract and install (the prefix directory must exist and be owned by postgres before make install):

tar xzvf postgresql-12.1.tar.gz
cd postgresql-12.1/
./configure --prefix=/opt/pg12 --without-zlib
make
su root -c 'mkdir -p /opt/pg12 && chown -R postgres:postgres /opt/pg12'
make install
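A quick sanity check after the build; a small sketch that assumes the binaries were installed under /opt/pg12/bin as above:

```shell
#!/bin/sh
# Sanity-check the build: the pg_config from our prefix should report
# the version we just compiled. Assumes the /opt/pg12 prefix used above.
VER="$( [ -x /opt/pg12/bin/pg_config ] && /opt/pg12/bin/pg_config --version || echo 'pg_config not installed yet' )"
echo "$VER"
```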

3) Create the directories and environment variables. For this experiment both data directories live on the same machine: data1 is the primary, data2 the standby.

cd /opt/pg12
mkdir data1
mkdir data2

vim ~/pg12.env 
source ~/pg12.env 
[postgres@localhost data1]$ cat ~/pg12.env 
export PGHOME=/opt/pg12/
export PATH=$PGHOME/bin:$PATH
export PGDATA=$PGHOME/data1
export PGPORT=54121

4) Initialize the primary database

initdb -D /opt/pg12/data1
cd data1

vim postgresql.conf 
Edit the following settings:
port = 54121 
wal_level = replica
synchronous_commit = on
max_wal_senders = 10
wal_keep_segments = 1024
synchronous_standby_names = 'standby_node'
hot_standby = on
hot_standby_feedback = on
logging_collector = on

Start the primary database:
pg_ctl start

cd ../data2
pg_basebackup -R -X stream -Fp -D ./ -h localhost -p 54121
vim postgresql.conf
Edit the following settings:
recovery_target_timeline = 'latest'
primary_conninfo = 'application_name=standby_node host=localhost port=54121 user=postgres password=postgres'                                   
promote_trigger_file = '/opt/pg12/data2/promote_trigger_file'
port=54122

Start the standby database:
pg_ctl -D ./ start

 

5) Check the replication status. It turns out to be asynchronous streaming replication, and application_name has not taken effect:

postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 19728
usesysid         | 10
usename          | postgres
application_name | walreceiver
client_addr      | ::1
client_hostname  | 
client_port      | 37651
backend_start    | 2020-01-21 19:55:14.881115-08
backend_xmin     | 488
state            | streaming
sent_lsn         | 0/30175C0
write_lsn        | 0/30175C0
flush_lsn        | 0/30175C0
replay_lsn       | 0/30175C0
write_lag        | 
flush_lag        | 
replay_lag       | 
sync_priority    | 0
sync_state       | async
reply_time       | 2020-01-21 20:01:38.445309-08

 

Digging further, the cause is that postgresql.auto.conf contains an auto-generated primary_conninfo (written by pg_basebackup -R) that lacks the application_name setting, and postgresql.auto.conf takes precedence over postgresql.conf. Add the node name there:

[postgres@localhost data2]$ cat postgresql.auto.conf
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
primary_conninfo = 'application_name=standby_node user=postgres passfile=''/home/postgres/.pgpass'' host=localhost port=54121 sslmode=disable sslcompression=0 gssencmode=disable target_session_attrs=any'
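Rather than editing postgresql.auto.conf by hand (its header warns against it), the cleaner route is to let ALTER SYSTEM rewrite the file. A minimal sketch, assuming the standby from this post is up on port 54122 and psql is on PATH; the guard simply makes the script a no-op elsewhere:

```shell
#!/bin/sh
# Set primary_conninfo through ALTER SYSTEM so postgresql.auto.conf
# stays machine-managed. Assumes the standby from this post on port 54122.
CONNINFO="application_name=standby_node host=localhost port=54121 user=postgres"
if command -v psql >/dev/null 2>&1; then
    psql -p 54122 -d postgres \
         -c "ALTER SYSTEM SET primary_conninfo = '$CONNINFO';"
fi
# A restart is still required for the change to take effect:
# pg_ctl -D /opt/pg12/data2 restart
```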

6) A reload on the standby will not pick this up; the standby must be restarted:

pg_ctl -D ./ restart

 

7) Check replication on the primary again; synchronous replication is now in effect:

postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 20151
usesysid         | 10
usename          | postgres
application_name | standby_node
client_addr      | ::1
client_hostname  | 
client_port      | 37657
backend_start    | 2020-01-21 20:03:52.765047-08
backend_xmin     | 488
state            | streaming
sent_lsn         | 0/3017670
write_lsn        | 0/3017670
flush_lsn        | 0/3017670
replay_lsn       | 0/3017670
write_lag        | 
flush_lag        | 
replay_lag       | 
sync_priority    | 1
sync_state       | sync
reply_time       | 2020-01-21 20:04:02.884156-08

  

8) Change synchronous_standby_names on the primary to the ANY (quorum) form; a reload is enough for this to take effect:

vim postgresql.conf
synchronous_standby_names = 'any 1 (standby_node,node1)'

pg_ctl reload

postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 20663
usesysid         | 10
usename          | postgres
application_name | standby_node
client_addr      | ::1
client_hostname  | 
client_port      | 37659
backend_start    | 2020-01-21 20:24:48.714207-08
backend_xmin     | 490
state            | streaming
sent_lsn         | 0/3022AD0
write_lsn        | 0/3022AD0
flush_lsn        | 0/3022AD0
replay_lsn       | 0/3022AD0
write_lag        | 
flush_lag        | 
replay_lag       | 
sync_priority    | 1
sync_state       | quorum
reply_time       | 2020-01-21 22:20:01.980318-08

  

Notes (from the documentation):

sync_priority
Priority of this standby server for being chosen as the synchronous standby in a priority-based synchronous replication. This has no effect in a quorum-based synchronous replication.

sync_state
Synchronous state of this standby server. Possible values are:
    async: This standby server is asynchronous.
    potential: This standby server is now asynchronous, but can potentially become synchronous if one of current synchronous ones fails.
    sync: This standby server is synchronous.
    quorum: This standby server is considered as a candidate for quorum standbys.
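To keep an eye on just the fields discussed above, a short query sketch (assumes the primary from this post on port 54121; the guard makes it a no-op where psql is absent):

```shell
#!/bin/sh
# Watch only the replication fields that matter for sync mode.
# Assumes the primary from this post is listening on port 54121.
QUERY="SELECT application_name, state, sync_priority, sync_state FROM pg_stat_replication;"
if command -v psql >/dev/null 2>&1; then
    psql -p 54121 -d postgres -c "$QUERY"
fi
```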

 

On the standby, check the WAL receive status:

postgres=# select * from pg_stat_wal_receiver;
-[ RECORD 1 ]---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
pid                   | 23261
status                | streaming
receive_start_lsn     | 0/3000000
receive_start_tli     | 1
received_lsn          | 0/3024890
received_tli          | 1
last_msg_send_time    | 2020-01-21 22:32:16.786496-08
last_msg_receipt_time | 2020-01-21 22:32:16.786575-08
latest_end_lsn        | 0/3024890
latest_end_time       | 2020-01-21 22:29:16.556672-08
slot_name             | 
sender_host           | localhost
sender_port           | 54121
conninfo              | user=postgres passfile=/home/postgres/.pgpass dbname=replication host=localhost port=54121 application_name=standby_node fallback_application_name=walreceiver sslmode=disable sslcompression=0 gssencmode=disable target_session_attrs=any

  

9) Add a replication slot for streaming:

Create the slot on the primary:

postgres=# select pg_create_physical_replication_slot('node1');
 pg_create_physical_replication_slot 
-------------------------------------
 (node1,)
(1 row)

postgres=# select * from pg_replication_slots;
 slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn 
-----------+--------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------
 node1     |        | physical  |        |          | f         | f      |            |      |              |             | 
(1 row)

  

Have the standby stream from slot node1:

echo "primary_slot_name = 'node1' " >> postgresql.conf

pg_ctl -D ./ restart

postgres=# select * from pg_stat_wal_receiver;
-[ RECORD 1 ]---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
pid                   | 23468
status                | streaming
receive_start_lsn     | 0/3000000
receive_start_tli     | 1
received_lsn          | 0/6D6F5D0
received_tli          | 1
last_msg_send_time    | 2020-01-21 22:38:27.882638-08
last_msg_receipt_time | 2020-01-21 22:38:27.882767-08
latest_end_lsn        | 0/6D6F5D0
latest_end_time       | 2020-01-21 22:38:27.882638-08
slot_name             | node1
sender_host           | localhost
sender_port           | 54121
conninfo              | user=postgres passfile=/home/postgres/.pgpass dbname=replication host=localhost port=54121 application_name=standby_node fallback_application_name=walreceiver sslmode=disable sslcompression=0 gssencmode=disable target_session_attrs=any

  

The pg_replication_slots view on the primary now shows the slot in use:

postgres=# select * from pg_replication_slots;
-[ RECORD 1 ]-------+----------
slot_name           | node1
plugin              | 
slot_type           | physical
datoid              | 
database            | 
temporary           | f
active              | t
active_pid          | 23469
xmin                | 493
catalog_xmin        | 
restart_lsn         | 0/6D6F6B8
confirmed_flush_lsn | 

  

Replication slot usage was analyzed in an earlier post (with synchronous replication a slot is not really necessary; with asynchronous replication a slot is needed to make the primary retain WAL files), so I won't go into more detail here.
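One caveat worth sketching: if a standby is retired for good, its slot keeps the primary from recycling WAL and pg_wal grows without bound, so the slot should be dropped. A sketch assuming the primary from this post on port 54121:

```shell
#!/bin/sh
# An unused slot pins WAL on the primary indefinitely; drop it once
# the standby is gone for good. Assumes the primary on port 54121.
SLOT=node1
if command -v psql >/dev/null 2>&1; then
    # active = f means no walreceiver is currently using the slot
    psql -p 54121 -d postgres \
         -c "SELECT slot_name, active, restart_lsn FROM pg_replication_slots;"
    psql -p 54121 -d postgres \
         -c "SELECT pg_drop_replication_slot('$SLOT');"
fi
```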

 

10) PG 12 adds the SQL function pg_promote() to promote a standby, so there are now three ways to promote:

a. pg_ctl promote

b. creating the promote_trigger_file

c. calling the pg_promote() function
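The three methods side by side, as a sketch against this post's layout (run only one of them; they are commented out so the script is safe to source):

```shell
#!/bin/sh
# Three ways to promote the standby in /opt/pg12/data2 (run only one).
STANDBY=/opt/pg12/data2

# a. via pg_ctl:
#    pg_ctl -D "$STANDBY" promote

# b. via the trigger file named by promote_trigger_file:
#    touch "$STANDBY"/promote_trigger_file

# c. via SQL, new in PostgreSQL 12:
#    psql -p 54122 -d postgres -c "SELECT pg_promote();"

echo "standby data directory: $STANDBY"
```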

 

11) When a stopped old primary is to rejoin the cluster as a standby, note that the standby_mode parameter is gone; whether the server starts in standby mode is now determined by the presence of a standby.signal file in the data directory. The documentation explains:

19.5.4. Archive Recovery
This section describes the settings that apply only for the duration of the recovery. They must be reset for any subsequent recovery you wish to perform.

"Recovery" covers using the server as a standby or for executing a targeted recovery. Typically, standby mode would be used to provide high availability and/or read scalability, whereas a targeted recovery is used to recover from data loss.

To start the server in standby mode, create a file called standby.signal in the data directory. The server will enter recovery and will not stop recovery when the end of archived WAL is reached, but will keep trying to continue recovery by connecting to the sending server as specified by the primary_conninfo setting and/or by fetching new WAL segments using restore_command. For this mode, the parameters from this section and Section 19.6.3 are of interest. Parameters from Section 19.5.5 will also be applied but are typically not useful in this mode.

To start the server in targeted recovery mode, create a file called recovery.signal in the data directory. If both standby.signal and recovery.signal files are created, standby mode takes precedence. Targeted recovery mode ends when the archived WAL is fully replayed, or when recovery_target is reached. In this mode, the parameters from both this section and Section 19.5.5 will be used.
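This is exactly the second problem from the title: after a failover, simply restarting the old primary will not make it follow the new one. A hedged sketch of rejoining it as a standby, using this post's ports and paths; note that pg_rewind only works if the cluster was initialized with data checksums or ran with wal_log_hints = on, otherwise re-seed the directory with a fresh pg_basebackup instead:

```shell
#!/bin/sh
# Rejoin the old primary (data1) as a standby of the promoted data2.
# pg_rewind requires data checksums or wal_log_hints=on; otherwise
# use pg_basebackup to rebuild the directory instead.
OLD=/opt/pg12/data1
if command -v pg_rewind >/dev/null 2>&1; then
    pg_ctl -D "$OLD" stop -m fast || true
    # Rewind data1 back to the point where the timelines diverged
    pg_rewind --target-pgdata="$OLD" \
              --source-server="host=localhost port=54122 user=postgres"
    # Without standby.signal the server would come up as a primary again
    touch "$OLD"/standby.signal
    # primary_conninfo must now point at the new primary (port 54122)
    pg_ctl -D "$OLD" start
fi
```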

  

12) Recovery mode has also changed: create a recovery.signal file in the data directory. The recovery targets are unified as:

recovery_target_time

recovery_target_xid

recovery_target_name

recovery_target_lsn

recovery_target
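Tying this together, a minimal point-in-time-recovery sketch; the archive path, restore_command, and timestamp below are hypothetical placeholders, not values from this post:

```shell
#!/bin/sh
# Targeted recovery in PG 12: recovery.signal replaces recovery.conf,
# and the target parameters live in postgresql.conf. The archive path
# and timestamp are placeholders for illustration only.
DATADIR=/opt/pg12/data1
if [ -d "$DATADIR" ]; then
    cat >> "$DATADIR"/postgresql.conf <<'EOF'
restore_command = 'cp /opt/pg12/archive/%f %p'
recovery_target_time = '2020-01-21 20:00:00'
EOF
    touch "$DATADIR"/recovery.signal
    # pg_ctl -D "$DATADIR" start   # recovery stops at the target time
fi
```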

 

