Recovering Accidentally Deleted Data in PostgreSQL


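  In Oracle, when a table is dropped or rows are deleted by mistake, the Flashback feature can bring the data back without downtime. Other recovery tools, such as ODU and GDUL, can also retrieve the data. PostgreSQL currently has no flashback feature, so how can accidentally deleted data be recovered without stopping the database? Fortunately, it can be done as long as a complete hot backup exists.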

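  The method described here is to use the hot backup to restore the data on a separate server and then import it into the production environment, so that the production database keeps running undisturbed. The same approach also works for Oracle. Two conditions must be met: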

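  1. A complete base backup of the data files plus the archived WAL files. This is why backups matter so much.
  2. A server with the same Postgres software (same version) installed.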

Worked example

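  The walkthrough simulates dropping the table tbl_lottu_drop by mistake and then continues with further DML/DDL, to show that the production database keeps working normally. Meanwhile, point-in-time recovery (PITR) is performed on a second server to recover the data of tbl_lottu_drop.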

  • Postgres201: production database server
  • Postgres202: recovery server

1. Create a valid backup

postgres=# select pg_start_backup(now()::text); 
 pg_start_backup 
-----------------
 0/F000060
(1 row)

[postgres@Postgres201 ~]$ rsync -acvz -L --exclude "pg_xlog" --exclude "pg_log" $PGDATA /data/backup/20180428

postgres=# select pg_stop_backup(); 
NOTICE:  pg_stop_backup complete, all required WAL segments have been archived
 pg_stop_backup 
----------------
 0/F000168
(1 row)

2. Simulate the mistake

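  2.1 Create the table to be recovered, tbl_lottu_drop, and insert 1000 rows. The checkpoint in step 2.2 then makes sure the data is flushed from the buffer cache to disk.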

lottu=> create table tbl_lottu_drop (id int);
CREATE TABLE
lottu=> insert into tbl_lottu_drop select generate_series(1,1000);  
INSERT 0 1000
lottu=> \c lottu postgres
You are now connected to database "lottu" as user "postgres".

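  2.2 Record a timestamp to use later for PITR. (In a real incident you usually remember only an approximate time, which is often inaccurate and may even fall after the mistake. How to determine the right recovery point is covered later in this article.)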

lottu=# select now();
              now              
-------------------------------
 2018-04-28 20:47:31.617808+08
(1 row)
lottu=# checkpoint;
CHECKPOINT
lottu=# select pg_xlogfile_name(pg_switch_xlog());
     pg_xlogfile_name     
--------------------------
 000000030000000000000010
(1 row)

  2.3 Drop the table

lottu=# drop table tbl_lottu_drop;
DROP TABLE

  2.4 Run further DML/DDL to show that the production database keeps working normally

lottu=# create table tbl_lottu_log (id int);
CREATE TABLE
lottu=# insert into  tbl_lottu_log values (1),(2);
INSERT 0 2
lottu=# checkpoint;
CHECKPOINT
lottu=# select pg_xlogfile_name(pg_switch_xlog());
     pg_xlogfile_name     
--------------------------
 000000030000000000000011
(1 row)

3. Recovery

  3.1 Copy the backup to Postgres202

[postgres@Postgres201 20180428]$ cd /data/backup/20180428
[postgres@Postgres201 20180428]$ ll
total 4
drwx------. 18 postgres postgres 4096 Apr 28 20:42 data
[postgres@Postgres201 20180428]$ rsync -acvz -L data postgres@192.168.1.202:/data/postgres

  3.2 Remove unneeded files

[postgres@Postgres202 data]$ cd $PGDATA
[postgres@Postgres202 data]$ rm backup_label.old postmaster.pid tablespace_map.old

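  3.3 Restore the tablespace symlinks from the backup. Because rsync -L copied the tablespace contents in place of the symlink, move the directory to the path recorded in tablespace_map and link it back.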

[postgres@Postgres202 data]$ cat tablespace_map 
16385 /data/pg_data/lottu
[postgres@Postgres202 data]$ mkdir -p /data/pg_data
[postgres@Postgres202 data]$ cd pg_tblspc/
[postgres@Postgres202 pg_tblspc]$ mv 16385/  /data/pg_data/lottu
[postgres@Postgres202 pg_tblspc]$ ln -s /data/pg_data/lottu ./16385
[postgres@Postgres202 pg_tblspc]$ ll
total 0
lrwxrwxrwx. 1 postgres postgres 19 Apr 28 23:12 16385 -> /data/pg_data/lottu
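
The manual steps in 3.3 can be scripted when a backup contains several tablespaces. The following is a minimal sketch and not part of the original procedure: relink_tablespaces is a hypothetical helper name, and it assumes that every tablespace_map line has the form "OID path" with no escaped characters and that the backup was copied with rsync -L, so that pg_tblspc/OID is a real directory rather than a link.

```shell
#!/bin/sh
# Hypothetical helper (not from the original article).
# relink_tablespaces PGDATA
# For each "OID path" line in PGDATA/tablespace_map, move the copied
# directory PGDATA/pg_tblspc/OID to "path" and replace it with a symlink.
relink_tablespaces() {
    pgdata=$1
    while read -r oid path; do
        [ -n "$oid" ] || continue
        src="$pgdata/pg_tblspc/$oid"
        # Act only on real directories, and never overwrite an existing target.
        if [ -d "$src" ] && [ ! -L "$src" ] && [ ! -e "$path" ]; then
            mkdir -p "$(dirname "$path")"
            mv "$src" "$path"
            ln -s "$path" "$src"
        fi
    done < "$pgdata/tablespace_map"
}
```

On Postgres202 this would be called as relink_tablespaces "$PGDATA" before the server is started.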

  3.4 Copy the WAL files into the pg_xlog directory on Postgres202. Which segment should the copy start from?

[postgres@Postgres202 data]$ mkdir -p pg_xlog/archive_status
[postgres@Postgres202 data]$ cat backup_label 
START WAL LOCATION: 0/F000060 (file 00000003000000000000000F)
CHECKPOINT LOCATION: 0/F000098
BACKUP METHOD: pg_start_backup
BACKUP FROM: master
START TIME: 2018-04-28 20:42:15 CST
LABEL: 2018-04-28 20:42:13.244358+08

  backup_label shows that the segments needed run from 00000003000000000000000F up to the one currently being written.

[postgres@Postgres202 pg_xlog]$ ll
total 65540
-rw-------. 1 postgres postgres 16777216 Apr 28 20:42 00000003000000000000000F
-rw-------. 1 postgres postgres      313 Apr 28 20:42 00000003000000000000000F.00000060.backup
-rw-------. 1 postgres postgres 16777216 Apr 28 20:48 000000030000000000000010
-rw-------. 1 postgres postgres 16777216 Apr 28 20:50 000000030000000000000011
-rw-------. 1 postgres postgres 16777216 Apr 28 20:55 000000030000000000000012
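
Reading the starting segment out of backup_label and selecting the files to copy can also be scripted. The sketch below is an illustration only: first_wal_segment and segments_needed are hypothetical helper names, and the selection assumes all segments belong to the same timeline, so that plain name ordering matches WAL ordering.

```shell
#!/bin/sh
# Hypothetical helpers (not from the original article).

# first_wal_segment BACKUP_LABEL
# Print the segment file name from the "START WAL LOCATION" line.
first_wal_segment() {
    sed -n 's/^START WAL LOCATION: .*(file \([0-9A-F]\{24\}\))$/\1/p' "$1"
}

# segments_needed BACKUP_LABEL WAL_DIR
# List the segment files in WAL_DIR from the first required one onward.
# Files such as *.backup are skipped by the 24-hex-digit filter.
# Prints nothing if the first required segment is missing from WAL_DIR.
segments_needed() {
    first=$(first_wal_segment "$1")
    [ -n "$first" ] || return 1
    LC_ALL=C ls "$2" | grep -E '^[0-9A-F]{24}$' | sed -n "/^$first\$/,\$p"
}
```

For example, segments_needed "$PGDATA/backup_label" /data/arch would list the archived segments to copy, assuming /data/arch is the archive directory as in the restore_command used in this article.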

  3.5 Edit recovery.conf

[postgres@Postgres202 data]$ vi recovery.conf 

restore_command = 'cp /data/arch/%f %p'            # e.g. 'cp /mnt/server/archivedir/%f %p'
recovery_target_time = '2018-04-28 20:47:31.617808+08'
recovery_target_inclusive = false
recovery_target_timeline = 'latest'
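
If the recovery has to be repeated with different target times, the file can be generated instead of edited by hand. This is a minimal sketch under stated assumptions: write_recovery_conf is a hypothetical helper name, and it targets the recovery.conf format used by PostgreSQL 9.6 in this article (from PostgreSQL 12 onward these settings live in postgresql.conf together with a recovery.signal file).

```shell
#!/bin/sh
# Hypothetical helper (not from the original article).
# write_recovery_conf PGDATA ARCHIVE_DIR TARGET_TIME
# Write a recovery.conf that replays WAL from ARCHIVE_DIR and stops
# at TARGET_TIME.
write_recovery_conf() {
    pgdata=$1
    archive_dir=$2
    target_time=$3
    cat > "$pgdata/recovery.conf" <<EOF
restore_command = 'cp $archive_dir/%f %p'
recovery_target_time = '$target_time'
recovery_target_inclusive = false
recovery_target_timeline = 'latest'
EOF
}
```

Calling write_recovery_conf "$PGDATA" /data/arch '2018-04-28 20:47:31.617808+08' reproduces the file shown above without the trailing comment.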

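  3.6 Start the database and verify the data (pg_start is not a standard PostgreSQL command; it is presumably a local alias for pg_ctl start)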

[postgres@Postgres202 data]$ pg_start
server starting
[postgres@Postgres202 data]$ ps -ef | grep postgres
root      1098  1083  0 22:32 pts/0    00:00:00 su - postgres
postgres  1099  1098  0 22:32 pts/0    00:00:00 -bash
root      1210  1195  0 22:55 pts/1    00:00:00 su - postgres
postgres  1211  1210  0 22:55 pts/1    00:00:00 -bash
postgres  1442     1  1 23:16 pts/0    00:00:00 /opt/pgsql96/bin/postgres
postgres  1450  1442  0 23:16 ?        00:00:00 postgres: checkpointer process   
postgres  1451  1442  0 23:16 ?        00:00:00 postgres: writer process   
postgres  1459  1442  0 23:16 ?        00:00:00 postgres: wal writer process   
postgres  1460  1442  0 23:16 ?        00:00:00 postgres: autovacuum launcher process   
postgres  1461  1442  0 23:16 ?        00:00:00 postgres: archiver process   last was 00000005.history
postgres  1462  1442  0 23:16 ?        00:00:00 postgres: stats collector process   
postgres  1464  1099  0 23:16 pts/0    00:00:00 ps -ef
postgres  1465  1099  0 23:16 pts/0    00:00:00 grep postgres
[postgres@Postgres202 data]$ psql
psql (9.6.0)
Type "help" for help.

postgres=# \c lottu lottu
You are now connected to database "lottu" as user "lottu".
lottu=> \dt
            List of relations
 Schema |      Name      | Type  | Owner 
--------+----------------+-------+-------
 public | pitr_test      | table | lottu
 public | tbl_lottu_drop | table | lottu
 
lottu=> select count(1) from tbl_lottu_drop;
 count 
-------
  1000
(1 row)

  The data has been recovered. Copying it back to the production database is not covered here.
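
One common way to move a single table back is to pipe pg_dump into psql. The sketch below is an assumption, not the author's procedure: copy_table_back is a hypothetical helper, the host name passed to it is a placeholder, and by default it only prints the command so it can be reviewed before anything touches production.

```shell
#!/bin/sh
# Hypothetical helper (not from the original article).
# copy_table_back TABLE DATABASE TARGET_HOST
# Run on the recovery server. Dumps TABLE from the local DATABASE and
# loads it into the same database on TARGET_HOST.
# With DRY_RUN=1 (the default) the pipeline is only printed.
copy_table_back() {
    table=$1
    db=$2
    target_host=$3
    cmd="pg_dump -d $db -t $table | psql -v ON_ERROR_STOP=1 -h $target_host -d $db"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$cmd"
    else
        # The exit status is that of psql, the last command in the pipeline.
        sh -c "$cmd"
    fi
}
```

For instance, copy_table_back public.tbl_lottu_drop lottu Postgres201 prints the pipeline for this example, and setting DRY_RUN=0 runs it. The arguments are interpolated unquoted, so this is only suitable for simple names like the ones used here.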

Going further

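The following explains how to find the time of the mistake, that is, the value used in recovery_target_time = '2018-04-28 20:47:31.617808+08'. In the example above it was captured in advance.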

  1. Use pg_xlogdump (renamed pg_waldump in PostgreSQL 10) to decode this range of WAL.

[postgres@Postgres201 pg_xlog]$ pg_xlogdump -b 00000003000000000000000F 000000030000000000000012 > lottu.log
pg_xlogdump: FATAL:  error in WAL record at 0/12000648: invalid record length at 0/12000680: wanted 24, got 0

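  2. The following records can be found in lottu.log. (The FATAL message above only marks the end of valid WAL in the segment that is still being written; the records before it are decoded normally.)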

rmgr: Transaction len (rec/tot):      8/    34, tx:       1689, lsn: 0/100244A0, prev 0/10024460, desc: COMMIT 2018-04-28 20:45:49.736013 CST
rmgr: Standby     len (rec/tot):     24/    50, tx:          0, lsn: 0/100244C8, prev 0/100244A0, desc: RUNNING_XACTS nextXid 1690 latestCompletedXid 1689 oldestRunningXid 1690
rmgr: Heap        len (rec/tot):      3/  3130, tx:       1690, lsn: 0/10024500, prev 0/100244C8, desc: INSERT off 9
    blkref #0: rel 16385/16386/2619 fork main blk 15 (FPW); hole: offset: 60, length: 5116

rmgr: Btree       len (rec/tot):      2/  7793, tx:       1690, lsn: 0/10025140, prev 0/10024500, desc: INSERT_LEAF off 385
    blkref #0: rel 16385/16386/2696 fork main blk 1 (FPW); hole: offset: 1564, length: 452

rmgr: Heap        len (rec/tot):      2/   184, tx:       1690, lsn: 0/10026FD0, prev 0/10025140, desc: INPLACE off 16
    blkref #0: rel 16385/16386/1259 fork main blk 0
rmgr: Transaction len (rec/tot):     88/   114, tx:       1690, lsn: 0/10027088, prev 0/10026FD0, desc: COMMIT 2018-04-28 20:46:37.718442 CST; inval msgs: catcache 49 catcache 45 catcache 44 relcache 32784
rmgr: Standby     len (rec/tot):     24/    50, tx:          0, lsn: 0/10027100, prev 0/10027088, desc: RUNNING_XACTS nextXid 1691 latestCompletedXid 1690 oldestRunningXid 1691
rmgr: Standby     len (rec/tot):     24/    50, tx:          0, lsn: 0/10027138, prev 0/10027100, desc: RUNNING_XACTS nextXid 1691 latestCompletedXid 1690 oldestRunningXid 1691
rmgr: XLOG        len (rec/tot):     80/   106, tx:          0, lsn: 0/10027170, prev 0/10027138, desc: CHECKPOINT_ONLINE redo 0/10027138; tli 3; prev tli 3; fpw true; xid 0:1691; oid 40976; multi 1; offset 0; oldest xid 1668 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 1691; online
rmgr: Standby     len (rec/tot):     24/    50, tx:          0, lsn: 0/100271E0, prev 0/10027170, desc: RUNNING_XACTS nextXid 1691 latestCompletedXid 1690 oldestRunningXid 1691
rmgr: Standby     len (rec/tot):     24/    50, tx:          0, lsn: 0/10027218, prev 0/100271E0, desc: RUNNING_XACTS nextXid 1691 latestCompletedXid 1690 oldestRunningXid 1691
rmgr: XLOG        len (rec/tot):     80/   106, tx:          0, lsn: 0/10027250, prev 0/10027218, desc: CHECKPOINT_ONLINE redo 0/10027218; tli 3; prev tli 3; fpw true; xid 0:1691; oid 40976; multi 1; offset 0; oldest xid 1668 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 1691; online
rmgr: XLOG        len (rec/tot):      0/    24, tx:          0, lsn: 0/100272C0, prev 0/10027250, desc: SWITCH 
rmgr: Standby     len (rec/tot):     24/    50, tx:          0, lsn: 0/11000028, prev 0/100272C0, desc: RUNNING_XACTS nextXid 1691 latestCompletedXid 1690 oldestRunningXid 1691
rmgr: Standby     len (rec/tot):     16/    42, tx:       1691, lsn: 0/11000060, prev 0/11000028, desc: LOCK xid 1691 db 16386 rel 32784 
rmgr: Heap        len (rec/tot):      8/  2963, tx:       1691, lsn: 0/11000090, prev 0/11000060, desc: DELETE off 16 KEYS_UPDATED 
    blkref #0: rel 16385/16386/1247 fork main blk 8 (FPW); hole: offset: 88, length: 5288

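Following OID 32784 (tbl_lottu_drop) through the log: transaction 1690, which committed at 2018-04-28 20:46:37.718442, is the last transaction recorded against the table before the drop, so the recovery point 2018-04-28 20:47:31.617808+08 chosen earlier is fine. Transaction 1691 then locks relation 32784 and begins deleting catalog rows: this is the DROP TABLE.

recovery.conf can therefore also be written with a transaction ID as the target. Because recovery_target_inclusive = false stops recovery just before the target transaction commits, the target must be the dropping transaction, 1691. Targeting 1690 with inclusive = false would discard the changes of transaction 1690 itself.

restore_command = 'cp /data/arch/%f %p'            # e.g. 'cp /mnt/server/archivedir/%f %p'
recovery_target_xid = '1691'
recovery_target_inclusive = false
recovery_target_timeline = 'latest'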
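
Picking the relevant COMMIT records out of lottu.log by eye is tedious on a busy system. The sketch below filters them by relation OID. It is an illustration only: find_commits_for_rel is a hypothetical helper name, and it assumes the pg_xlogdump text layout shown above, where a COMMIT record lists its invalidation messages as "relcache OID".

```shell
#!/bin/sh
# Hypothetical helper (not from the original article).
# find_commits_for_rel OID LOGFILE
# Print "xid commit-timestamp" for every COMMIT record in LOGFILE whose
# invalidation messages mention "relcache OID".
find_commits_for_rel() {
    oid=$1
    log=$2
    grep 'desc: COMMIT' "$log" | grep -w "relcache $oid" |
        sed -E 's/.*tx: *([0-9]+),.*desc: COMMIT ([0-9-]+ [0-9:.]+) .*/\1 \2/'
}
```

Running find_commits_for_rel 32784 lottu.log on the excerpt above prints the line for transaction 1690. The output is a list of candidate transactions to inspect, not a decision: each one still has to be checked against the surrounding records before picking a recovery target.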

 

