Pt-Archiver 使用說明:
工具介紹:
Pt-Archiver 是一款數據庫單表數據導出,清理,歸檔工具
官方文檔地址:
https://www.percona.com/doc/percona-toolkit/LATEST/pt-archiver.html
常用配置說明:
-where 'id<3000' 設置操作條件
--limit 10000 每次取1000行數據給pt-archive處理
--txn-size 1000 設置1000行為一個事務提交一次
--progress 5000 每處理5000行輸出一次處理信息
--statistics 結束的時候給出統計信息:開始的時間點,結束的時間點,查詢的行數,歸檔的行數,刪除的行數,以及各個階段消耗的總的時間和比例,便於以此進行優化。只要不加上--quiet,默認情況下pt-archive都會輸出執行過程的
--charset=UTF8 指定字符集為UTF8,字符集需要對應當前庫的字符集來操作
--no-delete 表示不刪除原來的數據,注意:如果不指定此參數,所有處理完成后,都會清理原表中的數據
--bulk-delete 批量刪除source上的舊數據
--bulk-insert 批量插入數據到dest主機 (看dest的general log發現它是通過在dest主機上LOAD DATA LOCAL INFILE插入數據的)
--dry-run 模擬執行
--source 源數據
--dest 目標數據
注意事項:
--source --dest 后的DSN之間不能空格出現,否則會出錯。
--where條件的值,有字符串的,要用引號括起來。
DSN 是指 --source --dest 的參數
工作原理:
181214 10:55:55 36 Connect root@10.10.11.16 on platform-unitive
36 Query set autocommit=0
36 Query /*!40101 SET NAMES "UTF8"*/
36 Query SHOW VARIABLES LIKE 'wait\_timeout'
36 Query SET SESSION wait_timeout=10
36 Query SELECT @@SQL_MODE
36 Query SET @@SQL_QUOTE_SHOW_CREATE = 1/*!40101, @@SQL_MODE='NO_AUTO_VALUE_ON_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION'*/
36 Query SELECT version()
36 Query SHOW VARIABLES LIKE 'version%'
36 Query SHOW ENGINES
36 Query SHOW VARIABLES LIKE 'innodb_version'
36 Query show variables like 'innodb_rollback_on_timeout'
36 Query /*!40101 SET @OLD_SQL_MODE := @@SQL_MODE, @@SQL_MODE := '', @OLD_QUOTE := @@SQL_QUOTE_SHOW_CREATE, @@SQL_QUOTE_SHOW_CREATE := 1 */
36 Query USE `platform-unitive`
36 Query SHOW CREATE TABLE `platform-unitive`.`u_push_message`
36 Query /*!40101 SET @@SQL_MODE := @OLD_SQL_MODE, @@SQL_QUOTE_SHOW_CREATE := @OLD_QUOTE */
36 Query SELECT CONCAT(/*!40100 @@session.character_set_connection, */ "")
37 Connect root@10.10.11.16 on platform-unitive
37 Query set autocommit=0
37 Query /*!40101 SET NAMES "UTF8"*/
37 Query SHOW VARIABLES LIKE 'wait\_timeout'
37 Query SET SESSION wait_timeout=10
37 Query SELECT @@SQL_MODE
37 Query SET @@SQL_QUOTE_SHOW_CREATE = 1/*!40101, @@SQL_MODE='NO_AUTO_VALUE_ON_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION'*/
37 Query SELECT version()
37 Query SHOW VARIABLES LIKE 'version%'
37 Query SHOW ENGINES
37 Query SHOW VARIABLES LIKE 'innodb_version'
37 Query show variables like 'innodb_rollback_on_timeout'
37 Query /*!40101 SET @OLD_SQL_MODE := @@SQL_MODE, @@SQL_MODE := '', @OLD_QUOTE := @@SQL_QUOTE_SHOW_CREATE, @@SQL_QUOTE_SHOW_CREATE := 1 */
37 Query USE `platform-unitive`
37 Query SHOW CREATE TABLE `platform-unitive`.`u_push_message_2018`
37 Query /*!40101 SET @@SQL_MODE := @OLD_SQL_MODE, @@SQL_QUOTE_SHOW_CREATE := @OLD_QUOTE */
37 Query SELECT CONCAT(/*!40100 @@session.character_set_connection, */ "")
36 Query SHOW VARIABLES LIKE 'wsrep_on'
36 Query SHOW VARIABLES LIKE 'wsrep_on'
36 Query SHOW VARIABLES LIKE 'version%'
36 Query SHOW ENGINES
36 Query SHOW VARIABLES LIKE 'innodb_version'
36 Query SELECT MAX(`id`) FROM `platform-unitive`.`u_push_message`
36 Query SELECT CONCAT(@@hostname, @@port)
37 Query SELECT CONCAT(@@hostname, @@port)
36 Query SELECT /*!40001 SQL_NO_CACHE */ `id`,`app_codes`,`device_type`,`msg_subject`,`msg_body`,`push_time`,`offline_days`,`expire_time`,`push_type`,`device_ids`,`group_codes`,`uids`,`click_action`,`click_uri`,`msg_content`,`message_roam_type`,`msg_type`,`priority`,`msg_batch_no`,`need_time_scope`,`query_id`,`status`,`push_count`,`creater`,`create_time`,`modifier`,`modify_time`,`version`,`schedule_flag`,`notice_url`,`msg_channel`,`is_need_pull` FROM `platform-unitive`.`u_push_message` FORCE INDEX(`PRIMARY`) WHERE (id <= 640040) AND (`id` < '23928907') ORDER BY `id` LIMIT 2
36 Query SELECT 'pt-archiver keepalive'
37 Query commit
36 Query commit
37 Query LOAD DATA LOCAL INFILE '/tmp/3xbPFCYChYpt-archiver' INTO TABLE `platform-unitive`.`u_push_message_2018`CHARACTER SET UTF8(`id`,`app_codes`,`device_type`,`msg_subject`,`msg_body`,`push_time`,`offline_days`,`expire_time`,`push_type`,`device_ids`,`group_codes`,`uids`,`click_action`,`click_uri`,`msg_content`,`message_roam_type`,`msg_type`,`priority`,`msg_batch_no`,`need_time_scope`,`query_id`,`status`,`push_count`,`creater`,`create_time`,`modifier`,`modify_time`,`version`,`schedule_flag`,`notice_url`,`msg_channel`,`is_need_pull`)
36 Query SELECT 'pt-archiver keepalive'
36 Query DELETE FROM `platform-unitive`.`u_push_message` WHERE (((`id` >= '640021'))) AND (((`id` <= '640022'))) AND (id <= 640040) LIMIT 2
37 Query commit
36 Query commit
使用場景一:
u_push_message 表按照推送時間90天前數據 歸檔到 u_push_message_2018
1,創建u_push_message_2018 表結構與u_push_message 完全一致,不需要額外索引
2,編寫執行腳本
pt-archiver --source h=10.10.11.16,P=3309,u=root,p='kws@1234',D=platform-unitive,t=u_push_message --dest h=10.10.11.16,P=3309,u=root,p='kws@1234',D=platform-unitive,t=u_push_message_2018 --charset=UTF8 --where 'push_time <= 1536854400000' --progress 10000 --limit=10000 --txn-size 10000 --bulk-insert --bulk-delete --statistics --purge
3,增加--dry-run 去服務器檢查執行
pt-archiver --source h=10.10.11.16,P=3309,u=root,p='kws@1234',D=platform-unitive,t=u_push_message --dest h=10.10.11.16,P=3309,u=root,p='kws@1234',D=platform-unitive,t=u_push_message_2018 --charset=UTF8 --where 'push_time <= 1536854400000' --progress 10000 --limit=10000 --txn-size 10000 --bulk-insert --bulk-delete --statistics --purge --dry-run
4.檢查輸出日志是否符合預期:
SELECT /*!40001 SQL_NO_CACHE */ `id`,`app_codes`,`device_type`,`msg_subject`,`msg_body`,`push_time`,`offline_days`,`expire_time`,`push_type`,`device_ids`,`group_codes`,`uids`,`click_action`,`click_uri`,`msg_content`,`message_roam_type`,`msg_type`,`priority`,`msg_batch_no`,`need_time_scope`,`query_id`,`status`,`push_count`,`creater`,`create_time`,`modifier`,`modify_time`,`version`,`schedule_flag`,`notice_url`,`msg_channel`,`is_need_pull` FROM `platform-unitive`.`u_push_message` FORCE INDEX(`PRIMARY`) WHERE (push_time <= 1536854400000) AND (`id` < '23928907') ORDER BY `id` LIMIT 10000
SELECT /*!40001 SQL_NO_CACHE */ `id`,`app_codes`,`device_type`,`msg_subject`,`msg_body`,`push_time`,`offline_days`,`expire_time`,`push_type`,`device_ids`,`group_codes`,`uids`,`click_action`,`click_uri`,`msg_content`,`message_roam_type`,`msg_type`,`priority`,`msg_batch_no`,`need_time_scope`,`query_id`,`status`,`push_count`,`creater`,`create_time`,`modifier`,`modify_time`,`version`,`schedule_flag`,`notice_url`,`msg_channel`,`is_need_pull` FROM `platform-unitive`.`u_push_message` FORCE INDEX(`PRIMARY`) WHERE (push_time <= 1536854400000) AND (`id` < '23928907') AND ((`id` >= ?)) ORDER BY `id` LIMIT 10000
DELETE FROM `platform-unitive`.`u_push_message` WHERE (((`id` >= ?))) AND (((`id` <= ?))) AND (push_time <= 1536854400000) LIMIT 10000
LOAD DATA LOCAL INFILE ? INTO TABLE `platform-unitive`.`u_push_message_2018`CHARACTER SET UTF8(`id`,`app_codes`,`device_type`,`msg_subject`,`msg_body`,`push_time`,`offline_days`,`expire_time`,`push_type`,`device_ids`,`group_codes`,`uids`,`click_action`,`click_uri`,`msg_content`,`message_roam_type`,`msg_type`,`priority`,`msg_batch_no`,`need_time_scope`,`query_id`,`status`,`push_count`,`creater`,`create_time`,`modifier`,`modify_time`,`version`,`schedule_flag`,`notice_url`,`msg_channel`,`is_need_pull`)
5,編寫執行計划腳本每天3點執行: 替換 --where 'push_time <= 1536854400000' 中的push_time時間
Shell 腳本示例:
#/bin/bash
#時間變量
month=`date -d "10 day ago" "+%Y-%m-%d 00:00:00"`
lock_time=`date -d "60 day ago" "+%Y-%m-%d"`
timeStamp=`date -d "$month" +%s`
currentTimeStamp=$((timeStamp*1000))
#數據庫變量
source_data=platform-unitive
source_table=u_push_message
out_data=platform-unitive
out_table=u_push_message_2018
out_dir=/tmp/"$out_data"
out_time=`date "+%Y_%m_%d"`
out_file="$out_dir"/"$out_table"_archiver_log_"$out_time"
#判斷文件夾是否存在,不存在就創建
if [ ! -d "$out_dir" ]; then
/bin/mkdir "$out_dir" -p
fi
#數據歸檔,判斷進程是否存在,不存在就開始,存在就不開始.
count=`ps -ef|grep "/usr/bin/pt-archiver"|grep "t=$out_table,"|grep -v grep|wc -l`
echo $count
if [ "$count" -eq 0 ]
then
echo "start process....."
/usr/bin/pt-archiver --source h=10.10.11.16,P=3309,u=root,p='kws@1234',D=$source_data,t=$source_table --dest h=10.10.11.16,P=3309,u=root,p='kws@1234',t=$out_table,D=$out_data --charset=UTF8 --where "push_time <= '$currentTimeStamp'" --progress 10000 --limit=10000 --txn-size 10000 --bulk-insert --bulk-delete --purge --statistics >>"$out_file" 2>&1 &
else
echo "runing....."
fi
