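Our Zabbix instance monitors nearly 300 hosts and has been running for a year. Recently the Zabbix server's internal checks (history data processing and similar) have kept raising alerts about the server itself, and query performance has dropped noticeably.
Two parameters in the Zabbix server configuration file control history cleanup.
You can either disable the housekeeper so that it no longer purges history data automatically on a schedule (deleting large volumes of rows directly hurts Zabbix performance), or tune the parameters below.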
HousekeepingFrequency
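Range: 0-24
Default: 1
Description: how often the housekeeper runs. By default it deletes expired data once an hour. After a server restart the first run happens 30 minutes later, and it then repeats every hour.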
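To make the timing concrete, here is a minimal Python sketch of the schedule described above. The function name `housekeeper_runs` is hypothetical, and treating a frequency of 0 as "automatic housekeeping disabled" is an assumption about what the lower bound of the range means; check the documentation for your Zabbix version.

```python
from datetime import datetime, timedelta

def housekeeper_runs(server_start, frequency_hours=1, count=3):
    """Return the first `count` housekeeper run times after a server start.

    Follows the description above: first run 30 minutes after start,
    then one run every `frequency_hours` hours.
    Assumption: a frequency of 0 means no automatic runs at all.
    """
    if frequency_hours == 0:
        return []
    first = server_start + timedelta(minutes=30)
    return [first + timedelta(hours=frequency_hours * i) for i in range(count)]

# Server restarted at 04:00 with the default frequency of 1 hour.
for run in housekeeper_runs(datetime(2018, 1, 9, 4, 0)):
    print(run.strftime("%H:%M"))  # 04:30, 05:30, 06:30
```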
MaxHousekeeperDelete
Range: 0-1000000
Default: 5000
Description: the housekeeper deletes no more than MaxHousekeeperDelete rows per task in one housekeeping cycle.
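Database optimization
I. Enable file-per-table tablespaces (innodb_file_per_table=1) # Enabled by default from MySQL 5.6 onward; skip this section on those versions
1. Clear the history data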
[root@Zabbix-Server ~]# mysql -u zabbix -p
MariaDB [(none)]> use zabbix;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MariaDB [zabbix]>
MariaDB [zabbix]> truncate table history;
Query OK, 0 rows affected (0.19 sec)

MariaDB [zabbix]> optimize table history;
+----------------+----------+----------+-------------------------------------------------------------------+
| Table          | Op       | Msg_type | Msg_text                                                          |
+----------------+----------+----------+-------------------------------------------------------------------+
| zabbix.history | optimize | note     | Table does not support optimize, doing recreate + analyze instead |
| zabbix.history | optimize | status   | OK                                                                |
+----------------+----------+----------+-------------------------------------------------------------------+
2 rows in set (0.81 sec)

MariaDB [zabbix]> truncate table history_str;
Query OK, 0 rows affected (0.05 sec)

MariaDB [zabbix]> truncate table history_uint;
Query OK, 0 rows affected (6.32 sec)
2. Modify the table structure
MariaDB [(none)]> use zabbix;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MariaDB [zabbix]> Alter table history_text drop primary key, add index (id), drop index history_text_2, add index history_text_2 (itemid, id);
Query OK, 0 rows affected (1.11 sec)
Records: 0  Duplicates: 0  Warnings: 0

MariaDB [zabbix]> Alter table history_log drop primary key, add index (id), drop index history_log_2, add index history_log_2 (itemid, id);
Query OK, 0 rows affected (0.14 sec)
Records: 0  Duplicates: 0  Warnings: 0
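Once the tables are altered, create the four stored procedures as described on the official site:
3. Copy the four separate code snippets from the official site into a single file, save it as a .sql file, and import it into the database: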

cat /root/zabbix-partition.sql
DELIMITER $$
CREATE PROCEDURE `partition_create`(SCHEMANAME varchar(64), TABLENAME varchar(64), PARTITIONNAME varchar(64), CLOCK int)
BEGIN
    /*
       SCHEMANAME = The DB schema in which to make changes
       TABLENAME = The table with partitions to potentially delete
       PARTITIONNAME = The name of the partition to create
    */
    /*
       Verify that the partition does not already exist
    */
    DECLARE RETROWS INT;
    SELECT COUNT(1) INTO RETROWS
    FROM information_schema.partitions
    WHERE table_schema = SCHEMANAME AND table_name = TABLENAME AND partition_description >= CLOCK;

    IF RETROWS = 0 THEN
        /*
           1. Print a message indicating that a partition was created.
           2. Create the SQL to create the partition.
           3. Execute the SQL from #2.
        */
        SELECT CONCAT( "partition_create(", SCHEMANAME, ",", TABLENAME, ",", PARTITIONNAME, ",", CLOCK, ")" ) AS msg;
        SET @sql = CONCAT( 'ALTER TABLE ', SCHEMANAME, '.', TABLENAME, ' ADD PARTITION (PARTITION ', PARTITIONNAME, ' VALUES LESS THAN (', CLOCK, '));' );
        PREPARE STMT FROM @sql;
        EXECUTE STMT;
        DEALLOCATE PREPARE STMT;
    END IF;
END$$
DELIMITER ;

DELIMITER $$
CREATE PROCEDURE `partition_drop`(SCHEMANAME VARCHAR(64), TABLENAME VARCHAR(64), DELETE_BELOW_PARTITION_DATE BIGINT)
BEGIN
    /*
       SCHEMANAME = The DB schema in which to make changes
       TABLENAME = The table with partitions to potentially delete
       DELETE_BELOW_PARTITION_DATE = Delete any partitions with names that are dates older than this one (yyyy-mm-dd)
    */
    DECLARE done INT DEFAULT FALSE;
    DECLARE drop_part_name VARCHAR(16);

    /*
       Get a list of all the partitions that are older than the date
       in DELETE_BELOW_PARTITION_DATE. All partitions are prefixed with
       a "p", so use SUBSTRING TO get rid of that character.
    */
    DECLARE myCursor CURSOR FOR
        SELECT partition_name
        FROM information_schema.partitions
        WHERE table_schema = SCHEMANAME AND table_name = TABLENAME AND CAST(SUBSTRING(partition_name FROM 2) AS UNSIGNED) < DELETE_BELOW_PARTITION_DATE;
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;

    /*
       Create the basics for when we need to drop the partition. Also, create
       @drop_partitions to hold a comma-delimited list of all partitions that
       should be deleted.
    */
    SET @alter_header = CONCAT("ALTER TABLE ", SCHEMANAME, ".", TABLENAME, " DROP PARTITION ");
    SET @drop_partitions = "";

    /*
       Start looping through all the partitions that are too old.
    */
    OPEN myCursor;
    read_loop: LOOP
        FETCH myCursor INTO drop_part_name;
        IF done THEN
            LEAVE read_loop;
        END IF;
        SET @drop_partitions = IF(@drop_partitions = "", drop_part_name, CONCAT(@drop_partitions, ",", drop_part_name));
    END LOOP;
    IF @drop_partitions != "" THEN
        /*
           1. Build the SQL to drop all the necessary partitions.
           2. Run the SQL to drop the partitions.
           3. Print out the table partitions that were deleted.
        */
        SET @full_sql = CONCAT(@alter_header, @drop_partitions, ";");
        PREPARE STMT FROM @full_sql;
        EXECUTE STMT;
        DEALLOCATE PREPARE STMT;

        SELECT CONCAT(SCHEMANAME, ".", TABLENAME) AS `table`, @drop_partitions AS `partitions_deleted`;
    ELSE
        /*
           No partitions are being deleted, so print out "N/A" (Not applicable) to indicate
           that no changes were made.
        */
        SELECT CONCAT(SCHEMANAME, ".", TABLENAME) AS `table`, "N/A" AS `partitions_deleted`;
    END IF;
END$$
DELIMITER ;

DELIMITER $$
CREATE PROCEDURE `partition_maintenance`(SCHEMA_NAME VARCHAR(32), TABLE_NAME VARCHAR(32), KEEP_DATA_DAYS INT, HOURLY_INTERVAL INT, CREATE_NEXT_INTERVALS INT)
BEGIN
    DECLARE OLDER_THAN_PARTITION_DATE VARCHAR(16);
    DECLARE PARTITION_NAME VARCHAR(16);
    DECLARE OLD_PARTITION_NAME VARCHAR(16);
    DECLARE LESS_THAN_TIMESTAMP INT;
    DECLARE CUR_TIME INT;

    CALL partition_verify(SCHEMA_NAME, TABLE_NAME, HOURLY_INTERVAL);
    SET CUR_TIME = UNIX_TIMESTAMP(DATE_FORMAT(NOW(), '%Y-%m-%d 00:00:00'));

    SET @__interval = 1;
    create_loop: LOOP
        IF @__interval > CREATE_NEXT_INTERVALS THEN
            LEAVE create_loop;
        END IF;

        SET LESS_THAN_TIMESTAMP = CUR_TIME + (HOURLY_INTERVAL * @__interval * 3600);
        SET PARTITION_NAME = FROM_UNIXTIME(CUR_TIME + HOURLY_INTERVAL * (@__interval - 1) * 3600, 'p%Y%m%d%H00');
        IF(PARTITION_NAME != OLD_PARTITION_NAME) THEN
            CALL partition_create(SCHEMA_NAME, TABLE_NAME, PARTITION_NAME, LESS_THAN_TIMESTAMP);
        END IF;
        SET @__interval=@__interval+1;
        SET OLD_PARTITION_NAME = PARTITION_NAME;
    END LOOP;

    SET OLDER_THAN_PARTITION_DATE=DATE_FORMAT(DATE_SUB(NOW(), INTERVAL KEEP_DATA_DAYS DAY), '%Y%m%d0000');
    CALL partition_drop(SCHEMA_NAME, TABLE_NAME, OLDER_THAN_PARTITION_DATE);
END$$
DELIMITER ;

DELIMITER $$
CREATE PROCEDURE `partition_verify`(SCHEMANAME VARCHAR(64), TABLENAME VARCHAR(64), HOURLYINTERVAL INT(11))
BEGIN
    DECLARE PARTITION_NAME VARCHAR(16);
    DECLARE RETROWS INT(11);
    DECLARE FUTURE_TIMESTAMP TIMESTAMP;

    /*
     * Check if any partitions exist for the given SCHEMANAME.TABLENAME.
     */
    SELECT COUNT(1) INTO RETROWS
    FROM information_schema.partitions
    WHERE table_schema = SCHEMANAME AND table_name = TABLENAME AND partition_name IS NULL;

    /*
     * If partitions do not exist, go ahead and partition the table
     */
    IF RETROWS = 1 THEN
        /*
         * Take the current date at 00:00:00 and add HOURLYINTERVAL to it. This is the timestamp below which we will store values.
         * We begin partitioning based on the beginning of a day. This is because we don't want to generate a random partition
         * that won't necessarily fall in line with the desired partition naming (ie: if the hour interval is 24 hours, we could
         * end up creating a partition now named "p201403270600" when all other partitions will be like "p201403280000").
         */
        SET FUTURE_TIMESTAMP = TIMESTAMPADD(HOUR, HOURLYINTERVAL, CONCAT(CURDATE(), " ", '00:00:00'));
        SET PARTITION_NAME = DATE_FORMAT(CURDATE(), 'p%Y%m%d%H00');

        -- Create the partitioning query
        SET @__PARTITION_SQL = CONCAT("ALTER TABLE ", SCHEMANAME, ".", TABLENAME, " PARTITION BY RANGE(`clock`)");
        SET @__PARTITION_SQL = CONCAT(@__PARTITION_SQL, "(PARTITION ", PARTITION_NAME, " VALUES LESS THAN (", UNIX_TIMESTAMP(FUTURE_TIMESTAMP), "));");

        -- Run the partitioning query
        PREPARE STMT FROM @__PARTITION_SQL;
        EXECUTE STMT;
        DEALLOCATE PREPARE STMT;
    END IF;
END$$
DELIMITER ;
[root@Zabbix-Server ~]# mysql -u zabbix -p zabbix
Enter password:
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 48790
Server version: 5.5.52-MariaDB MariaDB Server

Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [zabbix]> use zabbix;
Database changed
MariaDB [zabbix]> source /root/zabbix-partition.sql;
Query OK, 0 rows affected (0.04 sec)

Query OK, 0 rows affected (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

MariaDB [zabbix]> CALL partition_maintenance('zabbix', 'history_log', 28, 24, 14);
+---------------------------------------------------------------+
| msg                                                           |
+---------------------------------------------------------------+
| partition_create(zabbix,history_log,p201801100000,1515600000) |
+---------------------------------------------------------------+
1 row in set (0.18 sec)

+---------------------------------------------------------------+
| msg                                                           |
+---------------------------------------------------------------+
| partition_create(zabbix,history_log,p201801110000,1515686400) |
+---------------------------------------------------------------+
1 row in set (0.48 sec)

+---------------------------------------------------------------+
| msg                                                           |
+---------------------------------------------------------------+
| partition_create(zabbix,history_log,p201801120000,1515772800) |
+---------------------------------------------------------------+
1 row in set (0.67 sec)

+---------------------------------------------------------------+
| msg                                                           |
+---------------------------------------------------------------+
| partition_create(zabbix,history_log,p201801130000,1515859200) |
+---------------------------------------------------------------+
1 row in set (1.02 sec)

+---------------------------------------------------------------+
| msg                                                           |
+---------------------------------------------------------------+
| partition_create(zabbix,history_log,p201801140000,1515945600) |
+---------------------------------------------------------------+
1 row in set (1.22 sec)

+---------------------------------------------------------------+
| msg                                                           |
+---------------------------------------------------------------+
| partition_create(zabbix,history_log,p201801150000,1516032000) |
+---------------------------------------------------------------+
1 row in set (1.44 sec)

+---------------------------------------------------------------+
| msg                                                           |
+---------------------------------------------------------------+
| partition_create(zabbix,history_log,p201801160000,1516118400) |
+---------------------------------------------------------------+
1 row in set (1.64 sec)

+---------------------------------------------------------------+
| msg                                                           |
+---------------------------------------------------------------+
| partition_create(zabbix,history_log,p201801170000,1516204800) |
+---------------------------------------------------------------+
1 row in set (1.85 sec)

+---------------------------------------------------------------+
| msg                                                           |
+---------------------------------------------------------------+
| partition_create(zabbix,history_log,p201801180000,1516291200) |
+---------------------------------------------------------------+
1 row in set (2.04 sec)

+---------------------------------------------------------------+
| msg                                                           |
+---------------------------------------------------------------+
| partition_create(zabbix,history_log,p201801190000,1516377600) |
+---------------------------------------------------------------+
1 row in set (2.23 sec)

+---------------------------------------------------------------+
| msg                                                           |
+---------------------------------------------------------------+
| partition_create(zabbix,history_log,p201801200000,1516464000) |
+---------------------------------------------------------------+
1 row in set (2.42 sec)

+---------------------------------------------------------------+
| msg                                                           |
+---------------------------------------------------------------+
| partition_create(zabbix,history_log,p201801210000,1516550400) |
+---------------------------------------------------------------+
1 row in set (2.62 sec)

+---------------------------------------------------------------+
| msg                                                           |
+---------------------------------------------------------------+
| partition_create(zabbix,history_log,p201801220000,1516636800) |
+---------------------------------------------------------------+
1 row in set (2.85 sec)

+--------------------+--------------------+
| table              | partitions_deleted |
+--------------------+--------------------+
| zabbix.history_log | N/A                |
+--------------------+--------------------+
1 row in set (3.10 sec)

Query OK, 0 rows affected, 1 warning (3.10 sec)
4. Partition the tables you want partitioned
DELIMITER $$
CREATE PROCEDURE `partition_maintenance_all`(SCHEMA_NAME VARCHAR(32))
BEGIN
    CALL partition_maintenance(SCHEMA_NAME, 'history', 7, 24, 14);
    CALL partition_maintenance(SCHEMA_NAME, 'history_log', 7, 24, 14);
    CALL partition_maintenance(SCHEMA_NAME, 'history_str', 7, 24, 14);
    CALL partition_maintenance(SCHEMA_NAME, 'history_text', 7, 24, 14);
    CALL partition_maintenance(SCHEMA_NAME, 'history_uint', 7, 24, 14);
    CALL partition_maintenance(SCHEMA_NAME, 'trends', 365, 24, 14);
    CALL partition_maintenance(SCHEMA_NAME, 'trends_uint', 365, 24, 14);
END$$
DELIMITER ;
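The arguments in the calls above are: (schema name, table name, number of days of data to keep, interval in hours between partitions, number of partitions to create in this run)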
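To see what these arguments produce, the sketch below mirrors the naming and boundary arithmetic of the create loop in partition_maintenance in Python. It is only an illustration: `partition_plan` is a hypothetical helper, and the UTC+8 time zone is an assumption inferred from the timestamps in the sample output earlier in this post. The real procedure also skips partitions that already exist, which this sketch does not model.

```python
from datetime import datetime, timedelta, timezone

def partition_plan(today, hourly_interval, create_next_intervals):
    """List (partition_name, less_than_unix_timestamp) pairs.

    `today` must be timezone-aware; it is truncated to midnight,
    like CUR_TIME in the stored procedure.
    """
    midnight = today.replace(hour=0, minute=0, second=0, microsecond=0)
    plan = []
    for i in range(1, create_next_intervals + 1):
        start = midnight + timedelta(hours=hourly_interval * (i - 1))
        end = midnight + timedelta(hours=hourly_interval * i)
        # The name comes from the interval start, the boundary from its end.
        plan.append((start.strftime("p%Y%m%d%H00"), int(end.timestamp())))
    return plan

cst = timezone(timedelta(hours=8))  # assumed server time zone (UTC+8)
plan = partition_plan(datetime(2018, 1, 9, tzinfo=cst), 24, 14)
print(plan[1])   # ('p201801100000', 1515600000)
print(plan[-1])  # ('p201801220000', 1516636800)
```

With a 24-hour interval and 14 intervals, the last boundary lies 14 days after midnight of the day the procedure runs.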
mysql> source /root/partition_maintenance_all.sql;
Query OK, 0 rows affected (0.00 sec)

mysql> CALL partition_maintenance_all('zabbix');
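5. Housekeeper settings
All of the options are under "Administration" in the Zabbix user interface. Make sure "Housekeeping" is selected in the drop-down list in the upper-right corner, then check the following:
- Make sure the "Enable internal housekeeping" checkbox is unchecked for both History and Trends.
- Make sure the "Override item <trend/history> period" checkbox is checked for both History and Trends.
- Set the "Data storage period (in days)" field for History and Trends to how long you keep each of them. For the partitioning set up above, the correct values are 7 and 365.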
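6. Add a scheduled task
Do not let the database run out of partitions. The example above creates 14 days of partitions in advance; on day 15 the database can no longer insert history/trend data, and data is lost.
So rerun these stored procedures at regular intervals (via cron or some other method). That way a partition always exists for incoming data.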
#Q-2018-1/9
30 4 * * 1 /usr/bin/mysql -uzabbix -pzabbix -e "use zabbix;" -e "CALL partition_maintenance_all('zabbix');"
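As a rough way to reason about how much margin a given schedule leaves, the Python sketch below computes how many whole days remain before data would fall past the newest partition boundary. Both helper names and the 7-day threshold are hypothetical; the boundary value would come from the largest partition_description in information_schema.partitions for the table in question.

```python
SECONDS_PER_DAY = 86400

def days_of_headroom(last_boundary_ts, now_ts):
    """Whole days left before the newest partition boundary is reached."""
    return max(0, (last_boundary_ts - now_ts) // SECONDS_PER_DAY)

def needs_maintenance(last_boundary_ts, now_ts, min_days=7):
    """True when fewer than `min_days` days of partitions remain."""
    return days_of_headroom(last_boundary_ts, now_ts) < min_days

# Newest boundary from the sample run above, checked at the start of that day.
last_boundary = 1516636800
run_day = 1515427200
print(days_of_headroom(last_boundary, run_day))                         # 14
print(needs_maintenance(last_boundary, run_day + 8 * SECONDS_PER_DAY))  # True
```

With a weekly cron job and 14 partitions created per run, about 7 days of partitions are still available just before each run, so a single missed run does not immediately cause data loss.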
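In production, after the setup above had been running for a while, the Zabbix server log started reporting the error below: duplicate primary keys in the events table. The primary key could not advance, so Zabbix could not raise alerts.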
2581:20180208:213930.461 [Z3005] query failed: [1062] Duplicate entry '8703' for key 'PRIMARY' [insert into events (eventid,source,object,objectid,clock,ns,value) values (8703,0,0,19518,1518097170,457297996,1);
The following command deletes the records in the events table:
[root@Zabbix-Server zabbix]# mysql -u zabbix -pzabbix -e "use zabbix;" -e 'delete from events';
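If you want to delete all the data in a table, TRUNCATE is faster than DELETE.
This is because TRUNCATE drops the table and recreates it from its definition, whereas DELETE removes the rows and does not attempt to modify the table itself.
TRUNCATE is fast, but unlike DELETE it is not transaction-safe.
Also note that emptying a table in MySQL does not reclaim disk space by default (see step 1).
Commands to reclaim table space: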
optimize table history
optimize table history_uint
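Depending on the MySQL storage engine, OPTIMIZE TABLE removes fragmentation, reclaims unused space, and moves fragmented data and indexes back together (defragmentation), which helps I/O speed.
OPTIMIZE TABLE locks the table while it runs, so it should not be called frequently from application code. See http://www.cnblogs.com/w787815/p/8433548.html
Zabbix community documentation for reference: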
https://www.zabbix.org/wiki/Docs/howto/mysql_partition