加個order by使得查詢效率提高500倍


很簡單的三個表:

p248_user記錄用戶信息

CREATE TABLE `p248_user` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`list_ids` varchar(4000) NOT NULL DEFAULT '',
`email` varchar(255) NOT NULL,
`mobile` varchar(20) NOT NULL,
`_created` datetime NOT NULL,
`_updated` datetime NOT NULL,
`hb_status` tinyint(4) DEFAULT '0',
`sb_status` tinyint(4) DEFAULT '0',
`unsubscribe_email_status` tinyint(4) DEFAULT '0',
`unsubscribe_sms_status` tinyint(4) DEFAULT '0',
`hb_time` datetime DEFAULT NULL,
`unsubscribe_email_time` datetime DEFAULT NULL,
`unsubscribe_sms_time` datetime DEFAULT NULL,
`_create_operator_name` varchar(100) DEFAULT NULL,
`_update_operator_name` varchar(100) DEFAULT NULL,
`_create_operator_email` varchar(100) DEFAULT NULL,
`_update_operator_email` varchar(100) DEFAULT NULL,
`name` varchar(255) NOT NULL DEFAULT '',
`time` varchar(255) NOT NULL DEFAULT '',
`year` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
UNIQUE KEY `u1` (`email`,`mobile`) USING BTREE,
KEY `_updated` (`_updated`),
KEY `mobile` (`mobile`)
) ENGINE=InnoDB AUTO_INCREMENT=5596286 DEFAULT CHARSET=utf8

p248_list記錄組信息

CREATE TABLE `p248_list` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) NOT NULL,
`status` enum('active','delete') DEFAULT 'active',
`_created` datetime NOT NULL,
`_updated` datetime NOT NULL,
`user_count` int(11) DEFAULT '0',
`lock_status` int(11) NOT NULL DEFAULT '0',
`lock_reason` varchar(100) DEFAULT NULL,
`lock_time` datetime DEFAULT NULL,
`import_percent` int(11) DEFAULT NULL,
`hb_count` int(11) DEFAULT '0',
`sb_count` int(11) DEFAULT '0',
`unsubscribe_email_count` int(11) DEFAULT '0',
`unsubscribe_sms_count` int(11) DEFAULT '0',
`_create_operator_name` varchar(100) DEFAULT NULL,
`_update_operator_name` varchar(100) DEFAULT NULL,
`_create_operator_email` varchar(100) DEFAULT NULL,
`_update_operator_email` varchar(100) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `_updated` (`_updated`)
) ENGINE=InnoDB AUTO_INCREMENT=30 DEFAULT CHARSET=utf8

p248_user_list是個多對多的表,記錄用戶屬於哪些組

CREATE TABLE `p248_user_list` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL,
`list_id` int(11) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `user_list_id` (`user_id`,`list_id`),
KEY `list_id` (`list_id`)
) ENGINE=InnoDB AUTO_INCREMENT=5646298 DEFAULT CHARSET=utf8

 

 

p248_user有200萬條記錄, p248_user_list有1000萬條記錄。

 

現在要找出屬於29分組,並且手機號碼不為空,並且沒有退訂的用戶。這樣的用戶大約有100萬個。現在要把這些用戶按照4000個一批放到一群臨時的記錄集里。

 

這個要用到分頁了,一開始的想法:

第一頁:

SELECT `id`, `email`, `mobile`, `_created`, `_updated`, `_create_operator_name`, `_update_operator_name`, `name`, `time`, `year` FROM `p248_user` WHERE 1 = 1 AND id IN (SELECT DISTINCT user_id FROM `p248_user_list` WHERE list_id IN (29)) AND unsubscribe_sms_status = 0 AND mobile <> '' LIMIT 0, 4000;

第二頁就LIMIT 4000, 4000。第三頁就LIMIT 8000, 4000。依次類推。

結果這個SQL查詢耗時用了整整5秒。

 

分析一下這個查詢:

mysql> explain SELECT `id`, `email`, `mobile`, `_created`, `_updated`, `_create_operator_name`, `_update_operator_name`, `name`, `time`, `year` FROM `p248_user` WHERE 1 = 1 AND id IN (SELECT DISTINCT user_id FROM `p248_user_list` WHERE list_id IN (29)) AND unsubscribe_sms_status = 0 AND mobile <> '' LIMIT 0, 4000;
+----+-------------+----------------+--------+----------------------+--------------+---------+-----------------------------+--------+------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------+--------+----------------------+--------------+---------+-----------------------------+--------+------------------------------------+
| 1 | SIMPLE | p248_user | range | PRIMARY,mobile | mobile | 62 | NULL | 934446 | Using index condition; Using where |
| 1 | SIMPLE | p248_user_list | eq_ref | user_list_id,list_id | user_list_id | 8 | contacts.p248_user.id,const | 1 | Using index |
+----+-------------+----------------+--------+----------------------+--------------+---------+-----------------------------+--------+------------------------------------+
2 rows in set (0.00 sec)

可以看到用戶表掃描了93萬行,幾乎是全表掃描了。也就是把所有符合條件的結果都取了出來然后再取前4000條。

 

 

把上面的查詢加上了ORDER BY `id`,結果查詢耗時僅0.01秒,查詢速度足足提高了500倍。

為什么會這樣呢?

分析一下新的查詢:

mysql> explain SELECT `id`, `email`, `mobile`, `_created`, `_updated`, `_create_operator_name`, `_update_operator_name`, `name`, `time`, `year` FROM `p248_user` WHERE 1 = 1 AND id IN (SELECT DISTINCT user_id FROM `p248_user_list` WHERE list_id IN (29)) AND unsubscribe_sms_status = 0 AND mobile <> '' ORDER BY `id` LIMIT 0, 4000;
+----+-------------+----------------+--------+----------------------+--------------+---------+-----------------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------+--------+----------------------+--------------+---------+-----------------------------+------+-------------+
| 1 | SIMPLE | p248_user | index | PRIMARY,mobile | PRIMARY | 4 | NULL | 7999 | Using where |
| 1 | SIMPLE | p248_user_list | eq_ref | user_list_id,list_id | user_list_id | 8 | contacts.p248_user.id,const | 1 | Using index |
+----+-------------+----------------+--------+----------------------+--------------+---------+-----------------------------+------+-------------+
2 rows in set (0.00 sec)

這次用戶表僅掃描了8000行。也就是查詢先使用了主鍵索引,掃描完前4000條符合條件的記錄就直接結束了。

 

 

那取第二頁呢:

mysql> explain SELECT `id`, `email`, `mobile`, `_created`, `_updated`, `_create_operator_name`, `_update_operator_name`, `name`, `time`, `year` FROM `p248_user` WHERE 1 = 1 AND id IN (SELECT DISTINCT user_id FROM `p248_user_list` WHERE list_id IN (29)) AND unsubscribe_sms_status = 0 AND mobile <> '' ORDER BY `id` LIMIT 4000, 4000;
+----+-------------+----------------+--------+----------------------+--------------+---------+-----------------------------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------+--------+----------------------+--------------+---------+-----------------------------+-------+-------------+
| 1 | SIMPLE | p248_user | index | PRIMARY,mobile | PRIMARY | 4 | NULL | 15999 | Using where |
| 1 | SIMPLE | p248_user_list | eq_ref | user_list_id,list_id | user_list_id | 8 | contacts.p248_user.id,const | 1 | Using index |
+----+-------------+----------------+--------+----------------------+--------------+---------+-----------------------------+-------+-------------+
2 rows in set (0.00 sec)

這次就要掃描16000行了,因為前4000條是第一頁的沒用扔掉了。

 

 

這樣的話頁數越大查詢就會越耗時。

 

 

但實際上可以換個方法:

第一次查詢結束時,得到最后一條記錄的user id, 比如是6500。

第二次查詢的時候用這個user_id作為條件去匹配

SELECT `id`, `email`, `mobile`, `_created`, `_updated`, `_create_operator_name`, `_update_operator_name`, `name`, `time`, `year` FROM `p248_user` WHERE id > 6500 AND id IN (SELECT DISTINCT user_id FROM `p248_user_list` WHERE list_id IN (29)) AND unsubscribe_sms_status = 0 AND mobile <> '' ORDER BY `id` LIMIT 0, 4000;

這樣掃描的行數和第一頁依然是一樣的。

直到最后一頁也是如此,耗時不會有任何明顯的下降。

 


免責聲明!

本站轉載的文章為個人學習借鑒使用,本站對版權不負任何法律責任。如果侵犯了您的隱私權益,請聯系本站郵箱yoyou2525@163.com刪除。



 
粵ICP備18138465號   © 2018-2025 CODEPRJ.COM