Implementing fuzzy (LIKE) queries on a MySQL table with tens of millions of rows

Reposted article. 2021-08-19 16:12. Tags: fuzzy query / mysql

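Last night, while tossing and turning in bed, it hit me: a covering index plus a manual "back-to-table" lookup. No matter how many columns the query needs, a single ordinary index is enough.

So, can LIKE fuzzy queries be done on a table with tens of millions of rows?

Yes.

Enough talk, let's try it.

Create the table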

create table emp
(
    id       int unsigned auto_increment
        primary key,
    empno    mediumint unsigned default 0  not null,
    ename    varchar(20)        default '' not null,
    job      varchar(9)         default '' not null,
    mgr      mediumint unsigned default 0  not null,
    hiredate date                          not null,
    sal      decimal(7, 2)                 not null,
    comm     decimal(7, 2)                 not null,
    deptno   mediumint unsigned default 0  not null
)
    charset = utf8;

Import ten million rows

The method is described in a separate linked post.

bigdata> select count(*) from emp
[2021-08-19 11:08:25] 1 row retrieved starting from 1 in 2 s 900 ms (execution: 2 s 874 ms, fetching: 26 ms)

Fuzzy query without an index

bigdata> select ename, empno, job from emp where ename like '%S%'
[2021-08-19 11:14:25] 2,765,363 rows retrieved starting from 1 in 9 s 360 ms (execution: 8 ms, fetching: 9 s 352 ms)

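Patterns with only a trailing wildcard (for example like 'S%') are not considered here; it is well known that they can use an index.

Bring out the secret weapon: a covering index

Unfortunately, I got stuck right at the index creation step. The table already held ten million rows, and creating the index directly froze the machine. I looked up solutions along the way and found a good summary, but I didn't use it 😄 I simply truncated the table, clearing both its data and its index contents.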
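
For reference, one commonly suggested alternative to truncating is to build the index online. The statement below is only a sketch and is not what was done in this experiment. It assumes MySQL 5.6 or later with InnoDB, where alter table accepts the algorithm and lock clauses, and it may still run for a long time on ten million rows.

```sql
# Assumption: MySQL 5.6+ with InnoDB. Builds the index in place and
# asks MySQL to keep the table readable and writable during the build.
# If the requested algorithm or lock level is not supported, the
# statement fails with an error instead of silently locking the table.
alter table emp
    add index emp_ename_idx (ename),
    algorithm = inplace,
    lock = none;
```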

Check that the table data and index are really gone

use information_schema;
# Data size of a given table in a given database
select concat(round(sum(DATA_LENGTH/1024/1024),2),'MB') as data  from TABLES where table_schema='bigdata' and table_name='emp';
# Index size of a given table in a given database
SELECT CONCAT(ROUND(SUM(index_length)/(1024*1024), 2), 'MB') AS 'Total Index Size' FROM TABLES  WHERE table_schema = 'bigdata' and table_name='emp';

Create the index

create index emp_ename_idx on emp (ename);

Import the data again

Call insert_emp10000(0,10000000);
[2021-08-19 14:18:53] completed in 2 h 22 m 37 s 90 ms

That took a really long time...

Try the LIKE fuzzy query with the index in place

bigdata> select ename from emp where ename like '%S%'
[2021-08-19 14:37:40] 2,093,321 rows retrieved starting from 1 in 5 s 128 ms (execution: 34 ms, fetching: 5 s 94 ms)

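Covering index, better performance

You can confirm with desc/explain that the query uses the index. I won't go deep into the theory; it is a covering index: the only column the query needs, ename, is stored in emp_ename_idx, so MySQL scans the index and never reads the table rows.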
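
As a rough sketch of that check, explain can be run on the same statement. The expectations in the comments describe what MySQL typically reports for this kind of query and are an assumption here, not output captured in this experiment; the actual plan depends on the server version and table statistics.

```sql
# Expected (assumption): key = emp_ename_idx and Extra contains
# "Using index", meaning the result is produced from the index alone.
# type is usually "index" (a full index scan), because a leading
# wildcard rules out a range scan.
explain select ename from emp where ename like '%S%';
```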

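Compared with the earlier run, the covering index made the query nearly twice as fast (9.4 s down to 5.1 s). Note that the data was regenerated in between, so the row counts differ (2,765,363 vs 2,093,321), and the earlier query also returned empno and job.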

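But you may say: is that it? What's the point? Nobody queries a single column in practice. Yet putting every queried column into an index isn't realistic either: indexes take storage too, and if every returned column is indexed, the indexes alone could end up several times larger than the table data.

So what now?

To be honest, this is exactly what I was thinking about last night, and I was too excited to sleep.

The key is this: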

bigdata> select id, ename from emp where ename like '%S%'
[2021-08-19 14:48:11] 2,093,321 rows retrieved starting from 1 in 4 s 685 ms (execution: 9 ms, fetching: 4 s 676 ms)

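That's right, just one more column, id. (Returning only id works too. Assuming the table uses InnoDB, the default engine, every secondary index entry also stores the primary key, so both select id and select id, ename are answered from emp_ename_idx alone, with no table lookup by MySQL.)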
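
To check how far the index covers a query, the explain output of a few variants can be compared. This is a sketch assuming the emp table uses InnoDB; the comments describe the plans one would typically expect, not output captured in this experiment.

```sql
# id only: covered by emp_ename_idx (Extra should contain "Using index").
explain select id from emp where ename like '%S%';

# id plus ename: both values live in the same index entries, so this is
# expected to be covered as well.
explain select id, ename from emp where ename like '%S%';

# Adding job, which is not in the index: "Using index" should disappear,
# and MySQL has to read the table rows.
explain select id, ename, job from emp where ename like '%S%';
```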

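What is id good for? id can locate rows precisely! (Doesn't this feel like doing the "back-to-table" lookup yourself, outside MySQL?)

Like this, with a second query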

bigdata> select id, ename... from emp where id in (497723, 670849, 1371884, 1934742, 1960444, 2165983)
[2021-08-19 15:45:23] 6 rows retrieved starting from 1 in 78 ms (execution: 23 ms, fetching: 55 ms)

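Now we have both the speed and the data: the second query touches only the few ids that are actually needed, each through a primary key lookup.

Building on this, pagination can be done in memory with little risk of running out of memory, because only the list of matching ids is held (about 2.09 million unsigned 4-byte ints here, roughly 8 MB of raw data).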
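
If the id list should not be held in the application at all, the same two-step idea can be written as a single statement, a pattern often called a deferred join. This is a swapped-in variant, not the in-memory paging mentioned above. The page size of 20, the offset of 40 (third page), and the selected columns are illustrative values only.

```sql
# Inner query: find one page of matching ids using only emp_ename_idx.
# Ordering by (ename, id) follows the order of the index entries under
# InnoDB (an assumption about the engine) and gives a stable page order.
# Outer query: fetch the full rows for just those ids by primary key.
select e.id, e.ename, e.empno, e.job
from emp e
join (
    select id
    from emp
    where ename like '%S%'
    order by ename, id
    limit 20 offset 40
) as p on p.id = e.id
order by e.ename, e.id;
```

Deep offsets still walk all the skipped index entries, so this variant helps most on the first pages.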

Add a cache on top and performance improves further, though the cost is just as obvious: even more complexity.
