數據庫索引調優技巧

本文轉載自查看原文 2020-12-29 15:12 534 T.MySQL

前面介紹了數據庫優化索引，這里我們介紹數據庫索引調優

長字段的索引調優

使用組合索引的技巧

覆蓋索引

排序優化

冗余、重復索引的優化

1、長字段的索引調優

selelct * from employees where first_name = ' Facello' 假設 first_name 的字段長度很長，如大於200個字符，那么索引占用的空間也會很大，

作用在超長字段的索引查詢效率也不高。

解決方法：額外創建個字段，比如first_name_hash int default 0 not null. first_name的hashcode

insert into employees value (999999, now(), 'zhangsan...','zhang','M',now(), CRC32('zhangsan...'));

first_name_hash的值應該具備以下要求

1) 字段的長度應該比較小，SHA1和MD5是不合適的

2) 應當盡量避免hash沖突，就目前來說，流行使用CRC32(),或者FNV64()

修改后的SQL selelct * from employees where first_name_hash = CRC32(zhangsan...) and first_name = ' Facello'

並且給 first_name_hash設置所有，並帶上 first_name = ' Facello' 為了解決hash沖突也能返回正確的結果。

--長字段調優

selelct * from employees where first_name like ' Facello%'

如果是like，就不能使用上面的調優方法。

解決方法：前綴索引

alter table employees add key (first_name(5)) 這里的5是如何確定的，能不能其它數字呢？

索引選擇性 = 不重復的索引值/數據表的總記錄數

數值越大，表示選擇性越高，性能越好。

select count(distince first_name)/count(*) from employees; -- 返回的值為0。0043 完整列的選擇性 0.0043 【這個字段的最大選擇性】

select count(distinct left(first_name,5)) / count(*) from  employees;  -- 返回結果 0.0038
select count(distinct left(first_name,6)) / count(*) from  employees;  -- 返回結果 0.0041
select count(distinct left(first_name,7)) / count(*) from  employees;  -- 返回結果 0.0042
select count(distinct left(first_name,8)) / count(*) from  employees;  -- 返回結果 0.0042
select count(distinct left(first_name,9)) / count(*) from  employees;  -- 返回結果 0.0042
select count(distinct left(first_name,10)) / count(*) from  employees;  -- 返回結果 0.0042
select count(distinct left(first_name,11)) / count(*) from  employees;  -- 返回結果 0.0043，說明 為大於等於11時，返回 0.0043
select count(distinct left(first_name,12)) / count(*) from  employees;  -- 返回結果 0.0043

　說明為大於等於11時，返回 0.0043

結論：前綴索引的長度設置為11

alter table employees add key (first_name(11))

優點：前綴索引可以讓索引更小，更加高效，而且對上層應用是透明的。應用不需要做任何改造，使用成本較低。

這是一種比較容易落地的優化方案。

局限性：無法做order by、group by；無法使用覆蓋索引。

使用場景：后綴索引，MySql是沒有后綴索引的

額外創建一個字段，比如說first_name_reverse, 在存儲的時候，把first_name的值翻轉過來再存儲。

比方Facello 變成 ollecaF存儲到first_name_reverse

2、單例索引 vs 組合索引

explain select * from salaries where from_date = '1986-06-26' and to_date = '1987-06-26';

salaries表沒有索引， explain后type為All All為全表掃描，查詢時間為1s537ms

創建兩個索引(單例索引)

type為index_merge，

Extra為 Using intersect(salaries_to_date_index,salaries_from_date_index); Using where

查詢時間為229ms

修改成組合索引index(from_date, to_date)

type : 為const 。說明組合索引比上面的單例索引性能好一些。

查詢時間為215ms。性能差異不大

總結：

SQL存在多個條件，多個單列索引，會使用索引合並

如果出現索引合並，往往說明索引不夠合理。

如果SQL暫時沒有性能問題，暫時可以不管。

組合索引要注意索引列順序[最左前綴原則]

補充：

3、覆蓋索引

什么是覆蓋索引：對應索引X，SELECT的字段只需從索引就能獲得，而無需到表數據里獲取，這樣的索引就叫覆蓋索引。

索引index(from_date, to_date)

索引無法覆蓋查詢字段時

explain select * from salaries where from_date = '1986-06-26' and to_date = '1987-06-26';

type: ref

rows: 86

extra: null

索引能覆蓋查詢字段時

explain select from_date, to_date from salaries where from_date = '1986-06-26' and to_date = '1987-06-26';

type: ref

rows: 86

extra: using index

使用覆蓋索引時，並不會改變SQL的執行過程。但是extra會顯示using index

總結：

覆蓋索引能提交SQL的性能

Select盡量只返回想要的字段(使用覆蓋索引，減少網絡傳輸的開銷)

4、重復索引

索引是有開銷的。增刪改的時候，索引的維護開銷。索引越多，開銷越大。條件允許的情況下，盡量少創建索引。

重復索引：

在相同的列上按照相同的順序創建的索引。

create table test_table(
    id int not null primary key auto_increment,
    a int not null ,
    b int not null ,
    unique (id),
    index (id)
) ENGINE = InnoDB;

主鍵，唯一索引，普通索引。唯一索引在普通索引的基礎上，增加了唯一性約束。主鍵在唯一索引的基礎上增加了非空約束。相對於在Id的字段上創建了三個重復的索引。一般來說，重復索引是需要避免的。
如果發現有重復索引，也應該刪掉重復索引。

上面發生了重復索引，改進方案：

create table test_table(
    id int not null primary key auto_increment,
    a int not null ,
    b int not null 
) ENGINE = InnoDB;

刪除唯一索引和普通索引，值保留主鍵索引。

5、冗余索引(針對B-Tree和B+Tree來說的)

如果已經存在索引index(A,B), 又創建了index(A), 那么index(A) 就是index(A,B)的冗余索引。

一般要避免冗余索引。但有特例，一定要避免掉進陷阱里。

explain select * from salaries where from_date = '1986-06-26' order by emp_no;

索引index(from_date): type=ref extra=null。使用了索引。

索引index(from_date) 某種意義上來說就相當於index(from_date, emp_no)

修改索引index(from_date,to_date)再次執行：

explain select * from salaries where from_date = '1986-06-26' order by emp_no;

索引index(from_date,to_date) type=ref extra=Using filesort 說明order by子句無法使用索引。

索引index(from_date, to_date) 某種意義上來說就相當於index(from_date, to_date, emp_no), 不符合最左前綴原則，所以order by子句無法使用索引。

6、未使用的索引

某個索引根本未曾使用

累贅，刪除。

免責聲明！

本站轉載的文章為個人學習借鑒使用，本站對版權不負任何法律責任。如果侵犯了您的隱私權益，請聯系本站郵箱yoyou2525@163.com刪除。

猜您在找 數據庫調優的方法 MySQL數據庫配置調優數據庫性能調優(轉) SQL Server調優系列進階篇（如何維護數據庫索引）性能調優 | 如何通過性能調優突破 MySQL 數據庫性能瓶頸？數據庫性能調優之始： analyze統計信息數據庫SQL調優的幾種方式(轉) Oracle數據庫應用系統調優的方法數據庫連接池調優性能測試之Mysql數據庫調優