MySQL Deadlock Case Study


A module in a recent project reproduced a MySQL deadlock reliably. This article records why the deadlock occurs and how it was resolved.

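1. Background

1.1 Table locks and row locks

  • Table locks

Table locks are the most basic locking strategy in MySQL and the one with the lowest overhead. A table lock locks the entire table. Before any write (insert/delete/update), a session must acquire the write lock, and write locks block each other. Readers can acquire a read lock only when no write lock is held, and read locks do not block each other.

  • Row locks (InnoDB only)

Row-level locks give the best support for concurrency, and also carry the highest locking overhead. Row-level locks are implemented only in the storage engine, not in the MySQL server layer. The server layer knows nothing about how the storage engine implements them.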

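1.2 Row lock overview

InnoDB lock modes include the intention locks (IS/IX), the shared lock (S), the exclusive lock (X) and the auto-increment lock (AUTO-INC). Strictly speaking, IS/IX and AUTO-INC are table-level locks, while S and X are the modes used on rows.

Depending on the situation, row locks are further divided into the Next-Key Lock, the Gap Lock, the Record Lock and the Insert Intention (gap) Lock. Each type locks a different range: a record lock locks only the record itself, a gap lock locks the gap between two records, and a next-key lock locks a record together with the gap before it. The figure below roughly shows the range covered by each type.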

[Figure: ranges locked by the different row lock types]
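The three ranges can be written down as a small sketch. The helper names `next_key_ranges` and `covers` are made up for illustration, and the model assumes a single integer index whose upper end (InnoDB's supremum pseudo-record) is represented by +inf.

```python
import math

def next_key_ranges(keys):
    """Return (gap_low, record) pairs: each next-key lock covers (gap_low, record]."""
    ranges = []
    low = -math.inf
    for key in sorted(keys):
        ranges.append((low, key))
        low = key
    # The last range ends at the "supremum" pseudo-record, modelled here as +inf.
    ranges.append((low, math.inf))
    return ranges

def covers(lock_type, gap_low, record, value):
    """Tell whether a lock of the given type on `record` covers index value `value`."""
    if lock_type == "record":
        return value == record            # the record only
    if lock_type == "gap":
        return gap_low < value < record   # the open gap before the record
    if lock_type == "next-key":
        return gap_low < value <= record  # the gap plus the record
    raise ValueError(f"unknown lock type: {lock_type}")

print(next_key_ranges([10, 20, 30]))
```

With records 10 and 20, a gap lock on 20 covers 15 but not 20 itself, while a next-key lock on 20 covers both.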

In addition, each lock type is marked in the deadlock log as follows:

  • Record lock (LOCK_REC_NOT_GAP): lock_mode X locks rec but not gap
  • Gap lock (LOCK_GAP): lock_mode X locks gap before rec
  • Next-key lock (LOCK_ORDINARY): lock_mode X
  • Insert intention lock (LOCK_INSERT_INTENTION): lock_mode X locks gap before rec insert intention
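When reading long monitor output it can help to apply these markers mechanically. The following is a minimal sketch, assuming the log lines look like the ones in this article. `classify_lock` is a hypothetical helper, not part of any MySQL tooling.

```python
def classify_lock(line):
    """Map a RECORD LOCKS line from the InnoDB status output to a lock type name."""
    if not line.startswith("RECORD LOCKS"):
        return "not a record lock"
    # Check the most specific marker first: insert intention lines also
    # contain "locks gap before rec".
    if "insert intention" in line:
        return "insert intention lock"
    if "locks rec but not gap" in line:
        return "record lock"
    if "locks gap before rec" in line:
        return "gap lock"
    if "lock_mode" in line:
        return "next-key lock"  # plain "lock_mode X" or "lock_mode X waiting"
    return "unknown"

print(classify_lock("RECORD LOCKS space id 92 page no 4 trx id 91322 lock_mode X waiting"))
```

The order of the checks matters, because the insert intention marker contains the gap lock marker as a prefix.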

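1.3 Row locking examples

InnoDB tables are organized as a clustered index: the leaf nodes of the primary key B+ tree store the primary key together with the row data. The leaf nodes of an InnoDB secondary index store the primary key value instead, so a query through a secondary index must use the primary key it finds to look up the row in the clustered index a second time.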

update user set age = 10 where id = 49;
update user set age = 10 where name = 'Tom';

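(1) The first statement looks up the row by primary key, so it only needs a write lock (X lock) on the primary key entry id=49;
(2) The second statement looks up the row through a secondary index. It first takes a write lock on the entry name='Tom', then uses the primary key it obtained to query the clustered index and takes a write lock on the primary key entry id=49.

[Figure: locks taken by the two update statements]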
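The difference between the two statements can be sketched with two dictionaries standing in for the indexes. Everything here is illustrative: the `user` rows, the `name_index` contents and the function names are assumptions, and gap locks are left out.

```python
# Hypothetical in-memory stand-ins for the two indexes of the `user` table.
clustered = {49: {"name": "Tom", "age": 20}}  # primary key -> row data
name_index = {"Tom": [49]}                    # secondary key -> primary keys

def locks_for_update_by_id(pk):
    """Locks for: update user set age = 10 where id = <pk>."""
    return [("PRIMARY", pk)] if pk in clustered else []

def locks_for_update_by_name(name):
    """Locks for: update user set age = 10 where name = <name>."""
    locks = []
    for pk in name_index.get(name, []):
        locks.append(("name", name))   # first the secondary index entry
        locks.append(("PRIMARY", pk))  # then the clustered index record
    return locks

print(locks_for_update_by_id(49))
print(locks_for_update_by_name("Tom"))
```

The second statement ends up locking the same clustered index record as the first, plus the secondary index entry it went through.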

The discussion above covers a single row. For multiple rows:

update user set age = 10 where id > 49;

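Execution steps:

(1) MySQL Server reads the first record that satisfies the where condition; the InnoDB engine returns the row and locks it;
(2) MySQL Server issues the update request for that row and updates the record;
(3) Steps (1) and (2) repeat until every record that satisfies the condition has been modified.

[Figure: row-by-row locking and updating for a range update]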
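The loop in steps (1) to (3) can be sketched as a generator (the engine) driven by a for loop (the server). This is a simplified model with made-up names (`engine_scan`, `server_update`); it ignores gap locks and only records which rows get locked.

```python
def engine_scan(rows, predicate, locked):
    """Engine side: return matching rows one at a time, locking each before returning it."""
    for pk in sorted(rows):
        if predicate(pk):
            locked.append(pk)  # step (1): lock the row, then hand it to the server
            yield pk

def server_update(rows, predicate, new_age):
    """Server side: fetch a row, update it, and repeat until the scan is exhausted."""
    locked = []
    for pk in engine_scan(rows, predicate, locked):
        rows[pk]["age"] = new_age  # step (2): update the row just returned
    return locked                  # step (3) is the loop itself

rows = {48: {"age": 1}, 49: {"age": 2}, 50: {"age": 3}, 51: {"age": 4}}
locked = server_update(rows, lambda pk: pk > 49, 10)
print(locked)
```

Locks accumulate one row at a time as the scan proceeds, which is why a transaction can hold some of the locks it needs while waiting for the next one.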

2. Preparation

2.1 Create and initialize the table

create table dead_lock_test
(
    id int auto_increment
        primary key,
    v1 int not null,
    v2 int not null
);

insert into  dead_lock_test (v1,v2) value (1,1);
insert into  dead_lock_test (v1,v2) value (2,2);
insert into  dead_lock_test (v1,v2) value (3,3);

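Note that the table has only a primary key index. The storage engine is the default, InnoDB, and the transaction isolation level is RR (REPEATABLE READ, which, unlike RC, uses gap locks to prevent phantom reads).

2.2 Enable the lock monitor

Use the following statements to enable the MySQL lock monitor: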

# enable
set GLOBAL innodb_status_output=ON;
set GLOBAL innodb_status_output_locks=ON;

# disable
set GLOBAL innodb_status_output=OFF;
set GLOBAL innodb_status_output_locks=OFF;

3. Reproducing the scenario

Open two database connections and run the following SQL statements, one set per connection:

# session1
start transaction ;
insert into  dead_lock_test (v1,v2) value (4,4);
delete from dead_lock_test where v1 = 4 and v2 = 4;
commit;

# session2
start transaction;
insert into  dead_lock_test (v1,v2) value (5,5);
delete from dead_lock_test where v1 = 5 and v2 = 5;
commit;

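Please don't ask why the transaction holds only two statements, or why the insert followed by a delete isn't simply a rollback (I don't know why it was written this way either).

The transactions execute in the order shown in the table below: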

session1                                             | session2                                             | stage  | note
start transaction;                                   | start transaction;                                   |        |
insert into dead_lock_test (v1,v2) value (4,4);      | do nothing                                           |        | succeeds
do nothing                                           | insert into dead_lock_test (v1,v2) value (5,5);      | stage1 | succeeds
delete from dead_lock_test where v1 = 4 and v2 = 4;  | do nothing                                           | stage2 | session1 blocks
do nothing                                           | delete from dead_lock_test where v1 = 5 and v2 = 5;  | stage3 | session2 reports a deadlock

3.1 stage1

Run show engine innodb status; an excerpt of the transaction section follows:

------------
TRANSACTIONS
------------
Trx id counter 91328
Purge done for trx's n:o < 91327 undo n:o < 0 state: running but idle
History list length 19
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 91327, ACTIVE 37 sec
1 lock struct(s), heap size 1136, 0 row lock(s), undo log entries 1
MySQL thread id 24, OS thread handle 15668, query id 3147 localhost 127.0.0.1 root
TABLE LOCK table `igw_proxy_rule_management`.`dead_lock_test` trx id 91327 lock mode IX
---TRANSACTION 91322, ACTIVE 44 sec
1 lock struct(s), heap size 1136, 0 row lock(s), undo log entries 1
MySQL thread id 23, OS thread handle 22788, query id 3103 localhost 127.0.0.1 root
TABLE LOCK table `igw_proxy_rule_management`.`dead_lock_test` trx id 91322 lock mode IX

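The excerpt shows the current transactions. Two transactions are running, with trx id 91322 and 91327.

  • TABLE LOCK table igw_proxy_rule_management.dead_lock_test trx id 91322 lock mode IX: an IX lock has been taken on the dead_lock_test table.

Transaction 91322 corresponds to session1 and transaction 91327 to session2.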

3.2 stage2

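After running delete from dead_lock_test where v1 = 4 and v2 = 4; in session1, the transaction blocks.
Run show engine innodb status; an excerpt of the transaction section follows.

Because the output is long, the explanations are added directly into it as comment lines (marked with *).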

------------
TRANSACTIONS
------------
Trx id counter 91332
Purge done for trx's n:o < 91332 undo n:o < 0 state: running but idle
History list length 21
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 91327, ACTIVE 58 sec
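* 2 lock struct(s): the lock list of transaction 91327 has length 2 (each list node is one lock structure held by the transaction, such as a table lock or a record lock); here the transaction holds the table lock (IX) and one row lock (a record lock);
* 1 row lock(s): number of row locks held by the transaction;
* undo log entries 1: number of undo log entries of the transaction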
2 lock struct(s), heap size 1136, 1 row lock(s), undo log entries 1
MySQL thread id 24, OS thread handle 15668, query id 3147 localhost 127.0.0.1 root
* TABLE LOCK: the table lock held by the transaction (IX)
TABLE LOCK table `igw_proxy_rule_management`.`dead_lock_test` trx id 91327 lock mode IX
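* RECORD LOCKS: the row lock held by the transaction (lock_mode X locks rec but not gap); in stage1 the insert showed 0 row lock(s) because its lock was implicit, and it became this explicit lock once transaction 91322 tried to lock the same record
* space id 92: id of the tablespace that holds dead_lock_test
* page no 4: number of the page that holds the record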
RECORD LOCKS space id 92 page no 4 n bits 72 index PRIMARY of table `igw_proxy_rule_management`.`dead_lock_test` trx id 91327 lock_mode X locks rec but not gap
* Row lock details: heap no=6
Record lock, heap no 6 PHYSICAL RECORD: n_fields 5; compact format; info bits 0

 0: len 4; hex 80000005; asc     ;; * hex 80000005: id of the locked record (id=5)
 1: len 6; hex 0000000164bf; asc     d ;;   * hex 0000000164bf: transaction ID (0x164bf = 91327);
 2: len 7; hex 81000000b20110; asc        ;;    * hex 81000000b20110: rollback pointer;
 3: len 4; hex 80000005; asc     ;; * hex 80000005: value of column v1;
 4: len 4; hex 80000005; asc     ;; * hex 80000005: value of column v2;

---TRANSACTION 91322, ACTIVE 65 sec fetching rows
* tables in use 1: one table is in use;
* locked 1: one table is locked
mysql tables in use 1, locked 1
* LOCK WAIT: transaction 91322 is waiting for a lock; the other fields are explained above
LOCK WAIT 5 lock struct(s), heap size 1136, 6 row lock(s), undo log entries 2
MySQL thread id 23, OS thread handle 22788, query id 3199 localhost 127.0.0.1 root updating
* SQL statement that transaction 91322 is currently executing
/* ApplicationName=DataGrip 2021.1.1 */ delete from dead_lock_test where v1 = 4 and v2 = 4
* Details of the lock that transaction 91322 is waiting for
------- TRX HAS BEEN WAITING 8 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 92 page no 4 n bits 72 index PRIMARY of table `igw_proxy_rule_management`.`dead_lock_test` trx id 91322 lock_mode X waiting
* The lock that transaction 91322 is waiting for (on the record with primary key 5, held by transaction 91327)
Record lock, heap no 6 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
 0: len 4; hex 80000005; asc     ;;
 1: len 6; hex 0000000164bf; asc     d ;;
 2: len 7; hex 81000000b20110; asc        ;;
 3: len 4; hex 80000005; asc     ;;
 4: len 4; hex 80000005; asc     ;;

------------------
* The following lists the locks that transaction 91322 holds and the lock it is trying to acquire; first the table intention lock (IX)
TABLE LOCK table `igw_proxy_rule_management`.`dead_lock_test` trx id 91322 lock mode IX
RECORD LOCKS space id 92 page no 4 n bits 72 index PRIMARY of table `igw_proxy_rule_management`.`dead_lock_test` trx id 91322 lock_mode X
* Next-key lock (lock_mode X) on the record with primary key 1
Record lock, heap no 2 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
 0: len 4; hex 80000001; asc     ;;
 1: len 6; hex 0000000164b3; asc     d ;;
 2: len 7; hex 81000000ad0110; asc        ;;
 3: len 4; hex 80000001; asc     ;;
 4: len 4; hex 80000001; asc     ;;

* Next-key lock (lock_mode X) on the record with primary key 2
Record lock, heap no 3 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
 0: len 4; hex 80000002; asc     ;;
 1: len 6; hex 0000000164b4; asc     d ;;
 2: len 7; hex 82000000ad0110; asc        ;;
 3: len 4; hex 80000002; asc     ;;
 4: len 4; hex 80000002; asc     ;;

* Next-key lock (lock_mode X) on the record with primary key 3
Record lock, heap no 4 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
 0: len 4; hex 80000003; asc     ;;
 1: len 6; hex 0000000164b9; asc     d ;;
 2: len 7; hex 81000000b00110; asc        ;;
 3: len 4; hex 80000003; asc     ;;
 4: len 4; hex 80000003; asc     ;;

* Record lock on the record with primary key 4 (the lock created when the row was inserted)
RECORD LOCKS space id 92 page no 4 n bits 72 index PRIMARY of table `igw_proxy_rule_management`.`dead_lock_test` trx id 91322 lock_mode X locks rec but not gap
Record lock, heap no 5 PHYSICAL RECORD: n_fields 5; compact format; info bits 32
 0: len 4; hex 80000004; asc     ;;
 1: len 6; hex 0000000164ba; asc     d ;;
 2: len 7; hex 020000011a03cb; asc        ;;
 3: len 4; hex 80000004; asc     ;;
 4: len 4; hex 80000004; asc     ;;

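* Gap lock on the gap before the record with primary key 4 (created by the delete; it applies under RR and mainly serves to prevent phantom reads)
* Note that InnoDB does not physically remove a deleted record; it marks the record as deleted (info bits 32 below) and the space is reclaimed later, so a delete can be thought of as similar to an update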
RECORD LOCKS space id 92 page no 4 n bits 72 index PRIMARY of table `igw_proxy_rule_management`.`dead_lock_test` trx id 91322 lock_mode X locks gap before rec
Record lock, heap no 5 PHYSICAL RECORD: n_fields 5; compact format; info bits 32
 0: len 4; hex 80000004; asc     ;;
 1: len 6; hex 0000000164ba; asc     d ;;
 2: len 7; hex 020000011a03cb; asc        ;;
 3: len 4; hex 80000004; asc     ;;
 4: len 4; hex 80000004; asc     ;;

* The lock that transaction 91322 is trying to acquire (held by transaction 91327)
RECORD LOCKS space id 92 page no 4 n bits 72 index PRIMARY of table `igw_proxy_rule_management`.`dead_lock_test` trx id 91322 lock_mode X waiting
Record lock, heap no 6 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
 0: len 4; hex 80000005; asc     ;;
 1: len 6; hex 0000000164bf; asc     d ;;
 2: len 7; hex 81000000b20110; asc        ;;
 3: len 4; hex 80000005; asc     ;;
 4: len 4; hex 80000005; asc     ;;

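(1) As the annotations show, when transaction 91322 attempts the delete it locks every record in the table.
This is because the delete condition is v1 = 4 and v2 = 4, and no index exists on v1 or v2.
Since no index can be used to find the primary keys, the delete scans the whole clustered index and locks every record in dead_lock_test as it goes. (This can be tuned so that locks on records that don't match the where condition are released as the scan proceeds; InnoDB does this under the READ COMMITTED isolation level, for example.)

(2) While locking all the records in dead_lock_test, transaction 91322 finds that the record id=5 is already locked by transaction 91327, so transaction 91322 can only wait for transaction 91327 to release that record lock.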
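The hex fields in the record dumps can be decoded by hand to confirm which row and which transaction a lock refers to. The sketch below assumes the layout seen in this output: a 4-byte signed INT is printed with its sign bit flipped, and the 6-byte transaction ID is a plain big-endian number. The function names are illustrative.

```python
def decode_int(hex_str):
    """Decode a 4-byte signed INT column as printed in the lock dump.

    InnoDB stores signed integers with the sign bit flipped so that they
    sort correctly as unsigned bytes; subtracting 0x80000000 undoes that.
    """
    return int(hex_str, 16) - 0x80000000

def decode_trx_id(hex_str):
    """Decode the 6-byte transaction ID field of a clustered index record."""
    return int(hex_str, 16)

print(decode_int("80000005"))         # primary key of the contested record
print(decode_trx_id("0000000164bf"))  # transaction that last modified it
print(decode_trx_id("0000000164ba"))  # transaction that inserted id=4
```

Decoded this way, the record with heap no 6 is id=5 and was written by transaction 91327, and the record with heap no 5 is id=4 and was written by transaction 91322, which matches the session mapping from stage1.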

3.3 stage3

After running delete from dead_lock_test where v1 = 5 and v2 = 5; in session2, the client prints:

[2021-05-13 15:33:29] [40001][1213] Deadlock found when trying to get lock; try restarting transaction

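Run show engine innodb status; an excerpt of the deadlock section follows.

Because the output is long, it is not explained separately; see the comment lines (marked with *) inside the output.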

------------------------
LATEST DETECTED DEADLOCK
------------------------
2021-05-13 17:27:09 0xca4
*** (1) TRANSACTION:
* The locks held by transaction 91322 were explained in detail in stage2 and are not repeated here
TRANSACTION 91322, ACTIVE 78 sec fetching rows
mysql tables in use 1, locked 1
LOCK WAIT 5 lock struct(s), heap size 1136, 6 row lock(s), undo log entries 2
MySQL thread id 23, OS thread handle 22788, query id 3199 localhost 127.0.0.1 root updating
/* ApplicationName=DataGrip 2021.1.1 */ delete from dead_lock_test where v1 = 4 and v2 = 4

*** (1) HOLDS THE LOCK(S):
RECORD LOCKS space id 92 page no 4 n bits 72 index PRIMARY of table `igw_proxy_rule_management`.`dead_lock_test` trx id 91322 lock_mode X
Record lock, heap no 2 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
 0: len 4; hex 80000001; asc     ;;
 1: len 6; hex 0000000164b3; asc     d ;;
 2: len 7; hex 81000000ad0110; asc        ;;
 3: len 4; hex 80000001; asc     ;;
 4: len 4; hex 80000001; asc     ;;

Record lock, heap no 3 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
 0: len 4; hex 80000002; asc     ;;
 1: len 6; hex 0000000164b4; asc     d ;;
 2: len 7; hex 82000000ad0110; asc        ;;
 3: len 4; hex 80000002; asc     ;;
 4: len 4; hex 80000002; asc     ;;

Record lock, heap no 4 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
 0: len 4; hex 80000003; asc     ;;
 1: len 6; hex 0000000164b9; asc     d ;;
 2: len 7; hex 81000000b00110; asc        ;;
 3: len 4; hex 80000003; asc     ;;
 4: len 4; hex 80000003; asc     ;;


*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 92 page no 4 n bits 72 index PRIMARY of table `igw_proxy_rule_management`.`dead_lock_test` trx id 91322 lock_mode X waiting
Record lock, heap no 6 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
 0: len 4; hex 80000005; asc     ;;
 1: len 6; hex 0000000164bf; asc     d ;;
 2: len 7; hex 81000000b20110; asc        ;;
 3: len 4; hex 80000005; asc     ;;
 4: len 4; hex 80000005; asc     ;;


*** (2) TRANSACTION:
TRANSACTION 91327, ACTIVE 71 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 1136, 2 row lock(s), undo log entries 1
MySQL thread id 24, OS thread handle 15668, query id 3237 localhost 127.0.0.1 root updating
/* ApplicationName=DataGrip 2021.1.1 */ delete from dead_lock_test where v1 = 5 and v2 = 5

*** (2) HOLDS THE LOCK(S):
* Transaction 91327 holds the record lock on the record id=5, which transaction 91322 is waiting for
RECORD LOCKS space id 92 page no 4 n bits 72 index PRIMARY of table `igw_proxy_rule_management`.`dead_lock_test` trx id 91327 lock_mode X locks rec but not gap
Record lock, heap no 6 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
 0: len 4; hex 80000005; asc     ;;
 1: len 6; hex 0000000164bf; asc     d ;;
 2: len 7; hex 81000000b20110; asc        ;;
 3: len 4; hex 80000005; asc     ;;
 4: len 4; hex 80000005; asc     ;;


*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
* Transaction 91327 is waiting for a record lock; the details are explained below
RECORD LOCKS space id 92 page no 4 n bits 72 index PRIMARY of table `igw_proxy_rule_management`.`dead_lock_test` trx id 91327 lock_mode X waiting
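* Transaction 91327 is waiting for the lock on the record id=1 (the delete cannot use an index, so it tries to lock every record in the table, but transaction 91322 holds the locks on records id=1/2/3/4, which completes the deadlock)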
Record lock, heap no 2 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
 0: len 4; hex 80000001; asc     ;;
 1: len 6; hex 0000000164b3; asc     d ;;
 2: len 7; hex 81000000ad0110; asc        ;;
 3: len 4; hex 80000001; asc     ;;
 4: len 4; hex 80000001; asc     ;;

*** WE ROLL BACK TRANSACTION (2)

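From the above:
(1) When transaction 91322 runs its delete, it tries to lock every record in the table, and the lock on the record id=5 is held by transaction 91327;
(2) When transaction 91327 runs its delete, it also tries to lock every record in the table and finds that the locks on records id=1/2/3/4 are held by transaction 91322;
(3) At this point transactions 91322 and 91327 are waiting for each other, and the deadlock is formed.

4. Solutions

4.1 Add an index

As the analysis shows, the where condition of the delete cannot use an index, so MySQL tries to lock every record in the table, which produces the deadlock.
We only need to create a composite index on v1 and v2 to narrow the range of records that can conflict.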
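The three steps can be replayed with a toy lock manager. This is only a sketch under strong simplifications: one exclusive lock per record, no gap locks, and each blocked transaction waits for exactly one other. `LockTable` and `find_cycle` are made-up names; InnoDB's own deadlock detector is not modelled.

```python
class LockTable:
    """Toy lock manager: one exclusive lock per record, no gap locks."""

    def __init__(self):
        self.owner = {}      # record id -> transaction holding its X lock
        self.waits_for = {}  # blocked transaction -> transaction it waits for

    def lock(self, trx, record_id):
        """Try to lock one record; return True on success, False if blocked."""
        holder = self.owner.get(record_id)
        if holder is not None and holder != trx:
            self.waits_for[trx] = holder
            return False
        self.owner[record_id] = trx
        return True

    def scan_and_lock(self, trx, record_ids):
        """Lock records in primary key order; return the id that blocks, or None."""
        for record_id in sorted(record_ids):
            if not self.lock(trx, record_id):
                return record_id
        return None


def find_cycle(waits_for):
    """Follow wait-for edges from each transaction; return the first cycle found."""
    for start in waits_for:
        path, node = [start], waits_for.get(start)
        while node is not None and node not in path:
            path.append(node)
            node = waits_for.get(node)
        if node == start:
            return path
    return None


locks = LockTable()
rows = [1, 2, 3]
locks.lock(91322, 4); rows.append(4)          # session1: insert (4,4)
locks.lock(91327, 5); rows.append(5)          # session2: insert (5,5)
blocked_1 = locks.scan_and_lock(91322, rows)  # session1: delete scans all rows
blocked_2 = locks.scan_and_lock(91327, rows)  # session2: delete scans all rows
cycle = find_cycle(locks.waits_for)
print(blocked_1, blocked_2, cycle)
```

In this model session1 stops at record 5 and session2 stops at record 1, and the wait-for graph contains the cycle between the two transactions, the same shape the deadlock report describes.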

create index dead_lock_test_v2_v1_index on dead_lock_test (v1, v2);

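The index here is not unique. If several transactions look up rows through the index and the records they lock overlap, the deadlock can still be reproduced easily.
However, the way the business side inserts data guarantees that no overlapping records exist within a short time window, and the table already contains some duplicate data, so a unique index is not used.

4.2 Final solution

Add the index to the table. The insert followed by a delete inside the transaction is replaced with a rollback.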
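A rough way to see why the index helps is to compare the set of records each delete would try to lock with and without it. The sketch below is a simplification: `records_to_lock` is a hypothetical helper, only record locks on the clustered index are counted, and the gap locks taken on a non-unique secondary index under RR are ignored.

```python
ROWS = {1: (1, 1), 2: (2, 2), 3: (3, 3), 4: (4, 4), 5: (5, 5)}  # id -> (v1, v2)

def records_to_lock(rows, v1, v2, has_index):
    """Return the primary keys a `delete ... where v1 = ? and v2 = ?` would try to lock."""
    if has_index:
        # With an index on (v1, v2) only the matching entries are visited.
        return {pk for pk, value in rows.items() if value == (v1, v2)}
    # Without an index every clustered index record is scanned and locked.
    return set(rows)

for has_index in (False, True):
    s1 = records_to_lock(ROWS, 4, 4, has_index)  # session1's delete
    s2 = records_to_lock(ROWS, 5, 5, has_index)  # session2's delete
    print(has_index, sorted(s1 & s2))            # records both sessions want
```

Without the index both sessions want all five records; with it the two sets are disjoint, so neither delete has to wait for the other.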
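The rollback variant might look like the sketch below. It uses Python's built-in sqlite3 module purely as a stand-in so that the example runs without a server; the real code would go through whatever MySQL driver the project uses, and `insert_then_undo` is a made-up name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table dead_lock_test ("
    "id integer primary key autoincrement, v1 int not null, v2 int not null)"
)
conn.executemany(
    "insert into dead_lock_test (v1, v2) values (?, ?)", [(1, 1), (2, 2), (3, 3)]
)
conn.commit()

def insert_then_undo(conn, v1, v2):
    """Insert a row and undo it with a rollback instead of a delete statement."""
    try:
        conn.execute("insert into dead_lock_test (v1, v2) values (?, ?)", (v1, v2))
    finally:
        conn.rollback()  # discards the insert; no delete runs, so no scan is needed

insert_then_undo(conn, 4, 4)
remaining = conn.execute("select count(*) from dead_lock_test").fetchone()[0]
print(remaining)
```

Because the row is removed by the rollback, the transaction never issues the delete that had to lock other rows.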


