Troubleshooting a MySQL Deadlock


1. Monitoring logs

Monitoring surfaced the exception below. The stack trace that follows it points to the specific SQL statement that hit the deadlock:

com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
at com.***.***.im.service.platform.dao.impl.ImMessageDaoImpl.insert(ImMessageDaoImpl.java:50)
at com.***.***.im.service.platform.service.impl.ImMessageServiceImpl.saveNewSessionMessage(ImMessageServiceImpl.java:543)

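Reading the code referenced in the log, it seemed unlikely that the deadlock was caused by the same transaction running concurrently with itself.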
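The exception message itself suggests restarting the transaction. As a stopgap while the root cause is investigated, a retry wrapper along the following lines may help. This is only a minimal sketch: `DeadlockRetry` and `TxWork` are hypothetical names, and it assumes the unit of work surfaces a `java.sql.SQLException` (a framework such as Spring or MyBatis may wrap it in its own exception type). MySQL reports a deadlock victim with error code 1213 and SQLSTATE 40001.

```java
import java.sql.SQLException;

/** Minimal retry helper for transactions rolled back by a deadlock (hypothetical name). */
final class DeadlockRetry {

    /** One complete transaction, from begin to commit (hypothetical interface). */
    interface TxWork<T> {
        T run() throws SQLException;
    }

    /** MySQL reports a deadlock victim as error 1213 with SQLSTATE 40001. */
    static boolean isDeadlock(SQLException e) {
        return e.getErrorCode() == 1213 || "40001".equals(e.getSQLState());
    }

    /** Runs the work, retrying up to maxAttempts times when a deadlock is reported. */
    static <T> T execute(TxWork<T> work, int maxAttempts) throws SQLException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        SQLException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return work.run();
            } catch (SQLException e) {
                if (!isDeadlock(e)) {
                    throw e; // other errors are not retried
                }
                last = e;
                try {
                    Thread.sleep(20L * attempt); // short linear back-off before retrying
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw e;
                }
            }
        }
        throw last; // every attempt ended in a deadlock
    }

    public static void main(String[] args) throws SQLException {
        int[] calls = {0};
        // Simulated work: the first two attempts are chosen as deadlock victims.
        String result = execute(() -> {
            if (++calls[0] < 3) {
                throw new SQLException("Deadlock found when trying to get lock", "40001", 1213);
            }
            return "saved";
        }, 5);
        System.out.println(result + " after " + calls[0] + " attempts"); // saved after 3 attempts
    }
}
```

The work passed to `execute` has to cover the whole transaction, because InnoDB rolls back every statement of the victim, not only the one that failed.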

2. Check the isolation level

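select @@tx_isolation;  -- isolation level of the current session
select @@global.tx_isolation;  -- global isolation level
-- On MySQL 8.0 and later the variable is named transaction_isolation; tx_isolation was removed.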

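Business code may rely on the default isolation level, which is the global one, or it may set the isolation level of the current transaction explicitly. We use the default, which is RR (REPEATABLE READ).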

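3. Check the latest deadlock detected by InnoDB

Ask the DBA to look at the database used by the affected business code and at the deadlock log recorded by InnoDB: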

show engine innodb status;

The most recent deadlock recorded was:

------------------------
LATEST DETECTED DEADLOCK
------------------------
2019-04-01 23:32:49 0x7f6306adb700
*** (1) TRANSACTION:
TRANSACTION 23734694036, ACTIVE 1 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 7 lock struct(s), heap size 1136, 25 row lock(s)
MySQL thread id 7109502, OS thread handle 140046693021440, query id 5270358204 172.31.21.66 im_w1 updating

update im_servicer_session
		set unread_count=0
		where session_id=142298 and servicer_id=8708

*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 5351 page no 18 n bits 224 index PRIMARY of table `im`.`im_servicer_session` trx id 23734694036 
lock_mode X locks rec but not gap waiting
Record lock, heap no 148 PHYSICAL RECORD: n_fields 11; compact format; info bits 0
 0: len 8; hex 00000000000006a4; asc         ;;
 1: len 6; hex 000586b2b07f; asc       ;;
 2: len 7; hex 27000002141d37; asc '     7;;
 3: len 8; hex 0000000000022bda; asc       + ;;
 4: len 8; hex 0000000000002204; asc       " ;;
 5: len 1; hex 00; asc  ;;
 6: len 5; hex 9943c20000; asc  C   ;;
 7: len 1; hex 00; asc  ;;
 8: len 4; hex 00000003; asc     ;;
 9: len 5; hex 99a2c37642; asc    vB;;
 10: len 5; hex 99a2c37830; asc    x0;;

*** (2) TRANSACTION:
TRANSACTION 23734694015, ACTIVE 1 sec inserting
mysql tables in use 1, locked 1
4 lock struct(s), heap size 1136, 2 row lock(s), undo log entries 2
MySQL thread id 7108183, OS thread handle 140063290537728, query id 5270358482 172.31.35.143 im_w1 update

insert into im_message_0_34
         ( chat_id,
            message_type,
            message,
            house_id,
            send_time,
            send_status,
            receive_status,
            show_type ) 
         values ( '4NzP0DZO7wngS5YiGFcJTKu0L2Xrhan7zpbBBO/1KdQ=',
            0,
            '嗯嗯',
            106874,
            '2019-04-01 23:32:48.113',
            0,
            1,
            0 )

*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 5351 page no 18 n bits 224 index PRIMARY of table `im`.`im_servicer_session` trx id 23734694015 
lock_mode X locks rec but not gap
Record lock, heap no 148 PHYSICAL RECORD: n_fields 11; compact format; info bits 0
 0: len 8; hex 00000000000006a4; asc         ;;
 1: len 6; hex 000586b2b07f; asc       ;;
 2: len 7; hex 27000002141d37; asc '     7;;
 3: len 8; hex 0000000000022bda; asc       + ;;
 4: len 8; hex 0000000000002204; asc       " ;;
 5: len 1; hex 00; asc  ;;
 6: len 5; hex 9943c20000; asc  C   ;;
 7: len 1; hex 00; asc  ;;
 8: len 4; hex 00000003; asc     ;;
 9: len 5; hex 99a2c37642; asc    vB;;
 10: len 5; hex 99a2c37830; asc    x0;;

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 5388 page no 1531 n bits 264 index idx_chat_id of table `im`.`im_message_0_34` trx id 23734694015 
lock_mode X locks gap before rec insert intention waiting
Record lock, heap no 110 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
 0: len 30; hex 344f69384254415559786c496a483947657577705071365a3764794f546e; asc 4Oi8BTAUYxlIjH9GeuwpPq6Z7dyOTn; (total 44 bytes);
 1: len 8; hex 00000000000069a0; asc       i ;;

*** WE ROLL BACK TRANSACTION (2)
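The hex fields in the record dump can be decoded to confirm which row is being fought over. The sketch below assumes the integer columns are unsigned, which InnoDB stores as plain big-endian bytes (signed columns are stored with the sign bit flipped and would need extra handling). `InnodbHexField` is a hypothetical helper name, and the mapping of fields 3 and 4 to `session_id` and `servicer_id` is an inference from the values matching the WHERE clause, not something the log states.

```java
/** Decodes integer fields from an InnoDB record dump (hypothetical helper). */
final class InnodbHexField {

    /**
     * Assumes an unsigned integer column, stored big-endian as-is.
     * Signed columns have their sign bit flipped and are not handled here.
     */
    static long decodeUnsigned(String hex) {
        return Long.parseUnsignedLong(hex, 16);
    }

    public static void main(String[] args) {
        // Field 0 of the clustered index record: the primary key.
        System.out.println(decodeUnsigned("00000000000006a4")); // 1700
        // Field 1: DB_TRX_ID, the transaction that last modified the row.
        System.out.println(decodeUnsigned("000586b2b07f"));     // 23734694015
        // Fields 3 and 4: presumably session_id and servicer_id.
        System.out.println(decodeUnsigned("0000000000022bda")); // 142298
        System.out.println(decodeUnsigned("0000000000002204")); // 8708
    }
}
```

Fields 3 and 4 decode to 142298 and 8708, the values in the WHERE clause of the blocked UPDATE, and field 1 decodes to 23734694015. In other words, the row was last modified by transaction (2), the one that holds the lock.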

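The log shows that the contended lock on im_servicer_session is a plain record-level exclusive lock (X lock, "locks rec but not gap"), not a gap lock. Transaction (2), however, is additionally waiting for an insert intention lock on a gap in index idx_chat_id of im_message_0_34. The first transaction is blocked on the statement that updates the session, which on inspection belongs to the flow that marks messages as read. The second transaction is blocked on the statement that saves a message. The code of the two deadlocked transactions is as follows: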
TRANSACTION 23734694015 (transaction (2) in the log, saving a message)

// Update the session time
imServicerSessionService.updateSessionTime(sessionVo.getSessionId(), EnumServicerSessionState.IN_SESSION);
// ... a long-running request
if (md.getMessageId() != null && md.getMessageId() > 0) {
    logger.info("修改消息"); // log text: "updating message"
    imMessageDao.update(md);
} else {
    imMessageDao.insert(md);
}

TRANSACTION 23734694036 (transaction (1) in the log, marking messages as read)

if (LoginUserUtil.isServicer()) {
    imMessageDao.markServicerMessageRead(chatId, baseSubTable.getTableName(), houseId, loginInfo.getAccountId());
    imServicerSessionService.resetUnreadCount(imSessionVoList.get(0).getSessionId(), loginInfo.getAccountId());
}

4. Session timeline
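Based on the log and the two code paths, the interleaving was presumably as follows. Transaction 23734694015 runs updateSessionTime and takes the X record lock on the im_servicer_session row (the log confirms it holds this lock). Transaction 23734694036 runs markServicerMessageRead and, under REPEATABLE READ, locks a range of idx_chat_id in im_message_0_34 (an inference: the log only reports 25 row locks for it). It then runs resetUnreadCount and waits for the session row. Finally 23734694015 tries to insert into im_message_0_34 and waits for an insert intention lock on the gap, which closes the cycle.

The cycle can be modelled as a small wait-for graph. The class below is an illustrative sketch, not InnoDB's actual detector, and `WaitForGraph` is a hypothetical name.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy wait-for graph: an edge A -> B means transaction A waits for a lock held by B. */
final class WaitForGraph {

    // A blocked transaction waits for one lock at a time, so one edge per waiter is enough here.
    private final Map<String, String> waitsFor = new HashMap<>();

    void addWait(String waiter, String holder) {
        waitsFor.put(waiter, holder);
    }

    /** Returns true if following the wait edges from start leads back to start. */
    boolean hasCycleFrom(String start) {
        Set<String> seen = new HashSet<>();
        String current = waitsFor.get(start);
        while (current != null && seen.add(current)) {
            if (current.equals(start)) {
                return true;
            }
            current = waitsFor.get(current);
        }
        return false;
    }

    public static void main(String[] args) {
        WaitForGraph graph = new WaitForGraph();
        // Stated in the log: 23734694036 waits for the session row held by 23734694015.
        graph.addWait("23734694036", "23734694015");
        System.out.println(graph.hasCycleFrom("23734694036")); // false
        // Inferred: 23734694015 waits for the idx_chat_id gap locked by 23734694036.
        graph.addWait("23734694015", "23734694036");
        System.out.println(graph.hasCycleFrom("23734694036")); // true
    }
}
```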

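5. Solutions

  1. A deadlock can be resolved by removing one of the conditions it needs in order to occur. The easiest one to remove is the inconsistent order in which resources are acquired. In this case, the two SQL statements in the mark-as-read transaction (markServicerMessageRead and resetUnreadCount) can be swapped, because neither depends on the other.
  2. Next, avoid long transactions: keep each transaction as short as possible and its scope as small as possible. Long transactions reduce concurrency and delay more SQL queries.
  3. Does the whole method really need to run in a transaction? Where a transaction is not required, leave it out.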
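The first point, acquiring resources in a consistent order, can be illustrated with an in-memory analogy. In the sketch below, two `ReentrantLock` objects stand in for the im_servicer_session row and the idx_chat_id range. Both the save path and the mark-as-read path take them in the same order, so neither thread can end up holding one lock while waiting for the other in reverse. All names are hypothetical, and the example says nothing about how InnoDB schedules locks internally.

```java
import java.util.concurrent.locks.ReentrantLock;

/** In-memory analogy for consistent lock ordering (hypothetical names throughout). */
final class LockOrderingDemo {

    static final ReentrantLock sessionRow = new ReentrantLock();   // stands in for the im_servicer_session row
    static final ReentrantLock messageIndex = new ReentrantLock(); // stands in for the idx_chat_id range
    static int completed = 0; // only modified while both locks are held

    /** Every code path takes the session lock first, then the message lock. */
    static void inOrder(Runnable work) {
        sessionRow.lock();
        try {
            messageIndex.lock();
            try {
                work.run();
            } finally {
                messageIndex.unlock();
            }
        } finally {
            sessionRow.unlock();
        }
    }

    /** Runs both paths concurrently and returns how many units of work completed. */
    static int runBoth(int iterations) throws InterruptedException {
        completed = 0;
        Runnable loop = () -> {
            for (int i = 0; i < iterations; i++) {
                inOrder(() -> completed++);
            }
        };
        Thread save = new Thread(loop, "saveMessage");
        Thread read = new Thread(loop, "markRead");
        save.start();
        read.start();
        save.join();
        read.join();
        return completed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runBoth(1000)); // 2000
    }
}
```

Applied to the code above, this corresponds to calling resetUnreadCount before markServicerMessageRead, so that both transactions touch the session row before the message table.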

