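During day-to-day development we ran into a gateway service whose requests hung. This article uses that incident as a starting point to briefly introduce the WinDbg debugger and to show how it can be used to analyze this kind of problem.
I. Background
1. Business background
We have recently been building a new business system on a microservice framework with a separated front end and back end: the back end exposes SG services and the front end is built with Vue. The SG services are written in C# and use Redis and SQL Server for data storage. During development, debugging and unit testing the back-end services in Visual Studio went smoothly, and after deployment to the development integration environment, front-end and back-end integration also proceeded in an orderly way.
2. Technical architecture
A simple architecture diagram of the existing microservices was drawn for reference.
3. The problem
Then one day, during integration testing in the development environment, every call to the back-end SG service timed out. Response times exceeded 10 s and the service threw no exception, which badly hurt development efficiency. Restarting the SG server did not fix the problem.
We therefore decided to capture a dump and analyze the cause of the timeouts. The rest of this article first introduces the dump analysis tool, then outlines how the problem was approached, and finally walks through how the exception was located in the dump file.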
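II. Introduction to WinDbg
1. WinDbg is a very powerful debugger with a rich feature set that supports many kinds of debugging. The table below compares several commonly used scenarios.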
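Debugging type | Description | Typical scenarios
User-mode debugging | The debugger attaches to a running process. Once attached, the debugger is connected to the target program and controls when it is suspended and resumed. | Similar to single-stepping in Visual Studio: set breakpoints and step through the code.
Kernel-mode debugging | Debugs the kernel on a local or remote computer. | 1. Debugging a process early in system startup or late in system shutdown, when no interactive console exists; 2. Analyzing inter-process communication problems.
Dump file debugging | A dump file is a snapshot that shows the process that was executing and the modules loaded for the application at a point in time. A dump that includes heap information also contains a snapshot of the application's memory at that point. | 1. Performance analysis, memory leaks, blocked threads; 2. Troubleshooting faults and exceptions; 3. Process crash analysis.
Remote debugging | Debugs remotely through the DbgSrv debugging server. | 1. The program has to run full-screen; 2. The program crashes on a customer's machine.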
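2. WinDbg is a typical windowed (GUI) program, but most of its debugging functionality is still driven by typed commands. Command names are generally not case-sensitive.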
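Command category | Purpose | Count | Examples | Notes
Standard commands | The most basic debugging functions, applicable to every kind of debug target: inspecting state, quitting, help, and so on. | 20+ | k (display the stack), ~ (display threads), the vertical-bar command (display processes), q (end the debugging session), ? (list the standard commands) | 1. Usually one or two characters or symbols; the exception is version. 2. Some commands stand for a family of two-character commands that begin with the same character.
Meta-commands | Debugging functions that the standard commands do not provide. | 140+ | .load and .loadby (load an extension module), .chain (list the loaded extension modules) | Meta-commands are built into the debugger engine or the debugger program itself and can be used directly. They all begin with a dot (.), so they are also called dot commands.
Extension commands | 1. Extend debugging in one particular area; 2. Users can also write their own extension modules and commands. | Too many to count | !threads (list threads), !do (inspect an object) | Extension commands begin with !. Full form: ![module name].<command name> [arguments]. The module name may be omitted.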
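3. Ways to load an extension:
1) Use .load followed by the full path of the extension module.
2) Use .loadby followed by the extension name and the name of a module already loaded in the target. The debugger loads the extension from that module's directory; for example, .loadby sos clr loads sos.dll from the directory that contains clr.dll.
3) Use the !<module name>.<command name> form, which searches for and loads the named module automatically.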
4. Default symbol path setting (local cache c:\symcache, Microsoft public symbol server):
srv*c:\symcache*http://msdl.microsoft.com/download/symbols
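The symbol path above has three fields separated by asterisks: the keyword srv, a local cache directory, and the symbol server URL. The following is a minimal Python sketch of how such a string could be assembled; the helper name build_symbol_path is hypothetical and not part of WinDbg. The resulting string is what would be passed to the .sympath command.

```python
def build_symbol_path(cache_dir: str, server_url: str) -> str:
    """Return a symbol-server path of the form srv*<cache>*<server>.

    Hypothetical helper for illustration only.
    """
    # '*' separates the fields, so it must not appear inside a field.
    if "*" in cache_dir or "*" in server_url:
        raise ValueError("'*' is the field separator and cannot appear in a field")
    return f"srv*{cache_dir}*{server_url}"


symbol_path = build_symbol_path(
    r"c:\symcache", "http://msdl.microsoft.com/download/symbols"
)
print(symbol_path)
```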
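III. Analyzing the gateway request hang
First, the root cause: Redis was hung, so the service's connections to it timed out. Restarting Redis normally resolves the problem. To keep it from happening again, we also changed the default setting that triggered the hang.
• Detailed analysis steps: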
1. Open the dump file and load the debugger extensions: .loadby sos clr, then .load c:\symcache\mex.dll
0:000> .loadby sos clr
0:000> .load c:\symcache\mex.dll
Mex External 3.0.0.7172 Loaded!
2. List all managed threads with !threads. Thread 46 stands out: it holds locks, with a Lock Count of 3, the highest in the list.
0:000> !threads
PDB symbol for clr.dll not loaded
ThreadCount: 40
UnstartedThread: 0
BackgroundThread: 27
PendingThread: 0
DeadThread: 11
Hosted Runtime: no
                                                                                                  Lock
       ID  OSID ThreadOBJ          State GC Mode     GC Alloc Context                  Domain           Count Apt Exception
  14    1 121e4 000001d447d948e0   28220 Preemptive  000001D5C85CC5A8:000001D5C85CE008 000001d447d8a0f0 0     Ukn
  32    2 107d4 000001d447efa1e0   2b220 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     MTA (Finalizer)
  34    3 11318 000001d8d3ace3e0 102a220 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     MTA (Threadpool Worker)
  35    4 134dc 000001d8d3adf920   21220 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     Ukn
  37    6 101d8 000001d8d3bc89b0 1020220 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     Ukn (Threadpool Worker)
  39    7  b808 000001d8d3c68870 202b220 Preemptive  000001D6C81F6560:000001D6C81F7FD0 000001d8d3ade8e0 1     MTA
  40    8 102c0 000001d8d3c67840 8029220 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     MTA (Threadpool Completion Port)
  41   10 10994 000001d8d3c70ae0 8029220 Preemptive  000001D7481920A0:000001D748193FD0 000001d447d8a0f0 0     MTA (Threadpool Completion Port)
  22    9 145a8 000001d8d3c70310 8028220 Preemptive  000001D4486DA060:000001D4486DBFD0 000001d447d8a0f0 0     MTA (Threadpool Completion Port)
  42   11 1425c 000001d8d3c712b0 202b220 Preemptive  000001D5C86C91B0:000001D5C86C9FD0 000001d8d3ade8e0 0     MTA
  44   12 11a5c 000001d8d4ec22c0 3029220 Preemptive  000001D548294B00:000001D548295FD0 000001d8d3ade8e0 0     MTA (Threadpool Worker)
  45   13  7ca0 000001d8d4ec2a90 3029220 Preemptive  000001D6484E75D0:000001D6484E8FD0 000001d8d3ade8e0 0     MTA (Threadpool Worker)
  46   14 13320 000001d8d4ec6ca0 3029220 Preemptive  000001D7C8454AE8:000001D7C8455FD0 000001d8d3ade8e0 3     MTA (Threadpool Worker)
  47   15 11098 000001d8d4ec7470 1029220 Preemptive  000001D6C8201300:000001D6C8201FD0 000001d447d8a0f0 0     MTA (Threadpool Worker)
  48   16  a670 000001d8d4eca730 202b220 Preemptive  000001D4C8365CF8:000001D4C8365FD0 000001d8d3ade8e0 0     MTA
  49   17 12b2c 000001d8d4ed4ca0 202b220 Preemptive  000001D6484E9330:000001D6484EAFD0 000001d8d3ade8e0 0     MTA
  50   18 11dc8 000001d8d4eec8e0 202b220 Preemptive  000001D4486D01C8:000001D4486D1FD0 000001d8d3ade8e0 0     MTA
  51   19 13da0 000001d8d3e58f60   2b020 Preemptive  000001D6C81FC508:000001D6C81FDFD0 000001d8d3ade8e0 1     MTA
  52   20  cd10 000001d8d3e5ed20 202b220 Preemptive  0000000000000000:0000000000000000 000001d8d3ade8e0 0     MTA
  53   21 101e8 000001d8d3e59730 202b220 Preemptive  000001D4C85C4CB8:000001D4C85C5FD0 000001d8d3ade8e0 0     MTA
  54   22  2d90 000001d8d3e5fcc0 202b220 Preemptive  000001D4486E6178:000001D4486E7FD0 000001d8d3ade8e0 0     MTA
  55   23 13a74 000001d8d4f16d90 202b220 Preemptive  000001D7C8413638:000001D7C8413FD0 000001d8d3ade8e0 0     MTA
  56   24 1364c 000001d8d4f15df0   2b020 Preemptive  000001D6484EF488:000001D6484F0FD0 000001d8d3ade8e0 1     MTA
  57   25 13890 000001d8d4f165c0 202b220 Preemptive  000001D6C81FE0A0:000001D6C81FFFD0 000001d8d3ade8e0 0     MTA
  58   26 119d8 000001d8d4f18500 1029220 Preemptive  000001D4486E9C58:000001D4486E9FD0 000001d447d8a0f0 0     MTA (Threadpool Worker)
  59   27 14614 000001d4475e59f0 1029220 Preemptive  000001D5485ACAD8:000001D5485ADFD0 000001d447d8a0f0 0     MTA (Threadpool Worker)
XXXX   28     0 000001d4475e61c0  839820 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     Ukn
XXXX   29     0 000001d4475e2b10  839820 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     Ukn
XXXX   30     0 000001d4475e1b70  839820 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     Ukn
XXXX   31     0 000001d4475e6990  839820 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     Ukn
XXXX   32     0 000001d4475e32e0  839820 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     Ukn
XXXX   33     0 000001d4475e13a0  839820 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     Ukn
XXXX   34     0 000001d4475e3ab0  839820 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     Ukn
XXXX   35     0 000001d4475e2340  839820 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     Ukn
XXXX   36     0 000001d4475e5220  839820 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     Ukn
XXXX   37     0 000001d4475e4280  839820 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     Ukn
  60   38  46f4 000001d4475e4a50 1029220 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     MTA (Threadpool Worker)
  61   39 13d24 000001d4475e8100 1029220 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     MTA (Threadpool Worker)
XXXX    5     0 000001d8d4f93b70  839820 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     Ukn
  62   40 10980 000001d8d4f9a8d0   2b220 Preemptive  000001D5485D63B8:000001D5485D7FD0 000001d8d3ade8e0 0     MTA
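Scanning a long !threads listing by eye is error-prone. As a rough sketch, the Python below picks out the threads with a non-zero Lock Count from text saved from the debugger window. It assumes the column order seen in this dump (debugger thread number, managed ID, OSID, ThreadOBJ, State, GC mode, allocation context, domain, lock count, apartment); the function name and the sample constant are hypothetical, and the sample rows are copied from the output above.

```python
# A few rows copied from the !threads output in this article.
SAMPLE_THREADS = """\
  39    7  b808 000001d8d3c68870 202b220 Preemptive 000001D6C81F6560:000001D6C81F7FD0 000001d8d3ade8e0 1 MTA
  45   13  7ca0 000001d8d4ec2a90 3029220 Preemptive 000001D6484E75D0:000001D6484E8FD0 000001d8d3ade8e0 0 MTA (Threadpool Worker)
  46   14 13320 000001d8d4ec6ca0 3029220 Preemptive 000001D7C8454AE8:000001D7C8455FD0 000001d8d3ade8e0 3 MTA (Threadpool Worker)
XXXX   28     0 000001d4475e61c0  839820 Preemptive 0000000000000000:0000000000000000 000001d447d8a0f0 0 Ukn
"""

LOCK_COUNT_COLUMN = 8  # assumed position of "Lock Count" after whitespace splitting


def threads_holding_locks(text):
    """Return (debugger thread number, lock count) pairs, highest count first."""
    found = []
    for line in text.splitlines():
        parts = line.split()
        # Skip summary lines, headers, and anything too short to be a thread row.
        if len(parts) <= LOCK_COUNT_COLUMN + 1 or not parts[LOCK_COUNT_COLUMN].isdigit():
            continue
        lock_count = int(parts[LOCK_COUNT_COLUMN])
        if lock_count > 0:
            found.append((parts[0], lock_count))
    return sorted(found, key=lambda item: item[1], reverse=True)


for thread, count in threads_holding_locks(SAMPLE_THREADS):
    print(f"thread {thread}: lock count {count}")
```

On the full listing this also reports threads 39, 51 and 56 (lock count 1 each), which is why thread 46 with a count of 3 is the first one worth inspecting.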
3. Check each thread's user-mode execution time with !runaway. Thread 46 again appears near the top, second only to thread 14.
0:000> !runaway
 User Mode Time
  Thread       Time
  14:121e4     0 days 0:00:00.625
  46:13320     0 days 0:00:00.156
  42:1425c     0 days 0:00:00.062
  53:101e8     0 days 0:00:00.046
  48:a670      0 days 0:00:00.031
  55:13a74     0 days 0:00:00.015
  32:107d4     0 days 0:00:00.015
   0:116dc     0 days 0:00:00.015
  62:10980     0 days 0:00:00.000
  61:13d24     0 days 0:00:00.000
  60:46f4      0 days 0:00:00.000
  59:14614     0 days 0:00:00.000
  58:119d8     0 days 0:00:00.000
  57:13890     0 days 0:00:00.000
  56:1364c     0 days 0:00:00.000
  54:2d90      0 days 0:00:00.000
  52:cd10      0 days 0:00:00.000
  51:13da0     0 days 0:00:00.000
  50:11dc8     0 days 0:00:00.000
  49:12b2c     0 days 0:00:00.000
  47:11098     0 days 0:00:00.000
  45:7ca0      0 days 0:00:00.000
  44:11a5c     0 days 0:00:00.000
  43:14518     0 days 0:00:00.000
  41:10994     0 days 0:00:00.000
  40:102c0     0 days 0:00:00.000
  39:b808      0 days 0:00:00.000
  38:fee4      0 days 0:00:00.000
  37:101d8     0 days 0:00:00.000
  36:334       0 days 0:00:00.000
  35:134dc     0 days 0:00:00.000
  34:11318     0 days 0:00:00.000
  33:eaf0      0 days 0:00:00.000
  31:1858      0 days 0:00:00.000
  30:11a98     0 days 0:00:00.000
  29:e30       0 days 0:00:00.000
  28:10840     0 days 0:00:00.000
  27:117c0     0 days 0:00:00.000
  26:b0f8      0 days 0:00:00.000
  25:baf0      0 days 0:00:00.000
  24:ffb8      0 days 0:00:00.000
  23:11c14     0 days 0:00:00.000
  22:145a8     0 days 0:00:00.000
  21:145d4     0 days 0:00:00.000
  20:12660     0 days 0:00:00.000
  19:d7f0      0 days 0:00:00.000
  18:230c      0 days 0:00:00.000
  17:12378     0 days 0:00:00.000
  16:8134      0 days 0:00:00.000
  15:12914     0 days 0:00:00.000
  13:11fd8     0 days 0:00:00.000
  12:13044     0 days 0:00:00.000
  11:3ef8      0 days 0:00:00.000
  10:1160c     0 days 0:00:00.000
   9:13a5c     0 days 0:00:00.000
   8:11924     0 days 0:00:00.000
   7:12068     0 days 0:00:00.000
   6:1386c     0 days 0:00:00.000
   5:e73c      0 days 0:00:00.000
   4:13b34     0 days 0:00:00.000
   3:f4a0      0 days 0:00:00.000
   2:10cdc     0 days 0:00:00.000
   1:14688     0 days 0:00:00.000
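The same idea applies to the !runaway listing. The sketch below, again hypothetical helper code rather than anything shipped with WinDbg, parses rows of the form "thread:osid days h:mm:ss.mmm" and sorts them by user-mode time so the busiest threads come first. The sample rows are taken from the output above and deliberately shuffled.

```python
import re
from datetime import timedelta

# Rows copied from the !runaway output in this article, in shuffled order.
SAMPLE_RUNAWAY = """\
  42:1425c     0 days 0:00:00.062
  46:13320     0 days 0:00:00.156
  62:10980     0 days 0:00:00.000
  14:121e4     0 days 0:00:00.625
"""

ROW = re.compile(
    r"^\s*(\d+):([0-9a-fA-F]+)\s+(\d+) days (\d+):(\d{2}):(\d{2})\.(\d{3})\s*$"
)


def parse_runaway(text):
    """Return (thread number, OS thread id, user-mode time) tuples, longest time first."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # header lines such as "User Mode Time"
        thread = int(match.group(1))
        osid = match.group(2)
        days, hours, minutes, seconds, millis = (int(match.group(i)) for i in range(3, 8))
        elapsed = timedelta(
            days=days, hours=hours, minutes=minutes, seconds=seconds, milliseconds=millis
        )
        rows.append((thread, osid, elapsed))
    return sorted(rows, key=lambda row: row[2], reverse=True)


for thread, osid, elapsed in parse_runaway(SAMPLE_RUNAWAY)[:2]:
    print(f"thread {thread} (OS id {osid}): {elapsed.total_seconds():.3f} s")
```

Note that the absolute times here are tiny (well under a second), so !runaway on its own would not prove anything; it is the combination with the lock count from !threads that makes thread 46 the natural candidate.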
4. Switch to thread 46 with ~46s
0:000> ~46s
ntdll!NtWaitForMultipleObjects+0x14:
00007ff8`d8ac67c4 c3 ret
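5. Inspect the thread's managed call stack with !clrstack. The thread is fetching configuration from Redis. The Redis client is inside RedisNativeClient.Connect(), reached through TryConnectIfNeeded(), which indicates that the Redis access hit a problem and the client is trying to connect again.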
0:046> !clrstack
PDB symbol for clr.dll not loaded
OS Thread Id: 0x13320 (46)
Child SP         IP               Call Site
0000002f001fb3a8 00007ff8d8ac67c4 [HelperMethodFrame_1OBJ: 0000002f001fb3a8] System.Threading.Thread.JoinInternal(Int32)
0000002f001fb4b0 00007ff86780b33a ServiceStack.Redis.RedisNativeClient.Connect()
0000002f001fb580 00007ff86780b1d4 ServiceStack.Redis.RedisNativeClient.TryConnectIfNeeded()
0000002f001fb5c0 00007ff86781590a ServiceStack.Redis.RedisNativeClient.SendReceive[[System.__Canon, mscorlib]](Byte[][], System.Func`1<System.__Canon>, System.Action`1<System.Func`1<System.__Canon>>, Boolean)
0000002f001fb620 00007ff867815cb4 ServiceStack.Redis.RedisNativeClient.SendExpectData(Byte[][])
0000002f001fb690 00007ff867817a22 ServiceStack.Redis.RedisNativeClient.get_Info()
0000002f001fb700 00007ff86780ae9e ServiceStack.Redis.RedisClient.GetServerRole()
0000002f001fb730 00007ff867809ec3 ServiceStack.Redis.RedisResolver.CreateRedisClient(ServiceStack.Redis.RedisEndpoint, Boolean)
0000002f001fb7e0 00007ff867809bed ServiceStack.Redis.PooledRedisClientManager.CreateRedisClient()
0000002f001fb850 00007ff867805d2c ServiceStack.Redis.PooledRedisClientManager.GetClient()
0000002f001fb8f0 00007ff867663e75 ****.****.****.RedisPoolManager.GetClient(System.String)
0000002f001fb9a0 00007ff867663bc4 ****.****.****.CacheService.GetClient()
0000002f001fb9e0 00007ff867662ccd ****.****.****.ServiceConfigCacheService.GetAvailableConfigByCache(Boolean)
0000002f001fbc30 00007ff8676619a1 ****.****.****.ServiceDAC.GetAvailableConfig()
0000002f001fbc70 00007ff867661646 ****.****.****.ServiceConfigCache.GetConfigFromCache()
0000002f001fbcb0 00007ff86766157c ****.****.****.ServiceConfigCache..ctor()
0000002f001fbd10 00007ff86765e778 ****.****.****.ServiceConfigCache.get_Current()
0000002f001fbd80 00007ff86765fadb ****.****.****.TCPRounter.Load()
0000002f001fbee0 00007ff86765ebce ****.****.****.TCPRounter..ctor()
0000002f001fbfc0 00007ff86765d7dd ****.****.****.RounterService.GetService(****.****.****.RounterContext, ****.****.****.ProxyModel, System.Collections.Generic.List`1<System.String>)
0000002f001fc090 00007ff86765de1b ****.****.****.TCPProxy.GetService[[System.__Canon, mscorlib]](System.Collections.Generic.List`1<****.****.****.SPI.ServiceConfig>, Boolean, System.Collections.Generic.List`1<System.String>)
…………
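When reading a !clrstack listing, the two most useful facts are usually what the thread is doing right now (the top frame) and which library frame it entered most recently. The following Python sketch extracts both from saved output. The helper names call_sites and first_frame_in are hypothetical, and the parsing assumes the "Child SP, IP, Call Site" layout shown above, with 16-digit hexadecimal addresses.

```python
import re

# A few frames copied from the !clrstack output in this article.
SAMPLE_STACK = """\
0000002f001fb3a8 00007ff8d8ac67c4 [HelperMethodFrame_1OBJ: 0000002f001fb3a8] System.Threading.Thread.JoinInternal(Int32)
0000002f001fb4b0 00007ff86780b33a ServiceStack.Redis.RedisNativeClient.Connect()
0000002f001fb580 00007ff86780b1d4 ServiceStack.Redis.RedisNativeClient.TryConnectIfNeeded()
0000002f001fb850 00007ff867805d2c ServiceStack.Redis.PooledRedisClientManager.GetClient()
"""

# Child SP, IP, an optional "[FrameType: address]" marker, then the call site.
FRAME = re.compile(
    r"^([0-9a-fA-F]{16})\s+([0-9a-fA-F]{16})\s+(?:\[[^\]]*\]\s+)?(.+)$"
)


def call_sites(text):
    """Return the call-site strings of a !clrstack listing, innermost frame first."""
    sites = []
    for line in text.splitlines():
        match = FRAME.match(line.strip())
        if match:
            sites.append(match.group(3))
    return sites


def first_frame_in(sites, prefix):
    """Return the innermost call site whose name starts with `prefix`, or None."""
    return next((site for site in sites if site.startswith(prefix)), None)


sites = call_sites(SAMPLE_STACK)
print("top frame:      ", sites[0])
print("innermost Redis:", first_frame_in(sites, "ServiceStack.Redis."))
```

For this dump the answer is that the thread is blocked in Thread.JoinInternal, called from RedisNativeClient.Connect(): the worker is waiting for a connection attempt to finish.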
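6. Use !dso to list the managed objects on the thread's stack. The listing contains Redis connection exceptions (RedisRetryableException) alongside the RedisClient objects that are trying to connect.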
0:046> !dso
OS Thread Id: 0x13320 (46)
RSP/REG          Object           Name
0000002F001FAF50 000001d5482fd000 System.Web.Http.Filters.IExceptionFilter[]
0000002F001FB0D8 000001d5482a32c0 System.Web.AspNetSynchronizationContext
0000002F001FB1F8 000001d7c84548e0 System.Threading.ThreadStart
0000002F001FB248 000001d7c8454858 System.Threading.Thread
0000002F001FB290 000001d5482fd000 System.Web.Http.Filters.IExceptionFilter[]
0000002F001FB2C0 000001d7c8454858 System.Threading.Thread
0000002F001FB348 000001d7c8454818 System.Threading.ThreadStart
0000002F001FB368 000001d7c8454858 System.Threading.Thread
0000002F001FB3D8 000001d5482aa550 System.Security.Principal.GenericPrincipal
0000002F001FB3F0 000001d7c8454818 System.Threading.ThreadStart
0000002F001FB410 000001d5482fd000 System.Web.Http.Filters.IExceptionFilter[]
0000002F001FB420 000001d5482fd2f0 System.Web.Http.Controllers.ExceptionFilterResult
0000002F001FB428 000001d7c8452838 System.Random
0000002F001FB430 000001d54834dd30 System.Threading.ExecutionContext
0000002F001FB440 000001d7c8454858 System.Threading.Thread
0000002F001FB460 000001d5482fd2f0 System.Web.Http.Controllers.ExceptionFilterResult
0000002F001FB488 000001d5482aa550 System.Security.Principal.GenericPrincipal
0000002F001FB490 000001d5482fd2f0 System.Web.Http.Controllers.ExceptionFilterResult
0000002F001FB498 000001d7c8452838 System.Random
0000002F001FB4B0 000001d7c8453d48 System.Func`1[[System.Byte[], mscorlib]]
0000002F001FB4C0 000001d7c8454818 System.Threading.ThreadStart
0000002F001FB4C8 000001d7c8454858 System.Threading.Thread
0000002F001FB550 000001d7c8453d48 System.Func`1[[System.Byte[], mscorlib]]
0000002F001FB558 000001d7c8453140 ServiceStack.Redis.RedisClient
0000002F001FB560 000001d7c8453cd8 System.Byte[][]
0000002F001FB568 000001d7c8452838 System.Random
0000002F001FB580 000001d7c8453140 ServiceStack.Redis.RedisClient
0000002F001FB5A8 000001d7c8454618 ServiceStack.Redis.RedisRetryableException
0000002F001FB5B0 000001d7c8453cd8 System.Byte[][]
0000002F001FB5D0 000001d7c8454618 ServiceStack.Redis.RedisRetryableException
0000002F001FB5D8 000001d7c8453cd8 System.Byte[][]
0000002F001FB5E8 000001d7c8454148 ServiceStack.Redis.RedisRetryableException
0000002F001FB600 000001d7c8453140 ServiceStack.Redis.RedisClient
0000002F001FB608 000001d7c8453cd8 System.Byte[][]
0000002F001FB620 000001d7c8453140 ServiceStack.Redis.RedisClient
0000002F001FB628 000001d7c8453140 ServiceStack.Redis.RedisClient
0000002F001FB630 000001d7c8453cd8 System.Byte[][]
0000002F001FB638 000001d7c8453d48 System.Func`1[[System.Byte[], mscorlib]]
0000002F001FB650 000001d7c8452838 System.Random
0000002F001FB658 000001d7c8453cf8 System.Collections.Generic.Dictionary`2[[System.String, mscorlib],[System.String, mscorlib]]
0000002F001FB668 000001d7c8453140 ServiceStack.Redis.RedisClient
0000002F001FB670 000001d7c8453cd8 System.Byte[][]
0000002F001FB678 000001d7c8452838 System.Random
0000002F001FB680 000001d5482fd2f0 System.Web.Http.Controllers.ExceptionFilterResult
0000002F001FB690 000001d7c8453cd8 System.Byte[][]
0000002F001FB6A0 000001d5485ae110 System.Byte[]
0000002F001FB6B0 000001d7c8452838 System.Random
0000002F001FB6B8 000001d54858ec78 ServiceStack.Redis.RedisResolver
0000002F001FB6C8 000001d7c8453140 ServiceStack.Redis.RedisClient
0000002F001FB6E0 000001d5482fd000 System.Web.Http.Filters.IExceptionFilter[]
0000002F001FB6E8 000001d7c8452838 System.Random
0000002F001FB6F0 000001d5482fd2f0 System.Web.Http.Controllers.ExceptionFilterResult
0000002F001FB700 000001d5485a6c78 ServiceStack.Redis.RedisEndpoint
0000002F001FB718 000001d5485a6c78 ServiceStack.Redis.RedisEndpoint
0000002F001FB720 000001d5485a6c78 ServiceStack.Redis.RedisEndpoint
0000002F001FB788 000001d7c8453140 ServiceStack.Redis.RedisClient
0000002F001FB790 000001d7c8452950 ServiceStack.Redis.RedisClient
0000002F001FB7B0 000001d54858ec78 ServiceStack.Redis.RedisResolver
0000002F001FB7B8 000001d54858e7d8 ServiceStack.Redis.PooledRedisClientManager
0000002F001FB7C0 000001d7c8452818 ServiceStack.Redis.PooledRedisClientManager+<>c__DisplayClass77_0
0000002F001FB7C8 000001d7c8452838 System.Random
0000002F001FB7E0 000001d54858ec78 ServiceStack.Redis.RedisResolver
0000002F001FB810 000001d54858ebe8 System.Object
0000002F001FB828 000001d54858ebc8 System.Collections.Concurrent.ConcurrentStack`1[[ServiceStack.Redis.RedisClient, ServiceStack.Redis]]
0000002F001FB838 000001d5482fd1d0 System.Web.Http.ExceptionHandling.CompositeExceptionLogger
0000002F001FB868 000001d548334ad8 System.Object[] (System.Object[])
0000002F001FB870 000001d7c84527f0 System.Diagnostics.Stopwatch
0000002F001FB8C8 000001d548334ad8 System.Object[] (System.Object[])
0000002F001FB8D0 000001d7c84527f0 System.Diagnostics.Stopwatch
………………
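7. Use !mex.do2 on one of the RedisClient objects to see the connection details. The client targets the Redis instance on the same server (Host 127.0.0.1, Port 6379).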
0:046> !mex.do2 000001d7c8452950
0x000001d7c8452950 ServiceStack.Redis.RedisClient
[statics]
0000 endData : 000001d7c8452ad0 (System.Byte[]) [Length: 2]
0008 lastCommand : NULL
0010 lastSocketException : NULL
0018 socket : NULL
0020 Bstream : NULL
0028 sslStream : NULL
0030 transaction : 000001d7c8452988 (System.Void)
0038 pipeline : NULL
0040 <ClientManager>k__BackingField : NULL
0048 <Host>k__BackingField : 000001d5485a6c20 "127.0.0.1" [9] (System.String)
0050 <NamespacePrefix>k__BackingField : NULL
0058 <Password>k__BackingField : 000001d5485a6b60 "***********" [16] (System.String)
0060 <Client>k__BackingField : NULL
0068 <ConnectionFilter>k__BackingField : NULL
0070 <SendCmdFilter>k__BackingField : NULL
0078 cmdBuffer : 000001d7c8452af0 (System.Collections.Generic.List<System.ArraySegment<System.Byte>>) [Length: 0]
0080 currentBuffer : 000001d7c8452b18 (System.Byte[]) [Length: 1450]
0088 <OnBeforeFlush>k__BackingField : NULL
0090 deactivatedAtTicks : 0 (System.Int64)
0098 LastConnectedAtTimestamp : 0 (System.Int64)
00a0 <Id>k__BackingField : 0 (System.Int64)
00a8 db : 0 (System.Int64)
00b0 clientPort : 0 (System.Int32)
00b4 active : 0 (System.Int32)
00b8 <Port>k__BackingField : 6379 (System.Int32)
00bc <ConnectTimeout>k__BackingField : -1 (System.Int32)
00c0 <RetryCount>k__BackingField : 0 (System.Int32)
00c4 <SendTimeout>k__BackingField : -1 (System.Int32)
00c8 <ReceiveTimeout>k__BackingField : -1 (System.Int32)
00cc <IdleTimeOutSecs>k__BackingField : 240 (System.Int32)
00d0 currentBufferIndex : 0 (System.Int32)
00d4 <Ssl>k__BackingField : False (System.Boolean)
00d5 <IsDisposed>k__BackingField : False (System.Boolean)
00d8 <SslProtocols>k__BackingField : 000001d7c8452a30 (System.Nullable<System.Security.Authentication.SslProtocols>)
00e0 retryTimeout : 000001d7c8452a38 00:00:03 (System.TimeSpan)
00e8 <Name>k__BackingField : NULL
00f0 OnDispose : NULL
00f8 registeredTypeIdsWithinPipelineMap : 000001d7c8452a80 (System.Collections.Generic.Dictionary<System.String,System.Collections.Generic.HashSet<System.String>>)
0100 <Hashes>k__BackingField : 000001d7c8453128 (ServiceStack.Redis.RedisClient+RedisClientHashes)
0108 <Lists>k__BackingField : 000001d7c84530e0 (ServiceStack.Redis.RedisClient+RedisClientLists)
0110 <Sets>k__BackingField : 000001d7c84530f8 (ServiceStack.Redis.RedisClient+RedisClientSets)
0118 <SortedSets>k__BackingField : 000001d7c8453110 (ServiceStack.Redis.RedisClient+RedisClientSortedSets)
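8. Use !mex.do2 on the exception object thrown while connecting to Redis. Its message, "Socket is not connected", confirms a Redis connection failure.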
0:046> !mex.do2 000001d7c8454618
0x000001d7c8454618 ServiceStack.Redis.RedisRetryableException
0000 _className : NULL
0008 _exceptionMethod : NULL
0010 _exceptionMethodString : NULL
0018 _message : 000001d5485afca8 "Socket is not connected" [23] (System.String)
0020 _data : NULL
0028 _innerException : NULL
0030 _helpURL : NULL
0038 _stackTrace : 000001d7c84546f8 (System.SByte[]) [Length: 48]
0040 _watsonBuckets : NULL
0048 _stackTraceString : NULL
0050 _remoteStackTraceString : NULL
0058 _dynamicMethods : NULL
0060 _source : NULL
0068 _safeSerializationManager : 000001d7c84546c0 (System.Runtime.Serialization.SafeSerializationManager)
0070 _xptrs : 0000000000000000 (System.IntPtr)
0078 _ipForWatsonBuckets : 00007ff867815a33 (System.UIntPtr)
0080 _remoteStackIndex : 0 (System.Int32)
0084 _HResult : -2146233088 (System.Int32)
0088 _xcode : -532462766 (System.Int32)
0090 <Code>k__BackingField : NULL
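The _HResult and _xcode fields are printed as signed 32-bit integers, which hides their usual hexadecimal form. A small Python sketch of the conversion (the helper name to_unsigned32 is hypothetical):

```python
def to_unsigned32(value: int) -> int:
    """Reinterpret a signed 32-bit integer as its unsigned bit pattern."""
    return value & 0xFFFFFFFF


hresult = to_unsigned32(-2146233088)  # _HResult from the dump
xcode = to_unsigned32(-532462766)     # _xcode from the dump

print(f"_HResult = 0x{hresult:08X}")
print(f"_xcode   = 0x{xcode:08X}")
```

The first value is 0x80131500 (COR_E_EXCEPTION, the generic HRESULT of System.Exception) and the second is 0xE0434352, the code the CLR uses when raising managed exceptions. Neither says anything specific about Redis, so the _message string is the informative field in this object.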
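9. Root cause: checking the Redis console window showed that Redis had not started properly. After Redis was restarted, service access returned to normal.
When the Redis window was inspected, Redis was found in a hung state, as shown in the figure below:
After restarting Redis, the figure below shows it running normally:
Root cause confirmed: when the [QuickEdit Mode] property of the Redis console window is checked, clicking anywhere inside the window during startup suspends the startup, leaving Redis unavailable.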
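A hung Redis process may still have its port open, so a plain TCP connect is not a reliable health check. The Python sketch below sends a PING and waits for any reply line within a timeout. It is only an illustration under stated assumptions: the function name redis_responds is hypothetical, the host and port are the ones seen in the dump, and because this instance has a password configured, an error reply (a line starting with "-") is also counted as "responsive".

```python
import socket


def redis_responds(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if the server sends any Redis reply to PING within `timeout` seconds."""
    try:
        # The timeout applies to the connect and to the later send/recv calls.
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"PING\r\n")  # Redis inline command form
            reply = conn.recv(64)
    except OSError:  # connection refused, timed out, reset, ...
        return False
    # "+PONG" on success; "-..." (for example an authentication error) still
    # proves that the server is processing commands.
    return reply[:1] in (b"+", b"-")


if __name__ == "__main__":
    # 127.0.0.1:6379 is the endpoint found in the RedisClient object above.
    print("Redis responsive:", redis_responds("127.0.0.1", 6379))
```

Running a check like this before and after the restart gives a quick confirmation that the fix took effect, independent of the gateway service itself.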
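IV. Summary of the analysis method:
1. Look for factors related to the problem (locks, execution time, and so on).
2. Identify the suspect thread and inspect its call stack.
3. Inspect the managed objects on the stack to get the detailed exception information.
4. Pin down the root cause.
5. Fix and verify.
That concludes the summary of this gateway request hang. With the WinDbg debugger, we analyzed the dump file, examined how the business logic was executing internally, and found the root cause. We are sharing the tool and the analysis process in the hope that they will be useful to others.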
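The first three steps can also be run non-interactively with the console debugger cdb, which accepts a dump file through -z and an initial command string through -c. The Python sketch below only builds such a command line; it does not launch anything. The function name and the dump path are hypothetical, and it assumes cdb.exe is on the PATH and that the suspect thread number is already known.

```python
def build_cdb_command(dump_path: str, thread_number: int) -> list:
    """Build an argument list that replays the analysis commands against a dump."""
    commands = [
        ".loadby sos clr",     # load SOS from the directory of clr.dll
        "!threads",            # managed threads and lock counts
        "!runaway",            # user-mode time per thread
        f"~{thread_number}s",  # switch to the suspect thread
        "!clrstack",           # its managed call stack
        "!dso",                # managed objects on its stack
        "q",                   # quit when done
    ]
    return ["cdb", "-z", dump_path, "-c", "; ".join(commands)]


# Hypothetical dump path, for illustration only.
argv = build_cdb_command(r"C:\dumps\sg-service.dmp", 46)
print(argv)
```

The list can be handed to subprocess.run with captured output, which makes it straightforward to archive the debugger output next to the dump.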
張寧濤
Updated 2022/05/05