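I recently changed companies. The new company's project uses .NET Core with EF Core for data queries in the business layer. At a recent morning stand-up, QA reported that the list query on one page was extremely slow, taking tens of seconds. I was a little surprised: at my previous company our API queries had to return within 200 ms, and anything that took 1 second was already considered outrageous, so tens of seconds means there is basically no user experience left. Still, this is not an internet company, so the requirements may be less strict. After the meeting I looked at the lambda query logic, shown below.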
var doctor = await MyUtils.CurrentEmpl(_dbContext, doctorId);
var query = (await _dbContext.MedicalRecord
        .Include(x => x.TemplateConfig)
            .ThenInclude(x => x.TemplateDeptAuths)
                .ThenInclude(x => x.TemplateRoleAuths)
        .ToListAsync())
    .Where(x => x.InpatId == inpatId
        && x.TemplateConfig != null
        && x.TemplateConfig.TemplateDeptAuths
            .Any(da => da.DeptId == doctor.DeptId
                && da.IsAuth
                && da.TemplateRoleAuths != null
                && da.TemplateRoleAuths.Any(ra => ra.RoleId == doctor.EmplTypeId
                    && ra.AccessPermission > 0)))
    .AsQueryable();
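Going through the log, I found that the where conditions of this lambda query never reached the database, so none of their indexes were used. LINQ to EF generates the SQL and runs it against the database as soon as ToList() is called, then brings the data back into memory; the Where that follows is only evaluated in memory. With this multi-table Include, every row of all the tables involved is brought back, and some of those tables are large, so the query was very slow and often needed more than ten seconds to return. My optimization steps were as follows.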
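To see why the position of ToList() matters, here is a minimal sketch in plain LINQ with no EF Core involved. The class and method names are made up for illustration, and an in-memory `AsQueryable()` source stands in for a `DbSet`. It only shows which `Where` overload the compiler picks: on an `IQueryable` the predicate stays an expression tree that a provider such as EF Core could translate into SQL, while after `ToList()` it becomes an ordinary in-memory filter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredVersusMaterialized
{
    // Where() applied directly to the IQueryable: binds to Queryable.Where,
    // so the predicate is kept as an expression tree and nothing runs yet.
    public static IEnumerable<int> FilterOnQueryable(IQueryable<int> source)
    {
        return source.Where(x => x > 990);
    }

    // Where() applied after ToList(): the whole source is already in memory,
    // so this binds to Enumerable.Where and the predicate is a compiled delegate.
    public static IEnumerable<int> FilterAfterToList(IQueryable<int> source)
    {
        return source.ToList().Where(x => x > 990);
    }

    public static void Main()
    {
        // Stand-in for a DbSet; a real EF Core query would hit the database instead.
        IQueryable<int> source = Enumerable.Range(1, 1000).AsQueryable();

        Console.WriteLine(FilterOnQueryable(source) is IQueryable<int>);  // True
        Console.WriteLine(FilterAfterToList(source) is IQueryable<int>);  // False
    }
}
```

Both methods return the same ten numbers. The difference is only where the filter runs, which is exactly what decides whether the database sees a WHERE clause.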
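As a first step, I stopped the lambda expression from calling ToList() and removed some unnecessary null checks. The SQL log showed that, with the where conditions now included in the query, execution was much faster and returned within a few seconds. The code is as follows.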
var query = _dbContext.MedicalRecord
    .Include(x => x.TemplateConfig)
        .ThenInclude(x => x.TemplateDeptAuths)
            .ThenInclude(x => x.TemplateRoleAuths)
    .Where(x => x.InpatId == inpatId
        && x.TemplateConfig.TemplateDeptAuths.Any(da => da.DeptId == doctor.DeptId
            && da.IsAuth
            && da.TemplateRoleAuths.Any(ra => ra.RoleId == doctor.EmplTypeId
                && ra.AccessPermission > 0)));
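But this was still nowhere near fast enough. It is only a join across 4 indexed tables and should not be this slow, so I checked the log of the SQL that EF generated and found the following.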
SELECT "x.TemplateConfig.TemplateDeptAuths.TemplateRoleAuths".id,
       "x.TemplateConfig.TemplateDeptAuths.TemplateRoleAuths".access_permission,
       "x.TemplateConfig.TemplateDeptAuths.TemplateRoleAuths".dept_auth_id,
       "x.TemplateConfig.TemplateDeptAuths.TemplateRoleAuths".oper_id,
       "x.TemplateConfig.TemplateDeptAuths.TemplateRoleAuths".oper_time,
       "x.TemplateConfig.TemplateDeptAuths.TemplateRoleAuths".role_id
FROM medical_document.template_role_auth AS "x.TemplateConfig.TemplateDeptAuths.TemplateRoleAuths"
INNER JOIN (
    SELECT DISTINCT "x.TemplateConfig.TemplateDeptAuths0".id, t0.id AS id0
    FROM medical_document.template_dept_auth AS "x.TemplateConfig.TemplateDeptAuths0"
    INNER JOIN (
        SELECT DISTINCT "x.TemplateConfig1".id
        FROM medical_document.medical_record AS x1
        LEFT JOIN medical_document.template_config AS "x.TemplateConfig1"
            ON x1.template_id = "x.TemplateConfig1".id
        WHERE (x1.inpat_id = @__inpatId_0) AND EXISTS (
            SELECT 1
            FROM medical_document.template_dept_auth AS da1
            WHERE (((da1.dept_id = @__doctor_DeptId_1) AND (da1.is_auth = TRUE)) AND EXISTS (
                SELECT 1
                FROM medical_document.template_role_auth AS ra1
                WHERE ((ra1.role_id = @__doctor_EmplTypeId_2) AND (ra1.access_permission > 0))
                    AND (da1.id = ra1.dept_auth_id)))
                AND ("x.TemplateConfig1".id = da1.template_config_id))
    ) AS t0 ON "x.TemplateConfig.TemplateDeptAuths0".template_config_id = t0.id
) AS t1 ON "x.TemplateConfig.TemplateDeptAuths.TemplateRoleAuths".dept_auth_id = t1.id
ORDER BY t1.id0, t1.id
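Looking into it, the join logic is a mess. Some tables are joined again after the where conditions have already been applied, and some are joined more than once and scanned in full each time, for example template_dept_auth and template_role_auth. However you look at it, junk SQL like this is unusable, because both its readability and its efficiency are bound to be very poor. I had no choice but to rewrite the query in SQL by hand. When I tested it, the query returned the data I needed in roughly 35 ms.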
SELECT DISTINCT mr.*
FROM medical_document.template_role_auth AS tra
INNER JOIN medical_document.template_dept_auth AS tda ON tra.dept_auth_id = tda.id
INNER JOIN medical_document.template_config AS tc ON tda.template_config_id = tc.id
INNER JOIN medical_document.medical_record AS mr ON tc.id = mr.template_id
WHERE mr.inpat_id = 1122334455
  AND mr.template_id IS NOT NULL
  AND tda.dept_id = 316
  AND tda.is_auth = TRUE
  AND tra.role_id = 397
  AND tra.access_permission > 0
So I refactored the LINQ query, pulling the join conditions out into explicit joins. The LINQ query now looks like this.
var query = from tra in _dbContext.TemplateRoleAuth
            join tda in _dbContext.TemplateDeptAuth on tra.DeptAuthId equals tda.Id
            join tc in _dbContext.TemplateConfig on tda.TemplateConfigId equals tc.Id
            join mr in _dbContext.MedicalRecord on tc.Id equals mr.TemplateId
            where mr.InpatId == inpatId
                && mr.TemplateId != null
                && tda.DeptId == doctor.DeptId
                && tda.IsAuth
                && tra.RoleId == doctor.EmplTypeId
                && tra.AccessPermission > 0
            select mr;
var result = await query.Distinct().ToListAsync();
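After running it, I found that for all the parameter values I tried the query returned consistently within 200 ms. I pulled the SQL log that EF generated and the SQL was basically what I expected. The generated SQL is as follows.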
SELECT DISTINCT mr.id, mr.create_doctor_id, mr.create_doctor_name, mr.create_time,
       mr.designated_doctor_id, mr.has_comments, mr.inpat_id, mr.need_sign,
       mr.record_content, mr.record_name, mr.record_parent_type_id, mr.record_parent_type_name,
       mr.record_type_id, mr.sign_level,
       mr.signed1_doctor_id, mr.signed1_doctor_name, mr.signed1_time,
       mr.signed2_doctor_id, mr.signed2_doctor_name, mr.signed2_time,
       mr.signed3_doctor_id, mr.signed3_doctor_name, mr.signed3_time,
       mr.spell, mr.template_id,
       mr.unsigned_doctor_id, mr.unsigned_doctor_name, mr.unsigned_time,
       mr.wb, mr.write_doctor_id, mr.write_doctor_name, mr.write_time
FROM medical_document.template_role_auth AS tra
INNER JOIN medical_document.template_dept_auth AS tda ON tra.dept_auth_id = tda.id
INNER JOIN medical_document.template_config AS tc ON tda.template_config_id = tc.id
INNER JOIN medical_document.medical_record AS mr ON tc.id = mr.template_id
WHERE (((((mr.inpat_id = @__inpatId_0) AND mr.template_id IS NOT NULL)
    AND (tda.dept_id = @__doctor_DeptId_1))
    AND (tda.is_auth = TRUE))
    AND (tra.role_id = @__doctor_EmplTypeId_2))
    AND (tra.access_permission > 0)
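One detail of the rewritten query worth a second look is the DISTINCT. A join returns one row per matching auth row, so a record whose template has several matching template_role_auth rows comes back more than once, whereas the earlier Any() form returns each record at most once. Below is a minimal in-memory sketch of that difference using LINQ to Objects. The classes RoleAuth, DeptAuth and Record and all the sample data are hypothetical, trimmed-down stand-ins (template_config is left out and TemplateId is a plain int), not the project's real entities.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the article's entities, reduced to the filter columns.
public class RoleAuth { public int Id { get; set; } public int DeptAuthId { get; set; } public int RoleId { get; set; } public int AccessPermission { get; set; } }
public class DeptAuth { public int Id { get; set; } public int TemplateConfigId { get; set; } public int DeptId { get; set; } public bool IsAuth { get; set; } }
public class Record { public int Id { get; set; } public int InpatId { get; set; } public int TemplateId { get; set; } }

public static class JoinVersusAny
{
    // Made-up sample data: dept auth 1 has TWO matching role auth rows.
    static readonly List<Record> Records = new List<Record>
    {
        new Record { Id = 1, InpatId = 100, TemplateId = 10 },
        new Record { Id = 2, InpatId = 100, TemplateId = 20 },
        new Record { Id = 3, InpatId = 200, TemplateId = 10 },
    };
    static readonly List<DeptAuth> DeptAuths = new List<DeptAuth>
    {
        new DeptAuth { Id = 1, TemplateConfigId = 10, DeptId = 316, IsAuth = true },
        new DeptAuth { Id = 2, TemplateConfigId = 20, DeptId = 316, IsAuth = false },
    };
    static readonly List<RoleAuth> RoleAuths = new List<RoleAuth>
    {
        new RoleAuth { Id = 1, DeptAuthId = 1, RoleId = 397, AccessPermission = 1 },
        new RoleAuth { Id = 2, DeptAuthId = 1, RoleId = 397, AccessPermission = 2 },
        new RoleAuth { Id = 3, DeptAuthId = 2, RoleId = 397, AccessPermission = 1 },
    };

    // Nested Any() form: each record is tested once, so it appears at most once.
    public static List<int> ViaAny(int inpatId, int deptId, int roleId)
    {
        return Records
            .Where(mr => mr.InpatId == inpatId
                && DeptAuths.Any(da => da.TemplateConfigId == mr.TemplateId
                    && da.DeptId == deptId && da.IsAuth
                    && RoleAuths.Any(ra => ra.DeptAuthId == da.Id
                        && ra.RoleId == roleId && ra.AccessPermission > 0)))
            .Select(mr => mr.Id)
            .ToList();
    }

    // Join form: one output row per matching (role auth, dept auth, record) combination.
    public static List<int> ViaJoin(int inpatId, int deptId, int roleId, bool distinct)
    {
        var ids = from ra in RoleAuths
                  join da in DeptAuths on ra.DeptAuthId equals da.Id
                  join mr in Records on da.TemplateConfigId equals mr.TemplateId
                  where mr.InpatId == inpatId
                        && da.DeptId == deptId && da.IsAuth
                        && ra.RoleId == roleId && ra.AccessPermission > 0
                  select mr.Id;
        return (distinct ? ids.Distinct() : ids).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", ViaAny(100, 316, 397)));          // 1
        Console.WriteLine(string.Join(",", ViaJoin(100, 316, 397, false)));  // 1,1
        Console.WriteLine(string.Join(",", ViaJoin(100, 316, 397, true)));   // 1
    }
}
```

On this sample data the join without Distinct() returns record 1 twice, once per matching role auth row, which is why the hand-written SQL and the rewritten LINQ both deduplicate.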
That completes this round of tuning: the query went from around 18 s down to about 200 ms. A brief summary of the optimizations:
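- Do not call ToList to load data before the query expression is complete. Otherwise many tables get queried in full and pulled into memory before any join filtering has been applied.
- Do not look up other tables inside the where condition itself (for example through nested Any() calls). Each such lookup joins the table again and scans it again, which performs very poorly.
- As far as possible, join all the related tables before the where clause, and do the filtering in the where clause afterwards.
- Creating the key indexes can improve query efficiency dramatically.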
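Finally, even if you are comfortable with an ORM, I hope you will still write the SQL first to optimize the query, then express it through the ORM, and compare the generated SQL against your own to spot the differences and reduce the performance loss. Fundamentals matter; a framework is just a wrapper tool. I recently came across a well-written article on cnblogs (博客園) titled 《停止學習框架》 ("Stop Learning Frameworks"), which is worth a read.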