一、准備工作
1、數據庫模型:
如你所見,EF模型是上圖中三個表,第四個則是數據庫視圖。
2、數據:
先在HeadAddress表中插入三條數據,再在EndAddress表中也插入三條數據,最后往Customer表中插入三萬條隨機數據作為測試數據。
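The original does not show how the random rows were produced. The following is only a sketch of one way to build them in memory before saving them: `CustomerRow`, `TestData`, and `Generate` are hypothetical names, and the fields are assumed from the columns that appear in the queries later in the article (Name, a nullable Sex, EndAddressId).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical plain class standing in for the Customer entity.
public class CustomerRow
{
    public string Name;
    public bool? Sex;          // null means "not specified"
    public int EndAddressId;   // assumed foreign key into the EndAddress rows
}

public static class TestData
{
    // Builds `count` random customers spread over `endAddressCount` addresses.
    public static List<CustomerRow> Generate(int count, int endAddressCount, int seed)
    {
        var random = new Random(seed);   // fixed seed keeps test runs repeatable
        var rows = new List<CustomerRow>(count);
        for (int i = 0; i < count; i++)
        {
            int sexCode = random.Next(3);   // 0, 1 or 2
            rows.Add(new CustomerRow
            {
                Name = "Customer" + i,
                Sex = sexCode == 2 ? (bool?)null : sexCode == 1,
                EndAddressId = random.Next(1, endAddressCount + 1)   // 1..endAddressCount
            });
        }
        return rows;
    }
}
```

A call such as `TestData.Generate(30000, 3, 42)` would give the 30,000 rows described above; mapping them onto the real entities and saving them is left to the actual data context.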
II. Efficiency comparison
1. View vs. cross-table query: iterate over all customer information (fields from HeadAddress, EndAddress, and Customer)
// View (deferred query, no ToList); each variant is a separate test
var temp = _DataContext.CustomerView;
foreach (var item in temp) { }

// Cross-table (deferred query, no ToList)
var temp = _DataContext.CustomerSet.Select(c => new
{
    Name = c.Name,
    Sex = c.Sex,
    Street = c.EndAddress.Street,
    Number = c.EndAddress.Number,
    Province = c.EndAddress.HeadAddress.Province,
    City = c.EndAddress.HeadAddress.City,
    County = c.EndAddress.HeadAddress.County
});
foreach (var item in temp) { }
The corresponding SQL:
-- View
SELECT
    [Extent1].[Name] AS [Name],
    [Extent1].[Sex] AS [Sex],
    [Extent1].[Province] AS [Province],
    [Extent1].[City] AS [City],
    [Extent1].[County] AS [County],
    [Extent1].[Street] AS [Street],
    [Extent1].[Number] AS [Number]
FROM [dbo].[CustomerView] AS [Extent1]

-- Cross-table
SELECT
    [Extent1].[EndAddressId] AS [EndAddressId],
    [Extent1].[Name] AS [Name],
    [Extent1].[Sex] AS [Sex],
    [Extent2].[Street] AS [Street],
    [Extent2].[Number] AS [Number],
    [Extent3].[Province] AS [Province],
    [Extent3].[City] AS [City],
    [Extent3].[County] AS [County]
FROM [dbo].[CustomerSet] AS [Extent1]
INNER JOIN [dbo].[EndAddressSet] AS [Extent2] ON [Extent1].[EndAddressId] = [Extent2].[Id]
INNER JOIN [dbo].[HeadAddressSet] AS [Extent3] ON [Extent2].[HeadAddressId] = [Extent3].[Id]
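Results:
In all the statistics that follow, I left the first run (run 0 in the chart above) out of the average, because EF's first access carries an initialization cost. As you can see, using the view gives no improvement for plain iteration. But when I changed the test code to: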
// View (ToList)
var temp = _DataContext.CustomerView.ToList();
foreach (var item in temp) { }

// Cross-table (ToList)
var temp = _DataContext.CustomerSet.Select(c => new
{
    Name = c.Name,
    Sex = c.Sex,
    Street = c.EndAddress.Street,
    Number = c.EndAddress.Number,
    Province = c.EndAddress.HeadAddress.Province,
    City = c.EndAddress.HeadAddress.City,
    County = c.EndAddress.HeadAddress.County
}).ToList();
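the timings changed noticeably:
The view's ToList takes almost the same time as before, even slightly less, while the cross-table query followed by ToList takes much longer. As for the reason, my guess is that materializing the view's result with ToList saves a lot of work inside the underlying data structures.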
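The averaging rule used for every measurement here (run the test several times, discard run 0, average the rest) can be sketched as a small helper. This is not the author's test harness, which is not shown; `Timing` and `AverageMs` are hypothetical names, and only `System.Diagnostics.Stopwatch` is a real API.

```csharp
using System;
using System.Diagnostics;

public static class Timing
{
    // Runs `action` (runs + 1) times and averages every run except the first.
    public static double AverageMs(Action action, int runs)
    {
        if (runs < 1) throw new ArgumentOutOfRangeException("runs");
        double totalMs = 0;
        for (int i = 0; i <= runs; i++)
        {
            var watch = Stopwatch.StartNew();
            action();
            watch.Stop();
            if (i > 0)   // run 0 is the warm-up and is not counted
                totalMs += watch.Elapsed.TotalMilliseconds;
        }
        return totalMs / runs;
    }
}
```

Each test body from this article would be passed in as the `action` lambda, for example the view query followed by its `foreach`.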
2. View vs. cross-table query: iterate over the customers whose province is "湖北" (fields from HeadAddress, EndAddress, and Customer)
// View
var temp = _DataContext.CustomerView.Where(c => c.Province == "湖北");
foreach (var item in temp) { }

// Cross-table
var temp = _DataContext.CustomerSet.Where(c => c.EndAddress.HeadAddress.Province == "湖北")
    .Select(c => new
    {
        Name = c.Name,
        Sex = c.Sex,
        Street = c.EndAddress.Street,
        Number = c.EndAddress.Number,
        Province = c.EndAddress.HeadAddress.Province,
        City = c.EndAddress.HeadAddress.City,
        County = c.EndAddress.HeadAddress.County
    });
foreach (var item in temp) { }
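The corresponding SQL is very similar to the statements above, with just a WHERE clause appended at the end. Results:
The two take essentially the same time. Likewise, if both are changed to use ToList, the view is about 5 ms faster than the cross-table query.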
3. Foreach vs. Linq: output every row whose Sex is not null as a List<T>; source is the customer set
var source = _DataContext.CustomerSet;

// foreach
List<Customer> temp = new List<Customer>();
foreach (var item in source)
    if (item.Sex != null)
        temp.Add(item);

// Linq
source.Where(c => c.Sex != null).ToList();
The SQL statements they execute:
-- foreach
SELECT
    [Extent1].[Id] AS [Id],
    [Extent1].[Name] AS [Name],
    [Extent1].[Sex] AS [Sex],
    [Extent1].[EndAddressId] AS [EndAddressId]
FROM [dbo].[CustomerSet] AS [Extent1]

-- Linq
SELECT
    [Extent1].[Id] AS [Id],
    [Extent1].[Name] AS [Name],
    [Extent1].[Sex] AS [Sex],
    [Extent1].[EndAddressId] AS [EndAddressId]
FROM [dbo].[CustomerSet] AS [Extent1]
WHERE [Extent1].[Sex] IS NOT NULL
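With foreach, every row is fetched first and then filtered on the client, whereas Linq's Where lets the database keep only the rows where Sex is not null before the List is built. I think you can guess the outcome: yes, Linq is far ahead of the traditional foreach:
This also suggests that ToList() is very efficient, although most of the gain here comes from filtering in the database rather than from ToList() itself.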
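The difference between the two SQL statements above can be imitated without a database. The sketch below is only an in-memory analogy, not EF: `FilterPlacement` and its `Query` method are hypothetical, and `Query` merely pretends to be the database by counting every row it hands over to the caller.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FilterPlacement
{
    // Stand-in for the database: streams the rows that pass the optional
    // source-side filter and counts every row handed to the client.
    static IEnumerable<bool?> Query(IEnumerable<bool?> table, Func<bool?, bool> where, int[] transferred)
    {
        foreach (var sex in table)
        {
            if (where != null && !where(sex)) continue;
            transferred[0]++;
            yield return sex;
        }
    }

    // foreach style: fetch everything, then test each row on the client.
    public static List<bool?> FilterOnClient(IEnumerable<bool?> table, int[] transferred)
    {
        var result = new List<bool?>();
        foreach (var sex in Query(table, null, transferred))
            if (sex != null)
                result.Add(sex);
        return result;
    }

    // Where style: the predicate travels with the query, so fewer rows move.
    public static List<bool?> FilterAtSource(IEnumerable<bool?> table, int[] transferred)
    {
        return Query(table, sex => sex != null, transferred).ToList();
    }
}
```

Both methods return the same list, but the counter shows that the client-side variant pulls every row across, which is the likely source of the time difference measured above.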
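4. SelectMany vs. Select New: use SelectMany and new to produce twice-grouped data. source is all customer information; the result is grouped first by sex and then by province, and the keys of both groupings must be kept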
var source = _DataContext.CustomerView;

// SelectMany (each variant is a separate test)
var result = source
    .GroupBy(c => c.Sex)
    .SelectMany(c => c
        .GroupBy(a => a.Province)
        .GroupBy(a => c.Key));
foreach (var items in result)
    foreach (var item in items)
        ;

// Select New
var result = source
    .GroupBy(c => c.Sex)
    .Select(c => new {
        Key = c.Key,
        Value = c.GroupBy(a => a.Province)
    });
foreach (var items in result)
    foreach (var item in items.Value)   // the anonymous type is not enumerable; iterate its Value
        ;
Result type obtained with SelectMany:
Result type obtained with Select New:
SQL executed by SelectMany:
SELECT
    [Project6].[Sex] AS [Sex],
    [Project6].[C2] AS [C1],
    [Project6].[C1] AS [C2],
    [Project6].[C4] AS [C3],
    [Project6].[Province] AS [Province],
    [Project6].[C3] AS [C4],
    [Project6].[Name] AS [Name],
    [Project6].[Sex1] AS [Sex1],
    [Project6].[Province1] AS [Province1],
    [Project6].[City] AS [City],
    [Project6].[County] AS [County],
    [Project6].[Street] AS [Street],
    [Project6].[Number] AS [Number]
FROM ( SELECT
    [Project4].[C1] AS [C1],
    [Project4].[Sex] AS [Sex],
    [Project4].[C2] AS [C2],
    [Filter3].[Province1] AS [Province],
    [Filter3].[Name] AS [Name],
    [Filter3].[Sex] AS [Sex1],
    [Filter3].[Province2] AS [Province1],
    [Filter3].[City] AS [City],
    [Filter3].[County] AS [County],
    [Filter3].[Street] AS [Street],
    [Filter3].[Number] AS [Number],
    CASE WHEN ([Filter3].[Province1] IS NULL) THEN CAST(NULL AS int) WHEN ([Filter3].[Name] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C3],
    CASE WHEN ([Filter3].[Province1] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C4]
    FROM (SELECT
        [Distinct3].[C1] AS [C1],
        [Distinct1].[Sex] AS [Sex],
        1 AS [C2]
        FROM (SELECT DISTINCT
            [Extent1].[Sex] AS [Sex]
            FROM [dbo].[CustomerView] AS [Extent1] ) AS [Distinct1]
        CROSS APPLY (SELECT DISTINCT
            [Distinct1].[Sex] AS [C1]
            FROM ( SELECT DISTINCT
                [Extent2].[Province] AS [Province]
                FROM [dbo].[CustomerView] AS [Extent2]
                WHERE ([Distinct1].[Sex] = [Extent2].[Sex]) OR (([Distinct1].[Sex] IS NULL) AND ([Extent2].[Sex] IS NULL))
            ) AS [Distinct2] ) AS [Distinct3] ) AS [Project4]
    OUTER APPLY (SELECT [Distinct4].[Province] AS [Province1], [Extent4].[Name] AS [Name], [Extent4].[Sex] AS [Sex], [Extent4].[Province] AS [Province2], [Extent4].[City] AS [City], [Extent4].[County] AS [County], [Extent4].[Street] AS [Street], [Extent4].[Number] AS [Number]
        FROM (SELECT DISTINCT
            [Extent3].[Province] AS [Province]
            FROM [dbo].[CustomerView] AS [Extent3]
            WHERE ([Project4].[Sex] = [Extent3].[Sex]) OR (([Project4].[Sex] IS NULL) AND ([Extent3].[Sex] IS NULL)) ) AS [Distinct4]
        LEFT OUTER JOIN [dbo].[CustomerView] AS [Extent4] ON (([Project4].[Sex] = [Extent4].[Sex]) OR (([Project4].[Sex] IS NULL) AND ([Extent4].[Sex] IS NULL))) AND ([Distinct4].[Province] = [Extent4].[Province])
        WHERE ([Project4].[C1] = [Project4].[Sex]) OR (([Project4].[C1] IS NULL) AND ([Project4].[Sex] IS NULL)) ) AS [Filter3]
) AS [Project6]
ORDER BY [Project6].[Sex] ASC, [Project6].[C1] ASC, [Project6].[C4] ASC, [Project6].[Province] ASC, [Project6].[C3] ASC
SQL executed by Select New:
SELECT
    [Project4].[C1] AS [C1],
    [Project4].[Sex] AS [Sex],
    [Project4].[C3] AS [C2],
    [Project4].[Province] AS [Province],
    [Project4].[C2] AS [C3],
    [Project4].[Name] AS [Name],
    [Project4].[Sex1] AS [Sex1],
    [Project4].[Province1] AS [Province1],
    [Project4].[City] AS [City],
    [Project4].[County] AS [County],
    [Project4].[Street] AS [Street],
    [Project4].[Number] AS [Number]
FROM ( SELECT
    [Project2].[Sex] AS [Sex],
    [Project2].[C1] AS [C1],
    [Join1].[Province1] AS [Province],
    [Join1].[Name] AS [Name],
    [Join1].[Sex] AS [Sex1],
    [Join1].[Province2] AS [Province1],
    [Join1].[City] AS [City],
    [Join1].[County] AS [County],
    [Join1].[Street] AS [Street],
    [Join1].[Number] AS [Number],
    CASE WHEN ([Join1].[Province1] IS NULL) THEN CAST(NULL AS int) WHEN ([Join1].[Name] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C2],
    CASE WHEN ([Join1].[Province1] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C3]
    FROM (SELECT
        [Distinct1].[Sex] AS [Sex],
        1 AS [C1]
        FROM ( SELECT DISTINCT
            [Extent1].[Sex] AS [Sex]
            FROM [dbo].[CustomerView] AS [Extent1]
        ) AS [Distinct1] ) AS [Project2]
    OUTER APPLY (SELECT [Distinct2].[Province] AS [Province1], [Extent3].[Name] AS [Name], [Extent3].[Sex] AS [Sex], [Extent3].[Province] AS [Province2], [Extent3].[City] AS [City], [Extent3].[County] AS [County], [Extent3].[Street] AS [Street], [Extent3].[Number] AS [Number]
        FROM (SELECT DISTINCT
            [Extent2].[Province] AS [Province]
            FROM [dbo].[CustomerView] AS [Extent2]
            WHERE ([Project2].[Sex] = [Extent2].[Sex]) OR (([Project2].[Sex] IS NULL) AND ([Extent2].[Sex] IS NULL)) ) AS [Distinct2]
        LEFT OUTER JOIN [dbo].[CustomerView] AS [Extent3] ON (([Project2].[Sex] = [Extent3].[Sex]) OR (([Project2].[Sex] IS NULL) AND ([Extent3].[Sex] IS NULL))) AND ([Distinct2].[Province] = [Extent3].[Province]) ) AS [Join1]
) AS [Project4]
ORDER BY [Project4].[Sex] ASC, [Project4].[C3] ASC, [Project4].[Province] ASC, [Project4].[C2] ASC
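Both approaches keep the key of each grouping level (sex, province). Syntactically, SelectMany is perhaps more compact and Select New is clearer; in terms of efficiency, SelectMany takes somewhat longer because of its additional projection step:
Still, we can also see that with 30,000 rows the time difference is not large. If we only need to keep the key of the last grouping level (province), the test code can be changed to: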
// Keep only the province grouping key
var result = source
    .GroupBy(c => c.Sex)
    .Select(c => c
        .GroupBy(a => a.Province));
foreach (var items in result)
    foreach (var item in items)
        ;
With this version the average time is around 309.4 ms, but we no longer know which group belongs to which sex:
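To make the two result shapes from this comparison concrete, here is a sketch that runs the same grouping logic with LINQ to Objects rather than against EF. `ViewRow`, `SexGroup`, and `NestedGrouping` are hypothetical names; `SexGroup` is a named stand-in for the anonymous type `{ Key, Value }`, used only so that the result can be returned from a method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one row of CustomerView.
public class ViewRow
{
    public string Name;
    public bool? Sex;
    public string Province;
}

// Named replacement for the anonymous type { Key, Value }.
public class SexGroup
{
    public bool? Key;
    public List<IGrouping<string, ViewRow>> Value;
}

public static class NestedGrouping
{
    // SelectMany variant: a flat list of groups keyed by sex,
    // each element of a group being a province group.
    public static List<IGrouping<bool?, IGrouping<string, ViewRow>>> WithSelectMany(IEnumerable<ViewRow> source)
    {
        return source
            .GroupBy(c => c.Sex)
            .SelectMany(c => c
                .GroupBy(a => a.Province)
                .GroupBy(a => c.Key))
            .ToList();
    }

    // Select New variant: one object per sex, holding its province groups.
    public static List<SexGroup> WithSelectNew(IEnumerable<ViewRow> source)
    {
        return source
            .GroupBy(c => c.Sex)
            .Select(c => new SexGroup
            {
                Key = c.Key,
                Value = c.GroupBy(a => a.Province).ToList()
            })
            .ToList();
    }
}
```

In both shapes the sex key and the province key stay reachable: through `Key` on the outer and inner `IGrouping` in the first case, and through `Key` plus `Value[i].Key` in the second.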
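5. SelectMany vs. double foreach: use SelectMany and a nested loop to walk the twice-grouped data and build a List<T> of statistics. source is the data set grouped first by sex and then by province; the List<T> holds the number of people of each sex in each province, where T has the fields Sex, Province, and Count.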
// Temporary result class
class TempDTO { public bool? Sex; public string Province; public int Count; }

// Data source
var source = _DataContext.CustomerView.GroupBy(c => c.Sex)
    .Select(c => new
    {
        Key = c.Key,
        Value = c.GroupBy(b => b.Province)
    });

// SelectMany (each variant is a separate test)
var temp = source.SelectMany(c => c.Value.Select(b => new TempDTO()
{
    Sex = c.Key,
    Province = b.Key,
    Count = b.Count()
})).ToList();

// Double foreach
List<TempDTO> temp = new List<TempDTO>();
foreach (var items in source)
{
    bool? sex = items.Key;
    foreach (var item in items.Value)
    {
        temp.Add(new TempDTO()
        {
            Sex = sex,
            Province = item.Key,
            Count = item.Count()
        });
    }
}
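Results:
Such a wide gap surprised me. At first I thought the ToList in the SelectMany version was the cause, but changing the test method did not change the outcome, and both methods produce exactly the same results. So I suspected that Linq's deferred execution was at work. Because source is an IQueryable<T>, nothing has actually been queried yet: with SelectMany the whole expression is translated and optimized into a single SQL statement, whereas foreach first computes every result in source and then fills the list one item at a time. This is easy to verify: just append ToList() to source: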
var source = _DataContext.CustomerView.GroupBy(c => c.Sex)
    .Select(c => new
    {
        Key = c.Key,
        Value = c.GroupBy(b => b.Province)
    }).ToList();
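The measured times were now very small, around 1 ms each, and repeated tests could not tell which method was better. So I ran each method in a loop of 50 iterations, with the following results:
So SelectMany and the double foreach are in fact equally efficient, provided of course that the data source has already been computed.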
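The deferred execution suspected above can be observed in isolation. The sketch below uses LINQ to Objects only, as an analogy for how an unmaterialized query is re-evaluated every time it is enumerated; it does not reproduce EF's SQL translation. `DeferredExecution`, `SourceRuns`, and `BuildQuery` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecution
{
    // Counts how many times the underlying source was enumerated from the start.
    public static int SourceRuns;

    static IEnumerable<int> Source()
    {
        SourceRuns++;   // executes only when enumeration actually begins
        for (int i = 1; i <= 3; i++)
            yield return i;
    }

    // Builds a query without executing it.
    public static IEnumerable<int> BuildQuery()
    {
        return Source().Select(x => x * 10);
    }
}
```

Building the query leaves `SourceRuns` at 0; each later enumeration of the query (a `foreach`, `Sum()`, `Count()`) runs the source again, while a list produced once by `ToList()` can be reused without touching the source. That matches the observation that the two approaches only become comparable once source has been materialized.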
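III. Summary
After writing database code for a while, I kept being bothered by these questions, worrying about this and that, hence this article. Overall, these methods perform about the same; the results differ only with the usage scenario (the key question being whether the data needs to be materialized and kept with ToList).
Please cite the original address when reposting: http://www.cnblogs.com/lekko/archive/2013/01/03/2843080.html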