SQL Server 之 GROUP BY、GROUPING SETS、ROLLUP、CUBE


1.創建表 Staff

CREATE TABLE [dbo].[Staff](
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [Name] [varchar](50) NULL,
    [Sex] [varchar](50) NULL,
    [Department] [varchar](50) NULL,
    [Money] [int] NULL,
    [CreateDate] [datetime] NULL
) ON [PRIMARY]

GO

 

2.為Staff表填充數據

INSERT INTO [dbo].[Staff]([Name],[Sex],[Department],[Money],[CreateDate])
SELECT 'Name1','','技術部',3000,'2011-11-12'
UNION ALL
SELECT 'Name2','','工程部',4000,'2013-11-12'
UNION ALL
SELECT 'Name3','','工程部',3000,'2013-11-12'
UNION ALL
SELECT 'Name4','','技術部',5000,'2012-11-12'
UNION ALL
SELECT 'Name5','','技術部',6000,'2011-11-12'
UNION ALL
SELECT 'Name6','','技術部',4000,'2013-11-12'
UNION ALL
SELECT 'Name7','','技術部',5000,'2012-11-12'
UNION ALL
SELECT 'Name8','','工程部',3000,'2012-11-12'
UNION ALL
SELECT 'Name9','','工程部',6000,'2011-11-12'
UNION ALL
SELECT 'Name10','','工程部',3000,'2011-11-12'
UNION ALL
SELECT 'Name11','','技術部',3000,'2011-11-12'
 

 

GROUP BY 分組查詢, 一般和聚合函數配合使用

SELECT  [DEPARTMENT],SEX, COUNT(1)
FROM DBO.[STAFF] 
GROUP BY SEX, [DEPARTMENT]  

該段SQL是用於查詢   某個部門下的男女員工數量 其數據結果如下

開銷比較大

 

GROUPING SETS

使用 GROUPING SETS 的 GROUP BY 子句可以生成一個等效於由多個簡單 GROUP BY 子句的 UNION ALL 生成的結果集,並且其效率比 GROUP BY 要高,SQL Server 2008引入。

1.使用GROUP BY 子句的 UNION ALL 來統計 Staff 表中的性別、部門、薪資、入職年份

SET STATISTICS IO ON  
SET STATISTICS TIME ON

SELECT N'總人數' ,'',COUNT(0) FROM [DBO].[STAFF]
UNION ALL  
SELECT N'按性別划分', SEX,COUNT(0) FROM  [DBO].[STAFF] GROUP BY SEX  
UNION ALL  
SELECT N'按部門統計',[DEPARTMENT],COUNT(0) FROM  [DBO].[STAFF] GROUP BY [DEPARTMENT]  
UNION ALL  
SELECT N'按薪資統計',CONVERT(VARCHAR(10),[MONEY]),COUNT(0) FROM  [DBO].[STAFF] GROUP BY  [MONEY] 
UNION ALL  
SELECT N'按入職年份',CONVERT(VARCHAR(10),YEAR([CREATEDATE])),COUNT(0) FROM  [DBO].[STAFF] GROUP BY YEAR([CREATEDATE])  
 


2.換成GROUPING SETS的寫法

SET STATISTICS IO ON  
SET STATISTICS TIME ON  
GO
SELECT (CASE  
WHEN GROUPING_ID(SEX,[DEPARTMENT],[MONEY],YEAR([CREATEDATE]))=15 THEN N'總人數' 
WHEN GROUPING_ID(SEX,[DEPARTMENT],[MONEY],YEAR([CREATEDATE]))=7 THEN N'按性別划分'  
WHEN GROUPING_ID(SEX,[DEPARTMENT],[MONEY],YEAR([CREATEDATE]))=11 THEN N'按部門統計'  
WHEN GROUPING_ID(SEX,[DEPARTMENT],[MONEY],YEAR([CREATEDATE]))=13 THEN N'按薪資統計'   
WHEN GROUPING_ID(SEX,[DEPARTMENT],[MONEY],YEAR([CREATEDATE]))=14 THEN N'按入職年份'   
END  
),
(CASE  
WHEN GROUPING_ID(SEX,[DEPARTMENT],[MONEY],YEAR([CREATEDATE]))=15 THEN ''
WHEN GROUPING_ID(SEX,[DEPARTMENT],[MONEY],YEAR([CREATEDATE]))=7 THEN SEX  
WHEN GROUPING_ID(SEX,[DEPARTMENT],[MONEY],YEAR([CREATEDATE]))=11 THEN [DEPARTMENT]  
WHEN GROUPING_ID(SEX,[DEPARTMENT],[MONEY],YEAR([CREATEDATE]))=13 THEN CONVERT(VARCHAR(10),[MONEY])   
WHEN GROUPING_ID(SEX,[DEPARTMENT],[MONEY],YEAR([CREATEDATE]))=14 THEN CONVERT(VARCHAR(10),YEAR([CREATEDATE]))   
END  
) 
,
COUNT(1) 
FROM DBO.[STAFF]
GROUP BY GROUPING SETS (SEX,[DEPARTMENT],[MONEY],YEAR([CREATEDATE]),())
 

從上述結果中可以看出,采用UNION ALL 是多次掃描表,並將掃描后的查詢結果進行組合操作,會增加IO開銷,減少CPU和內存開銷。

采用GROUPING SETS 是一次性讀取所有數據,並在內存中進行聚合操作生成結果,減少IO開銷,對CPU和內存消耗增加。但GROUPING SETS 在多列分組時,其性能會比group by高。

這里掃描四次是因為我 GROUP BY GROUPING SETS (SEX,[DEPARTMENT],[MONEY],YEAR([CREATEDATE]),()) 了四列

 

ROLLUP與CUBE 

ROLLUP與CUBE  按一定的規則產生多種分組,然后按各種分組統計數據

ROLLUP與CUBE 區別:

  CUBE 會對所有的分組字段進行統計,然后合計。

  ROLLUP 按照分組順序,對第一個字段進行組內統計,最后給出合計。
 
下面看我查詢 
SELECT  
      CASE WHEN (GROUPING(SEX) = 1) THEN '統計-ROLLUP' 
            ELSE ISNULL(SEX, 'UNKNOWN') 
       END AS SEX ,
        COUNT(0)
FROM DBO.[STAFF] 
GROUP BY   SEX   WITH ROLLUP

SELECT  
      CASE WHEN (GROUPING(SEX) = 1) THEN '統計-CUBE' 
            ELSE ISNULL(SEX, 'UNKNOWN') 
       END AS SEX ,
        COUNT(0)
FROM DBO.[STAFF] 
GROUP BY   SEX   WITH CUBE

看不出差別,我們再加一列
SELECT  
      CASE WHEN (GROUPING(SEX) = 1) THEN '統計-ROLLUP' 
            ELSE ISNULL(SEX, 'UNKNOWN') 
       END AS SEX , 
      CASE WHEN (GROUPING([DEPARTMENT]) = 1) THEN '統計-ROLLUP' 
            ELSE ISNULL([DEPARTMENT], 'UNKNOWN') 
       END AS [DEPARTMENT], 
        COUNT(0) 
FROM DBO.[STAFF] 
GROUP BY   SEX,[DEPARTMENT]   WITH ROLLUP

SELECT  
      CASE WHEN (GROUPING(SEX) = 1) THEN '統計-CUBE' 
            ELSE ISNULL(SEX, 'UNKNOWN') 
       END AS SEX ,
      CASE WHEN (GROUPING([DEPARTMENT]) = 1) THEN  '統計-CUBE' 
            ELSE ISNULL([DEPARTMENT], 'UNKNOWN') 
       END AS [DEPARTMENT], 
        COUNT(0) 
FROM DBO.[STAFF] 
GROUP BY   SEX,[DEPARTMENT]  WITH CUBE

可以看出 使用 ROLLUP 會先統計分組下的,然后在對GROUP BY的第一列字段進行統計,最后計算總數,而 CUBE 則是先分組統計,然后統計GRUOP BY 的每個字段,最后進行匯總。

 

 http://www.cnblogs.com/woxpp/p/4688715.html 


免責聲明!

本站轉載的文章為個人學習借鑒使用,本站對版權不負任何法律責任。如果侵犯了您的隱私權益,請聯系本站郵箱yoyou2525@163.com刪除。



 
粵ICP備18138465號   © 2018-2025 CODEPRJ.COM