SQL Row-Column Transposition


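Transposing rows into columns in SQL is nothing new, but in my technical consulting work I still find that many people are unfamiliar with it, so I am writing this post to record it. The article uses MySQL and takes the retrieval of collected metrics data as its running example. You can run the examples below on SQL Fiddle.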

First we need to create the database schema:

    CREATE TABLE Chart
        (`createTime` DateTime, `kpi` varchar(30), `field` varchar(30), `value` double);

    INSERT INTO Chart
        (`createTime`,`kpi`, `field`, `value`)
    VALUES
        ('2015-02-01 12:00:00', 'disk', 'disk', 20),
        ('2015-02-01 12:15:00', 'disk', 'disk', 30),
        ('2015-02-01 12:20:00', 'disk', 'disk', 25),
        ('2015-02-01 12:30:00', 'disk', 'disk', 25),
        ('2015-02-01 12:35:00', 'disk', 'disk', 25),
        ('2015-02-01 12:40:00', 'disk', 'disk', 25),

        ('2015-02-01 12:00:00', 'disk', 'disk-all', 20),
        ('2015-02-01 12:20:00', 'disk', 'disk-all', 30),
        ('2015-02-01 12:25:00', 'disk', 'disk-all', 25),
        ('2015-02-01 12:30:00', 'disk', 'disk-all', 25),
        ('2015-02-01 12:35:00', 'disk', 'disk-all', 25),
        ('2015-02-01 12:40:00', 'disk', 'disk-all', 25),
        ('2015-02-01 12:40:00', 'cpu', 'cpu-all', 25),
        ('2015-02-01 12:40:00', 'cpu', 'cpu', 25)
    ;

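The columns mean the following: createTime = the time the sample was collected, kpi = the metric being collected, field = a sub-category of the metric (one kpi can contain several fields), value = the collected value.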

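With the data structure in place, we want to fetch all the data for a specific kpi within a fixed time range. Note that a kpi may contain several fields, and some fields may have missed samples at certain times, so the missing points have to be filled with 0. In addition, charting libraries such as ECharts expect the horizontal and vertical axes as two separate arrays, so we need something like this: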

option = {
    ....

    xAxis : [
        {
            type : 'category',
            boundaryGap : false,
            data : ['周一','周二','周三','周四','周五','周六','周日']
        }
    ],
    yAxis : [
        {
            type : 'value',
            axisLabel : {
                formatter: '{value} °C'
            }
        }
    ],
    series : [
        {
            ....
            data:[11, 11, 15, 13, 12, 13, 10]
        },
        {
           ....
            data:[11, 11, 15, 13, 12, 13, 10]
        }
    ]
};

Fetching the horizontal axis is fairly easy:

SELECT DISTINCT createTime FROM Chart WHERE kpi = 'disk' AND (createTime BETWEEN '2015-02-01 12:00:00' AND '2015-02-01 12:25:00') ORDER BY createTime;

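But if we fetch the vertical axis with a plain query as well, our program has to fill in the missing values itself and make sure every data point lines up with the horizontal axis, so the application logic becomes fairly complex: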

SELECT createTime,kpi, field, value FROM Chart WHERE kpi = 'disk' and (createTime BETWEEN '2015-02-01 12:00:00' AND '2015-02-01 12:25:00');

Result:

createTime                   kpi    field      value
February, 01 2015 12:00:00   disk   disk       20
February, 01 2015 12:15:00   disk   disk       30
February, 01 2015 12:20:00   disk   disk       25
February, 01 2015 12:00:00   disk   disk-all   20
February, 01 2015 12:20:00   disk   disk-all   30
February, 01 2015 12:25:00   disk   disk-all   25
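
To make the application-side work concrete, here is a minimal JavaScript sketch of what the program would have to do with these rows: collect the sorted timestamps, group by field, and fill the gaps with 0. The helper name `toChartData` and the row shape (plain objects with string timestamps) are assumptions for illustration, not part of the original example.

```javascript
// Hypothetical helper: turn long-format rows into an x-axis array
// plus one zero-filled data array per field.
function toChartData(rows) {
  // Distinct timestamps in order; fixed-width 'YYYY-MM-DD HH:MM:SS'
  // strings sort chronologically when sorted as plain strings.
  const times = [...new Set(rows.map((r) => r.createTime))].sort();
  const indexOf = new Map(times.map((t, i) => [t, i]));

  const byField = new Map();
  for (const { createTime, field, value } of rows) {
    if (!byField.has(field)) {
      // Start every series with 0 at each point, so missed samples stay 0.
      byField.set(field, new Array(times.length).fill(0));
    }
    byField.get(field)[indexOf.get(createTime)] = value;
  }

  return {
    xAxis: times,
    series: [...byField].map(([name, data]) => ({ name, data })),
  };
}

// The six rows from the result above, written as plain objects.
const rows = [
  { createTime: "2015-02-01 12:00:00", field: "disk", value: 20 },
  { createTime: "2015-02-01 12:15:00", field: "disk", value: 30 },
  { createTime: "2015-02-01 12:20:00", field: "disk", value: 25 },
  { createTime: "2015-02-01 12:00:00", field: "disk-all", value: 20 },
  { createTime: "2015-02-01 12:20:00", field: "disk-all", value: 30 },
  { createTime: "2015-02-01 12:25:00", field: "disk-all", value: 25 },
];

const chart = toChartData(rows);
console.log(chart.xAxis);
console.log(chart.series);
```

Even in this small form the code needs an index lookup and a grouping step, and it has to be repeated wherever such data is charted.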

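Is there a better approach? There is, and it is the subject of this post: transposing in SQL. Suppose we could turn the returned data into the following: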

field      2015-02-01 12:00:00   2015-02-01 12:15:00   2015-02-01 12:20:00   2015-02-01 12:25:00
disk       20                    30                    25                    0
disk-all   20                    0                     30                    25

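Then the program has very little left to do. Since we can already fetch all the horizontal-axis values in sorted order, the transposition itself is simple: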

SELECT field,
SUM(IF(createTime = '2015-02-01 12:00:00', value, 0)) as '2015-02-01 12:00:00',
SUM(IF(createTime = '2015-02-01 12:15:00', value, 0)) as '2015-02-01 12:15:00',
SUM(IF(createTime = '2015-02-01 12:20:00', value, 0)) as '2015-02-01 12:20:00',
SUM(IF(createTime = '2015-02-01 12:25:00', value, 0)) as '2015-02-01 12:25:00' 
FROM Chart
WHERE kpi = 'disk' and (createTime BETWEEN '2015-02-01 12:00:00' AND '2015-02-01 12:25:00')
GROUP BY field

The data returned this way meets our needs.
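
With the data in this shape, feeding the chart is mostly a direct mapping. The sketch below assumes the database driver returns each row as a plain object keyed by column alias, which may differ in your setup; `pivotRowsToOption` is a hypothetical helper name, and the `name` and `type: "line"` series settings are illustrative choices.

```javascript
// Hypothetical helper: map transposed rows onto the xAxis/series
// structure of the ECharts option shown earlier.
function pivotRowsToOption(times, pivotRows) {
  return {
    xAxis: [{ type: "category", boundaryGap: false, data: times }],
    series: pivotRows.map((row) => ({
      name: row.field,
      type: "line",
      // Each timestamp is a column alias; Number() guards against
      // drivers that return aggregate values as strings.
      data: times.map((t) => Number(row[t])),
    })),
  };
}

// Timestamps from the horizontal-axis query.
const times = [
  "2015-02-01 12:00:00",
  "2015-02-01 12:15:00",
  "2015-02-01 12:20:00",
  "2015-02-01 12:25:00",
];

// Assumed shape of the two transposed rows.
const pivotRows = [
  {
    field: "disk",
    "2015-02-01 12:00:00": 20,
    "2015-02-01 12:15:00": 30,
    "2015-02-01 12:20:00": 25,
    "2015-02-01 12:25:00": 0,
  },
  {
    field: "disk-all",
    "2015-02-01 12:00:00": 20,
    "2015-02-01 12:15:00": 0,
    "2015-02-01 12:20:00": 30,
    "2015-02-01 12:25:00": 25,
  },
];

const option = pivotRowsToOption(times, pivotRows);
console.log(JSON.stringify(option.series));
```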


Let us break this SQL statement down.

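  1. First we use IF(createTime = '2015-02-01 12:00:00', value, 0) to turn each row into a row with one column per timestamp. The same IF supplies the 0 wherever the row's timestamp does not match the column, which gives the following: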

SQL:

SELECT field,
IF(createTime = '2015-02-01 12:00:00', value, 0) as '2015-02-01 12:00:00',
IF(createTime = '2015-02-01 12:15:00', value, 0) as '2015-02-01 12:15:00',
IF(createTime = '2015-02-01 12:20:00', value, 0) as '2015-02-01 12:20:00',
IF(createTime = '2015-02-01 12:25:00', value, 0) as '2015-02-01 12:25:00' 
FROM Chart
WHERE kpi = 'disk' and (createTime BETWEEN '2015-02-01 12:00:00' AND '2015-02-01 12:25:00');

Result:

field      2015-02-01 12:00:00   2015-02-01 12:15:00   2015-02-01 12:20:00   2015-02-01 12:25:00
disk       20                    0                     0                     0
disk       0                     30                    0                     0
disk       0                     0                     25                    0
disk-all   20                    0                     0                     0
disk-all   0                     0                     30                    0
disk-all   0                     0                     0                     25

  2. Now we can use the SQL aggregate function SUM together with GROUP BY to collapse these rows:

SQL:

SELECT field,
SUM(IF(createTime = '2015-02-01 12:00:00', value, 0)) as '2015-02-01 12:00:00',
SUM(IF(createTime = '2015-02-01 12:15:00', value, 0)) as '2015-02-01 12:15:00',
SUM(IF(createTime = '2015-02-01 12:20:00', value, 0)) as '2015-02-01 12:20:00',
SUM(IF(createTime = '2015-02-01 12:25:00', value, 0)) as '2015-02-01 12:25:00' 
FROM Chart
WHERE kpi = 'disk' and (createTime BETWEEN '2015-02-01 12:00:00' AND '2015-02-01 12:25:00')
GROUP BY field

The result is the same as the transposed table shown earlier.

Row-column transposition in SQL can be summarized in two steps:

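  1. Use conditional logic (MySQL: IF; SQL Server: CASE ... WHEN, and from SQL Server 2005 onward the PIVOT operator) to turn the data to be transposed into columns.
  2. Use an aggregate function (SUM, MAX, MIN, ...) with GROUP BY to merge the rows. With MAX and MIN you have to watch the boundaries of the data: if negative values exist and the default value is 0, MAX returns the wrong result. SUM is therefore generally the safest choice (adding 0 to any number does not change it), although MAX and MIN are also safe in specific scenarios.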
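
The boundary problem in step 2 is easy to check with a little arithmetic. The JavaScript sketch below only simulates the aggregation for a single output cell, using made-up values: each array holds what the IF expression yields for the rows of one group, namely the real value for the matching row and the default 0 for the others.

```javascript
// One output cell of the transposed table, before aggregation.
const negativeCell = [0, -5, 0]; // the real value is -5
const positiveCell = [0, 20, 0]; // the real value is 20

const sum = (values) => values.reduce((acc, v) => acc + v, 0);

console.log(sum(negativeCell)); // -5: correct, adding 0 changes nothing
console.log(Math.max(...negativeCell)); // 0: wrong, the default 0 beats -5
console.log(Math.max(...positiveCell)); // 20: correct only because 20 > 0
console.log(Math.min(...positiveCell)); // 0: wrong, the default 0 beats 20
```

So MAX is only safe when every real value is at least 0, and MIN only when every real value is at most 0.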

We can also merge the two requests above into one, which requires building the statement dynamically in MySQL:

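-- Build one SUM(IF(...)) column per distinct timestamp, in time order.
-- Note: GROUP_CONCAT output is truncated at group_concat_max_len (1024 by default).
SELECT
  @time_sql := GROUP_CONCAT(
    "SUM(IF(createTime = '", t.createTime, "', value, 0)) AS '", t.createTime, "'"
    ORDER BY t.createTime)
FROM (
  SELECT DISTINCT createTime FROM Chart
  WHERE kpi = 'disk' AND (createTime BETWEEN '2015-02-01 12:00:00' AND '2015-02-01 12:25:00')
) AS t;

-- Assemble the full statement; if no timestamps matched, only the field column is selected.
SET @v_sql = CONCAT("SELECT field", IF(ISNULL(@time_sql), " ", CONCAT(", ", @time_sql)),
  " FROM Chart WHERE kpi = 'disk' AND (createTime BETWEEN '2015-02-01 12:00:00' AND '2015-02-01 12:25:00') GROUP BY field");

PREPARE stmt FROM @v_sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;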
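
For comparison, the two-request variant builds the same kind of statement in application code once the timestamps are known. This is a minimal sketch under stated assumptions: `buildPivotSql` is a hypothetical helper, the `?` placeholders assume a driver that supports them, and the strict format check stands in for escaping because the timestamps end up inside the SQL text.

```javascript
// Hypothetical helper: build the transposing statement from the
// horizontal-axis timestamps returned by the first request.
function buildPivotSql(times) {
  const TIME_RE = /^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}$/;
  const columns = times.map((t) => {
    // Reject anything that is not a plain timestamp, since the value
    // is interpolated into the statement.
    if (!TIME_RE.test(t)) {
      throw new Error(`unexpected timestamp format: ${t}`);
    }
    return `SUM(IF(createTime = '${t}', value, 0)) AS '${t}'`;
  });
  // With no timestamps, only the field column is selected.
  return (
    ["SELECT field", ...columns].join(",\n  ") +
    "\nFROM Chart\nWHERE kpi = ? AND createTime BETWEEN ? AND ?\nGROUP BY field"
  );
}

const sql = buildPivotSql(["2015-02-01 12:00:00", "2015-02-01 12:15:00"]);
console.log(sql);
```

The kpi and the time range are left as parameters, so only the validated timestamps are ever concatenated into the statement.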

