Hive Learning Path (13): Hive Analytic Window Functions (1): SUM, AVG, MIN, MAX


Data preparation

Data format

cookie1,2015-04-10,1
cookie1,2015-04-11,5
cookie1,2015-04-12,7
cookie1,2015-04-13,3
cookie1,2015-04-14,2
cookie1,2015-04-15,4
cookie1,2015-04-16,4

Create the database and table

create database if not exists cookie;
use cookie;
drop table if exists cookie1;
create table cookie1(cookieid string, createtime string, pv int) row format delimited fields terminated by ',';
load data local inpath "/home/hadoop/cookie1.txt" into table cookie1;
select * from cookie1;

Playing with SUM

Query

select 
   cookieid, 
   createtime, 
   pv, 
   sum(pv) over (partition by cookieid order by createtime rows between unbounded preceding and current row) as pv1, 
   sum(pv) over (partition by cookieid order by createtime) as pv2, 
   sum(pv) over (partition by cookieid) as pv3, 
   sum(pv) over (partition by cookieid order by createtime rows between 3 preceding and current row) as pv4, 
   sum(pv) over (partition by cookieid order by createtime rows between 3 preceding and 1 following) as pv5, 
   sum(pv) over (partition by cookieid order by createtime rows between current row and unbounded following) as pv6 
from cookie1;

Query results

Explanation

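pv1: running total of pv within the partition, from the first row to the current row. For example, pv1 on the 11th = pv on the 10th + pv on the 11th = 1+5 = 6; on the 12th = 10th + 11th + 12th = 1+5+7 = 13
pv2: same as pv1 here, because every createtime in the partition is distinct
pv3: sum of all pv in the partition (cookie1), i.e. 26 on every row
pv4: current row plus the 3 preceding rows. For example, 11th = 10th + 11th, 12th = 10th + 11th + 12th, 13th = 10th + 11th + 12th + 13th, 14th = 11th + 12th + 13th + 14th = 5+7+3+2 = 17
pv5: current row plus the 3 preceding rows plus the 1 following row. For example, 14th = 11th + 12th + 13th + 14th + 15th = 5+7+3+2+4 = 21
pv6: current row plus all following rows. For example, 13th = 13th + 14th + 15th + 16th = 3+2+4+4 = 13, and 14th = 14th + 15th + 16th = 2+4+4 = 10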
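The arithmetic above can be cross-checked outside Hive with a small Python sketch. This is not Hive code: the helper name `rows_frame_sum` and the convention of passing `None` for UNBOUNDED and `0` for CURRENT ROW are assumptions made purely for illustration. It only emulates a ROWS frame over the seven sample rows.

```python
# pv values for cookie1, ordered by createtime (2015-04-10 .. 2015-04-16).
PV = [1, 5, 7, 3, 2, 4, 4]


def rows_frame_sum(values, preceding, following):
    """Emulate SUM(...) OVER (ORDER BY ... ROWS BETWEEN ... AND ...).

    preceding / following are row counts; None stands for UNBOUNDED
    and 0 for CURRENT ROW (illustrative convention, not Hive syntax).
    """
    n = len(values)
    out = []
    for i in range(n):
        # Clamp the frame to the partition boundaries.
        start = 0 if preceding is None else max(0, i - preceding)
        end = n - 1 if following is None else min(n - 1, i + following)
        out.append(sum(values[start:end + 1]))
    return out


pv1 = rows_frame_sum(PV, None, 0)     # unbounded preceding .. current row
pv3 = rows_frame_sum(PV, None, None)  # whole partition
pv4 = rows_frame_sum(PV, 3, 0)        # 3 preceding .. current row
pv5 = rows_frame_sum(PV, 3, 1)        # 3 preceding .. 1 following
pv6 = rows_frame_sum(PV, 0, None)     # current row .. unbounded following

print(pv1)  # [1, 6, 13, 16, 18, 22, 26]
print(pv3)  # [26, 26, 26, 26, 26, 26, 26]
print(pv4)  # [1, 6, 13, 16, 17, 16, 13]
print(pv5)  # [6, 13, 16, 18, 21, 20, 13]
print(pv6)  # [26, 25, 20, 13, 10, 8, 4]
```

Near the edges of the partition the frame simply shrinks: the first row of pv4 has no preceding rows, so its sum is just its own pv.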

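If ORDER BY is given without a window clause, the default frame runs from the start of the partition to the current row. Strictly speaking, Hive's default is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which gives the same result as the ROWS form here only because no two rows in the partition share a createtime; rows that tie on the ORDER BY value all fall into the same frame.
If ORDER BY is omitted as well, all values in the partition are aggregated.
The key is understanding ROWS BETWEEN, also called the WINDOW clause:
PRECEDING: rows before the current row
FOLLOWING: rows after the current row
CURRENT ROW: the current row
UNBOUNDED: no limit in that direction
  UNBOUNDED PRECEDING: starting from the first row of the partition
  UNBOUNDED FOLLOWING: ending at the last row of the partition
AVG, MIN, and MAX are used in the same way as SUM.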
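One subtlety behind the default frame is that Hive documents it as a RANGE frame rather than a ROWS frame when ORDER BY is present. The two only differ when rows tie on the ORDER BY key. The sketch below emulates both in plain Python on hypothetical data in which two rows share a createtime; the function names and the sample rows are assumptions for illustration and are not part of the original data set.

```python
def running_sum_rows(rows):
    """ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW: count physical rows."""
    total, out = 0, []
    for _, pv in rows:
        total += pv
        out.append(total)
    return out


def running_sum_range(rows):
    """RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW: include every row
    whose ORDER BY key is <= the current row's key, so tied rows share a value."""
    return [sum(pv for key, pv in rows if key <= cur_key) for cur_key, _ in rows]


# Hypothetical rows, already sorted by createtime; the 11th appears twice.
rows = [
    ("2015-04-10", 1),
    ("2015-04-11", 5),
    ("2015-04-11", 7),
    ("2015-04-12", 3),
]

rows_result = running_sum_rows(rows)
range_result = running_sum_range(rows)

print(rows_result)   # [1, 6, 13, 16]
print(range_result)  # [1, 13, 13, 16]
```

With a ROWS frame the order among tied rows is not guaranteed, so the intermediate values 6 and 13 could swap rows between runs; writing the frame explicitly, as pv1 does, at least makes the intent unambiguous.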

Playing with AVG

Query

select 
   cookieid, 
   createtime, 
   pv, 
   avg(pv) over (partition by cookieid order by createtime rows between unbounded preceding and current row) as pv1, -- from the start of the partition to the current row
   avg(pv) over (partition by cookieid order by createtime) as pv2, -- default frame: start to current row, same result as pv1
   avg(pv) over (partition by cookieid) as pv3, -- all rows in the partition
   avg(pv) over (partition by cookieid order by createtime rows between 3 preceding and current row) as pv4, -- current row + 3 preceding rows
   avg(pv) over (partition by cookieid order by createtime rows between 3 preceding and 1 following) as pv5, -- current row + 3 preceding rows + 1 following row
   avg(pv) over (partition by cookieid order by createtime rows between current row and unbounded following) as pv6  -- current row + all following rows
from cookie1;

Query results
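As a cross-check on what the AVG query should return for the sample data, the frames can be recomputed in plain Python. This is a sketch under assumptions: `rows_frame` and `windowed_avg` are illustrative helper names, `None` stands for UNBOUNDED, and values are rounded to two decimals for readability, whereas Hive would return unrounded doubles.

```python
PV = [1, 5, 7, 3, 2, 4, 4]  # pv ordered by createtime, 2015-04-10 .. 2015-04-16


def rows_frame(values, i, preceding, following):
    """Rows visible to row i under a ROWS BETWEEN frame; None means UNBOUNDED."""
    start = 0 if preceding is None else max(0, i - preceding)
    end = len(values) if following is None else min(len(values), i + following + 1)
    return values[start:end]


def windowed_avg(values, preceding, following):
    result = []
    for i in range(len(values)):
        frame = rows_frame(values, i, preceding, following)
        # Divide by the rows actually in the frame, not the nominal frame size.
        result.append(round(sum(frame) / len(frame), 2))
    return result


avg_pv1 = windowed_avg(PV, None, 0)     # also pv2 for this data
avg_pv3 = windowed_avg(PV, None, None)
avg_pv4 = windowed_avg(PV, 3, 0)
avg_pv5 = windowed_avg(PV, 3, 1)
avg_pv6 = windowed_avg(PV, 0, None)

print(avg_pv1)  # [1.0, 3.0, 4.33, 4.0, 3.6, 3.67, 3.71]
print(avg_pv3)  # [3.71, 3.71, 3.71, 3.71, 3.71, 3.71, 3.71]
print(avg_pv4)  # [1.0, 3.0, 4.33, 4.0, 4.25, 4.0, 3.25]
print(avg_pv5)  # [3.0, 4.33, 4.0, 3.6, 4.2, 4.0, 3.25]
print(avg_pv6)  # [3.71, 4.17, 4.0, 3.25, 3.33, 4.0, 4.0]
```

The point worth noticing is the divisor: on the 10th, pv4 is 1/1 rather than 1/4, because only one row exists in the frame.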

Playing with MIN

Query

select 
   cookieid, 
   createtime, 
   pv, 
   min(pv) over (partition by cookieid order by createtime rows between unbounded preceding and current row) as pv1, -- from the start of the partition to the current row
   min(pv) over (partition by cookieid order by createtime) as pv2, -- default frame: start to current row, same result as pv1
   min(pv) over (partition by cookieid) as pv3, -- all rows in the partition
   min(pv) over (partition by cookieid order by createtime rows between 3 preceding and current row) as pv4, -- current row + 3 preceding rows
   min(pv) over (partition by cookieid order by createtime rows between 3 preceding and 1 following) as pv5, -- current row + 3 preceding rows + 1 following row
   min(pv) over (partition by cookieid order by createtime rows between current row and unbounded following) as pv6  -- current row + all following rows
from cookie1;

Query results
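The expected MIN values for the sample data can be worked out the same way. In this sketch the generic helper `windowed` (an illustrative name, not a Hive or library function) takes the aggregate as a parameter, with `None` again standing for UNBOUNDED.

```python
PV = [1, 5, 7, 3, 2, 4, 4]  # pv ordered by createtime, 2015-04-10 .. 2015-04-16


def windowed(agg, values, preceding, following):
    """Apply agg to each row's ROWS BETWEEN frame; None means UNBOUNDED."""
    n = len(values)
    out = []
    for i in range(n):
        start = 0 if preceding is None else max(0, i - preceding)
        stop = n if following is None else min(n, i + following + 1)
        out.append(agg(values[start:stop]))
    return out


min_pv1 = windowed(min, PV, None, 0)     # also pv2 for this data
min_pv3 = windowed(min, PV, None, None)
min_pv4 = windowed(min, PV, 3, 0)
min_pv5 = windowed(min, PV, 3, 1)
min_pv6 = windowed(min, PV, 0, None)

print(min_pv1)  # [1, 1, 1, 1, 1, 1, 1]
print(min_pv3)  # [1, 1, 1, 1, 1, 1, 1]
print(min_pv4)  # [1, 1, 1, 1, 2, 2, 2]
print(min_pv5)  # [1, 1, 1, 1, 2, 2, 2]
print(min_pv6)  # [1, 2, 2, 2, 2, 4, 4]
```

Because the smallest pv (1) sits on the first row, every frame anchored at the start of the partition returns 1; only the sliding frames pv4 to pv6 change once that row drops out.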

Playing with MAX

Query

select 
   cookieid, 
   createtime, 
   pv, 
   max(pv) over (partition by cookieid order by createtime rows between unbounded preceding and current row) as pv1, -- from the start of the partition to the current row
   max(pv) over (partition by cookieid order by createtime) as pv2, -- default frame: start to current row, same result as pv1
   max(pv) over (partition by cookieid) as pv3, -- all rows in the partition
   max(pv) over (partition by cookieid order by createtime rows between 3 preceding and current row) as pv4, -- current row + 3 preceding rows
   max(pv) over (partition by cookieid order by createtime rows between 3 preceding and 1 following) as pv5, -- current row + 3 preceding rows + 1 following row
   max(pv) over (partition by cookieid order by createtime rows between current row and unbounded following) as pv6  -- current row + all following rows
from cookie1;

Query results
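For MAX, the two unbounded frames are just running maxima, which suggests a slightly different way to verify the expected values: accumulate from the front for pv1 and from the back for pv6, and slice for the bounded frames. The variable names below are illustrative, and the code is a plain-Python emulation rather than anything Hive executes.

```python
from itertools import accumulate

PV = [1, 5, 7, 3, 2, 4, 4]  # pv ordered by createtime, 2015-04-10 .. 2015-04-16
n = len(PV)

# pv1 / pv2: running maximum from the first row to the current row.
max_pv1 = list(accumulate(PV, max))
# pv3: maximum over the whole partition, repeated on every row.
max_pv3 = [max(PV)] * n
# pv4: current row plus up to 3 preceding rows.
max_pv4 = [max(PV[max(0, i - 3):i + 1]) for i in range(n)]
# pv5: as pv4, plus 1 following row (slicing past the end is safe in Python).
max_pv5 = [max(PV[max(0, i - 3):i + 2]) for i in range(n)]
# pv6: running maximum computed from the last row backwards.
max_pv6 = list(accumulate(reversed(PV), max))[::-1]

print(max_pv1)  # [1, 5, 7, 7, 7, 7, 7]
print(max_pv3)  # [7, 7, 7, 7, 7, 7, 7]
print(max_pv4)  # [1, 5, 7, 7, 7, 7, 4]
print(max_pv5)  # [5, 7, 7, 7, 7, 7, 4]
print(max_pv6)  # [7, 7, 7, 4, 4, 4, 4]
```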

 

