1 Data preparation
-- The date column is named ymd so that it matches the queries below.
create table stocks(id int, ymd string, price string, company string);
insert into table stocks values
  (1,'2010-01-04','214.01','aapl'),
  (2,'2010-01-05','214.38','aapl'),
  (3,'2010-01-06','210.97','aapl'),
  (4,'2010-01-07','210.58','aapl'),
  (5,'2010-01-08','211.58','aapl'),
  (6,'2010-01-11','210.11','aapl'),
  (7,'2010-01-04','132.45','ibm'),
  (8,'2010-01-05','138.85','ibm'),
  (9,'2010-01-06','129.55','ibm'),
  (10,'2010-01-07','130.0','ibm'),
  (11,'2010-01-08','130.85','ibm'),
  (12,'2006-01-11','121.48','ibm'),
  (13,'2007-01-11','120.48','ibm'),
  (14,'2008-01-11','123.48','ibm');
2 Testing an equi-join, using a self-join of the table
select a.ymd, a.price, b.price from stocks a inner join stocks b on a.ymd = b.ymd where a.company = 'aapl' and b.company = 'ibm';
The result is:
2010-01-04 214.01 132.45
2010-01-05 214.38 138.85
2010-01-06 210.97 129.55
2010-01-07 210.58 130.0
2010-01-08 211.58 130.85
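The five rows above can be reproduced outside Hive. The following is a minimal Python sketch, not Hive's actual execution plan: it shows why an equality condition is cheap to evaluate, since rows can be bucketed by the join key (here ymd) and only rows in the same bucket are paired. The names STOCKS and equi_join are made up for this illustration, and the id column is left out because the query does not use it.

```python
from collections import defaultdict

# Rows copied from the stocks table above as (ymd, price, company).
STOCKS = [
    ("2010-01-04", "214.01", "aapl"), ("2010-01-05", "214.38", "aapl"),
    ("2010-01-06", "210.97", "aapl"), ("2010-01-07", "210.58", "aapl"),
    ("2010-01-08", "211.58", "aapl"), ("2010-01-11", "210.11", "aapl"),
    ("2010-01-04", "132.45", "ibm"), ("2010-01-05", "138.85", "ibm"),
    ("2010-01-06", "129.55", "ibm"), ("2010-01-07", "130.0", "ibm"),
    ("2010-01-08", "130.85", "ibm"), ("2006-01-11", "121.48", "ibm"),
    ("2007-01-11", "120.48", "ibm"), ("2008-01-11", "123.48", "ibm"),
]


def equi_join(rows):
    """Pair each aapl row with the ibm rows that have the same ymd."""
    # Bucket the ibm prices by ymd; the equality key makes this possible.
    ibm_by_day = defaultdict(list)
    for ymd, price, company in rows:
        if company == "ibm":
            ibm_by_day[ymd].append(price)

    # Each aapl row only looks at the bucket for its own ymd.
    result = []
    for ymd, price, company in rows:
        if company == "aapl":
            for ibm_price in ibm_by_day.get(ymd, []):
                result.append((ymd, price, ibm_price))
    return sorted(result)


EQUI_RESULT = equi_join(STOCKS)
for row in EQUI_RESULT:
    print(*row)
```

The aapl row for 2010-01-11 drops out because no ibm row has that date, which matches the inner join result above.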
3 Testing a non-equi join, using a self-join of the table
select a.ymd,b.ymd, a.price, b.price from stocks a inner join stocks b on a.ymd <= b.ymd where a.company = 'aapl' and b.company = 'ibm' order by a.ymd asc;
It fails with the following error:
FAILED: SemanticException Cartesian products are disabled for safety reasons. If you know what you are doing, please sethive.strict.checks.cartesian.product to false and that hive.mapred.mode is not set to 'strict' to proceed. Note that if you may get errors or incorrect results if you make a mistake while using some of the unsafe features.
Hive is currently running in strict mode. In this mode:
- Cartesian product joins between tables are not allowed
- An order by statement must include a limit: order by runs in a single reducer and easily becomes a performance bottleneck
- A query on a partitioned table must filter on a partition column in the where clause
Solution:
set hive.mapred.mode=nonstrict;
After that, running the non-equi join again returns the result:
aapl date  ibm date   aapl price ibm price
2010-01-04 2010-01-04 214.01 132.45
2010-01-04 2010-01-05 214.01 138.85
2010-01-05 2010-01-05 214.38 138.85
2010-01-04 2010-01-06 214.01 129.55
2010-01-05 2010-01-06 214.38 129.55
2010-01-06 2010-01-06 210.97 129.55
2010-01-04 2010-01-07 214.01 130.0
2010-01-05 2010-01-07 214.38 130.0
2010-01-06 2010-01-07 210.97 130.0
2010-01-07 2010-01-07 210.58 130.0
2010-01-04 2010-01-08 214.01 130.85
2010-01-05 2010-01-08 214.38 130.85
2010-01-06 2010-01-08 210.97 130.85
2010-01-07 2010-01-08 210.58 130.85
2010-01-08 2010-01-08 211.58 130.85
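To see why strict mode treats this query as a Cartesian product, here is a minimal Python sketch of the same join, offered as an illustration rather than a description of Hive's internals. With no equality key to bucket on, every aapl row has to be compared against every ibm row, and the condition a.ymd <= b.ymd only filters the pairs afterwards. The names STOCKS and non_equi_join are made up for this sketch.

```python
# Rows copied from the stocks table above as (ymd, price, company).
STOCKS = [
    ("2010-01-04", "214.01", "aapl"), ("2010-01-05", "214.38", "aapl"),
    ("2010-01-06", "210.97", "aapl"), ("2010-01-07", "210.58", "aapl"),
    ("2010-01-08", "211.58", "aapl"), ("2010-01-11", "210.11", "aapl"),
    ("2010-01-04", "132.45", "ibm"), ("2010-01-05", "138.85", "ibm"),
    ("2010-01-06", "129.55", "ibm"), ("2010-01-07", "130.0", "ibm"),
    ("2010-01-08", "130.85", "ibm"), ("2006-01-11", "121.48", "ibm"),
    ("2007-01-11", "120.48", "ibm"), ("2008-01-11", "123.48", "ibm"),
]


def non_equi_join(rows):
    """Return (matching rows, number of pairs compared) for a.ymd <= b.ymd."""
    aapl = [(ymd, price) for ymd, price, company in rows if company == "aapl"]
    ibm = [(ymd, price) for ymd, price, company in rows if company == "ibm"]

    result = []
    pairs_checked = 0
    # Nested loop: every aapl row meets every ibm row (the Cartesian product).
    for a_ymd, a_price in aapl:
        for b_ymd, b_price in ibm:
            pairs_checked += 1
            # Plain string comparison is enough for yyyy-MM-dd dates.
            if a_ymd <= b_ymd:
                result.append((a_ymd, b_ymd, a_price, b_price))
    return sorted(result), pairs_checked


NON_EQUI_RESULT, PAIRS_CHECKED = non_equi_join(STOCKS)
print(len(NON_EQUI_RESULT), "rows kept out of", PAIRS_CHECKED, "pairs compared")
for row in NON_EQUI_RESULT:
    print(*row)
```

On this sample data the loop compares 6 x 8 = 48 pairs to keep 15 rows. The number of comparisons grows with the product of the two row counts, which is the cost that strict mode guards against on large tables.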
