Hive non-equi joins: setting Hive to nonstrict mode


1 Data preparation

create table stocks(id int, ymd string, price string, company string);

insert into table stocks values 
(1,'2010-01-04','214.01','aapl'),
(2,'2010-01-05','214.38','aapl'),
(3,'2010-01-06','210.97','aapl'),
(4,'2010-01-07','210.58','aapl'),
(5,'2010-01-08','211.58','aapl'),
(6,'2010-01-11','210.11','aapl'),
(7,'2010-01-04','132.45','ibm'),
(8,'2010-01-05','138.85','ibm'),
(9,'2010-01-06','129.55','ibm'),
(10,'2010-01-07','130.0','ibm'),
(11,'2010-01-08','130.85','ibm'),
(12,'2006-01-11','121.48','ibm'),
(13,'2007-01-11','120.48','ibm'),
(14,'2008-01-11','123.48','ibm');
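
Before running the joins, it may be worth confirming that the rows loaded as expected. A minimal sketch, assuming the insert above succeeded and the table contains no other rows:

```sql
-- Count the rows per company in the test table.
-- With the 14 rows inserted above, this should report aapl = 6 and ibm = 8.
select company, count(*) as cnt
from stocks
group by company;
```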

2 Testing an equi-join via a self-join of the table

select a.ymd, a.price, b.price
from
	stocks a 
inner join
	stocks b
on a.ymd = b.ymd
where 
	a.company = 'aapl' and b.company = 'ibm';

Result:

2010-01-04	214.01	132.45
2010-01-05	214.38	138.85
2010-01-06	210.97	129.55
2010-01-07	210.58	130.0
2010-01-08	211.58	130.85

3 Testing a non-equi join via a self-join of the table

select a.ymd,b.ymd, a.price, b.price
from
	stocks a
inner join 
	stocks b
on a.ymd <= b.ymd
where a.company = 'aapl' and b.company = 'ibm'
order by a.ymd asc;

This fails with the following error:

FAILED: SemanticException Cartesian products are disabled for safety reasons. 
If you know what you are doing, please sethive.strict.checks.cartesian.product to false and that hive.mapred.mode is not set to 'strict' to proceed. 
Note that if you may get errors or incorrect results if you make a mistake while using some of the unsafe features.
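
Hive is currently running in strict mode. In this mode:
- Cartesian product joins between tables are not allowed
- an order by clause must include a limit: order by runs in a single reducer and easily becomes a performance bottleneck
- queries on partitioned tables must filter on a partition column in the where clause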
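
Before changing anything, it can help to see which mode the session is actually in. A minimal sketch, assuming the Hive CLI or Beeline, where `set` followed by a property name and no value prints the current setting. The second property name is taken from the error message above and may not exist on older Hive versions:

```sql
-- Print the current value of the strict/nonstrict switch.
set hive.mapred.mode;
-- Print the finer-grained Cartesian product check named in the error message
-- (assumption: this property exists in the Hive version in use).
set hive.strict.checks.cartesian.product;
```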

Fix:

set hive.mapred.mode=nonstrict;
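
The error message also names a narrower switch. As a sketch only, assuming a Hive version that has the `hive.strict.checks.cartesian.product` property (the name comes from the error text, and this variant was not tested in the original post), the Cartesian product check alone could be turned off for the current session:

```sql
-- Session-level setting; it does not modify hive-site.xml.
-- Per the error message, hive.mapred.mode must also not be 'strict'.
set hive.strict.checks.cartesian.product=false;
```

Like the statement above, this applies only to the current session, so it has to be issued again in each new session.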

After that, running the non-equi join again returns the result:

aapl date	ibm date	aapl price	ibm price

2010-01-04	2010-01-04	214.01	132.45
2010-01-04	2010-01-05	214.01	138.85
2010-01-05	2010-01-05	214.38	138.85
2010-01-04	2010-01-06	214.01	129.55
2010-01-05	2010-01-06	214.38	129.55
2010-01-06	2010-01-06	210.97	129.55
2010-01-04	2010-01-07	214.01	130.0
2010-01-05	2010-01-07	214.38	130.0
2010-01-06	2010-01-07	210.97	130.0
2010-01-07	2010-01-07	210.58	130.0
2010-01-04	2010-01-08	214.01	130.85
2010-01-05	2010-01-08	214.38	130.85
2010-01-06	2010-01-08	210.97	130.85
2010-01-07	2010-01-08	210.58	130.85
2010-01-08	2010-01-08	211.58	130.85

