SAS 中的多表使用一般為多表之間的連接查詢
連接查詢分為:
(1)表內連接
(2)表與表之間的連接
這里我們只討論表與表之間的連接,而表與表之間的連接分為:
(1)全連接
這里的全連接就是基本的笛卡爾積,笛卡爾積通俗一點說就是每條數據的交叉相乘,使用full join
1 data temp; 2 input visit $ visit_dat $ age type $; 3 cards; 4 v1 20190201 18 a 5 v2 20200304 21 f 6 v3 20190825 34 e 7 v1 20180431 58 c 8 v2 20170902 23 d 9 v4 20160826 25 r 10 ; 11 run; 12 13 data temp2; 14 input visit $ visit_dat $ age type $; 15 cards; 16 v3 20190725 25 d 17 v4 20200431 35 e 18 v5 20190921 38 g 19 ; 20 run; 21 22 /* 創建表,含條件語句 */ 23 proc sql; 24 create table join_visit as 25 select 26 table1.visit as visit1, 27 table1.visit_dat as visit_dat1, 28 table1.age as age1, 29 table1.type as type1, 30 table2.visit as visit1, 31 table2.visit_dat as visit_dat2, 32 table2.age as age2, 33 table2.type as type2 34 from 35 work.temp as table1 36 full join 37 work.temp2 as table2 38 on 39 table1.visit = table2.visit; 40 run; 41 42 /* 打印創建的表 */ 43 proc print data=join_visit; 44 run;
結果為:
(2)左外連接
保留左邊的所有數據,對於右邊的,只保留符合條件的數據,使用 left join
1 data temp; 2 input visit $ visit_dat $ age type $; 3 cards; 4 v1 20190201 18 a 5 v2 20200304 21 f 6 v3 20190825 34 e 7 v1 20180431 58 c 8 v2 20170902 23 d 9 v4 20160826 25 r 10 ; 11 run; 12 13 data temp2; 14 input visit $ visit_dat $ age type $; 15 cards; 16 v3 20190725 25 d 17 v4 20200431 35 e 18 v5 20190921 38 g 19 ; 20 run; 21 22 /* 創建表,含條件語句 */ 23 proc sql; 24 create table join_visit as 25 select 26 table1.visit as visit1, 27 table1.visit_dat as visit_dat1, 28 table1.age as age1, 29 table1.type as type1, 30 table2.visit as visit1, 31 table2.visit_dat as visit_dat2, 32 table2.age as age2, 33 table2.type as type2 34 from 35 work.temp as table1 36 left join 37 work.temp2 as table2 38 on 39 table1.visit = table2.visit; 40 run; 41 42 /* 打印創建的表 */ 43 proc print data=join_visit; 44 run;
(3)右外連接
保留右邊的所有數據,對於左邊的,只保留符合條件的數據,使用 right join
data temp; input visit $ visit_dat $ age type $; cards; v1 20190201 18 a v2 20200304 21 f v3 20190825 34 e v1 20180431 58 c v2 20170902 23 d v4 20160826 25 r ; run; data temp2; input visit $ visit_dat $ age type $; cards; v3 20190725 25 d v4 20200431 35 e v5 20190921 38 g ; run; /* 創建表,含條件語句 */ proc sql; create table join_visit as select table1.visit as visit1, table1.visit_dat as visit_dat1, table1.age as age1, table1.type as type1, table2.visit as visit1, table2.visit_dat as visit_dat2, table2.age as age2, table2.type as type2 from work.temp as table1 right join work.temp2 as table2 on table1.visit = table2.visit; run; /* 打印創建的表 */ proc print data=join_visit; run;
結果為:
注意:
要注意的是,多表連接使用的時候,重要的不是語句的使用,而是選擇哪一個連接,才是最重要的。