This tutorial explains how to append datasets using PROC SQL in SAS, along with examples.
The following SAS datasets will be used to explain examples in this tutorial.
data dat1; input x y; cards; 1 6 1 6 1 7 6 4 7 6 8 7 ; run; data dat2; input x z; cards; 1 5 4 2 3 4 6 4 6 5 5 8 ; run;
It displays all rows from both the tables and removes duplicate records from the combined dataset.
- UNION is performed by position not by column name. Hence, common columns in each SELECT statement must be in the same order.
- If CORR keyword is added after UNION, PROC SQL matches the columns by name. Columns that do not match by name are excluded from the result table, except for the OUTER UNION operator.
- ALL keyword allows duplicates in the concatenated dataset.
proc sql; create table out7 as select * from dat1 UNION select * from dat2; quit;
proc sql; create table out8 as select * from dat1 UNION ALL select * from dat2; quit;
proc sql; create table out9 as select * from dat1 UNION CORR select * from dat2; quit;
It appends (concatenates) two tables. It is equivalent to SET statement in Data Step. It allows duplicates in the concatenated table. The ALL keyword is not required with OUTER UNION.
proc sql; create table out10 as select * from dat1 OUTER UNION CORR select * from dat2; quit;
It returns unique rows from the first table that are not found in the second table. (Non-matched Rows). It removes duplicate records (where all columns in the results are the same) - row 2nd in table1.
proc sql; create table out1 as select * from dat1 EXCEPT select * from dat2; quit;
It allows duplicate records in the combined dataset. In simple words, it does not remove duplicates.
proc sql; create table out2 as select * from dat1 EXCEPT ALL select * from dat2; quit;
It displays only columns that have the same name in both the tables.
It returns all unique rows in the first table that do not appear in the second table.
proc sql; create table out3 as select * from dat1 EXCEPT CORR select * from dat2; quit;
The ALL keyword means that SQL will keep all the duplicated rows.
proc sql; create table out3 as select * from dat1 EXCEPT ALL CORR select * from dat2; quit;
It selects unique rows that are common to both the tables.
proc sql; create table out5 as select * from dat1 INTERSECT select * from dat2; quit;
Thanks for providing easy to understand examples and language which helps t grab complex things much easier.
ReplyDeleteGlad you found it helpful. Cheers!
DeleteIt's reallllllly easy to understand
ReplyDeleteThank you for stopping by my blog.
DeleteAppreciate your work
ReplyDeleteIts very good and understandable....
ReplyDeleteThanks a lot :)
ReplyDeleteHi sir,
ReplyDeleteAre you clinical sas programmer?
Thank you for the easy understand example and explanation. It is very good for learners to go through
ReplyDeleteHi
ReplyDeleteCan you please explain the difference between inner join and intersect with same example.
inner join
Deleteinner join is a equi join and we can use non equi join also. when condition is true , matched records are displayed
intersect is used to derived common records in both table
A table;
id
1
2
3
4
B table;
id
1
3
4
inner join :
select * from a, b where a.id=b.id;
select * from a inner join b on a.id=b.id;
intersect
select id from a intersect select id from b;
1
3
4
can you explain CORR with some brief example
ReplyDeleteTHANKS A TON FOR MAKING IT CLEAR AND EASY TO LEARN
ReplyDeleteThanks for sharing it with us! Can you please guide me how to append multiple files using a sql function at once other than typing "Union all corr select * from C"? Thank you.
ReplyDeletevery usefull, thank you
ReplyDeleteI have doubt that, in the example datasets variable names are different then how it is combining them ?
ReplyDeletedoes while combining it only sees the row values and not variable names ?