This tutorial explains how to combine / append data sets vertically with PROC SQL. Suppose you have two data sets and we need to combine these two datasets vertically. For example, if a dataset A contains 10 records and dataset B contains 10 records. I want combined dataset would contain 20 records.
Create data sets in SAS
data dat1;
input x y;
cards;
1 6
1 6
1 7
6 4
7 6
8 7
;
run;
data dat2;
input x z;
cards;
1 5
4 2
3 4
6 4
6 5
5 8
;
run;
 |
Input Data Sets |
1. UNION Operator
It displays all rows from both the tables and removes duplicate records from the combined dataset. By adding
ALL keyword, it allows
duplicate rows in the combined dataset.
Important Point
UNION is performed by position not by column name. Hence, common columns in each SELECT statement should be in the same order. If CORR keyword is included, PROC SQL matches the columns by name.
ALL Keyword
ALL keyword allows duplicates in the concatenated dataset.
CORR Keyword
CORR keyword tells SAS to match the columns in table by name and not by position. Columns that do not match by name are excluded from the result table, except for the OUTER UNION operator
 |
UNION Operator |
proc sql;
create table out7 as
select *
from dat1
UNION
select *
from dat2;
quit;
proc sql;
create table out8 as
select *
from dat1
UNION ALL
select *
from dat2;
quit;
proc sql;
create table out9 as
select *
from dat1
UNION CORR
select *
from dat2;
quit;
2. OUTER UNION CORR
It appends (concatenates) two tables. It is equivalent to
SET statement in Data Step. It allows duplicates in the concatenated table.
The ALL keyword is not required with OUTER UNION.
proc sql;
create table out10 as
select *
from dat1
OUTER UNION CORR
select *
from dat2;
quit;
 |
OUTER UNION CORR |
3. Except Operator
It returns
unique rows from the first query that are not found in the second query.
(Non matched Rows). It
removes duplicate records (where all columns in the results are the same) - row 2nd in table1.
 |
Except Operator |
proc sql;
create table out1 as
select *
from dat1
EXCEPT
select *
from dat2;
quit;
Except ALL
It allows duplicate records in the combined dataset and does not remove duplicates.
proc sql;
create table out2 as
select *
from dat1
EXCEPT ALL
select *
from dat2;
quit;
 |
Except ALL |
Except CORR
It displays only columns that have the same name (or
common) in both the tables.
 |
Except CORR |
It returns all unique rows in the first table (
based on the common column) that do not appear in the second table.
proc sql;
create table out3 as
select *
from dat1
EXCEPT CORR
select *
from dat2;
quit;
Except ALL CORR
 |
Except ALL Corr |
4. INTERSECT Operator
It selects unique rows that are common to both the tables.
proc sql;
create table out5 as
select *
from dat1
INTERSECT
select *
from dat2;
quit;
 |
SQL : Intersect Operator |
About Author:
Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has over 10 years of experience in data science. During his tenure, he worked with global clients in various domains like Banking, Insurance, Private Equity, Telecom and HR.
Thanks for providing easy to understand examples and language which helps t grab complex things much easier.
ReplyDeleteGlad you found it helpful. Cheers!
DeleteIt's reallllllly easy to understand
ReplyDeleteThank you for stopping by my blog.
DeleteAppreciate your work
ReplyDeleteIts very good and understandable....
ReplyDeleteThanks a lot :)
ReplyDeleteHi sir,
ReplyDeleteAre you clinical sas programmer?
This comment has been removed by the author.
DeleteThank you for the easy understand example and explanation. It is very good for learners to go through
ReplyDeleteHi
ReplyDeleteCan you please explain the difference between inner join and intersect with same example.
inner join
Deleteinner join is a equi join and we can use non equi join also. when condition is true , matched records are displayed
intersect is used to derived common records in both table
A table;
id
1
2
3
4
B table;
id
1
3
4
inner join :
select * from a, b where a.id=b.id;
select * from a inner join b on a.id=b.id;
intersect
select id from a intersect select id from b;
1
3
4
can you explain CORR with some brief example
ReplyDeleteTHANKS A TON FOR MAKING IT CLEAR AND EASY TO LEARN
ReplyDeleteThanks for sharing it with us! Can you please guide me how to append multiple files using a sql function at once other than typing "Union all corr select * from C"? Thank you.
ReplyDeletevery usefull, thank you
ReplyDeleteI have doubt that, in the example datasets variable names are different then how it is combining them ?
ReplyDeletedoes while combining it only sees the row values and not variable names ?