This tutorial explains how to use various forms of UNION in PROC SQL in SAS, along with examples.
Create Sample Datasets
The following SAS program creates datasets that will be used to explain examples in this tutorial.
data datasetA; input product $ sales; cards; A 60 A 65 B 36 C 78 D 91 ; run; data datasetB; input product $ profit; cards; A 62 A 62 D 23 E 25 F 32 G 41 ; run;
UNION Operator
The UNION operator within PROC SQL is used to combine datasets by rows in SAS. It does not allow duplicate records in the concatenated dataset.
proc sql; create table newData as select * from datasetA UNION select * from datasetB; quit;
UNION operator combines datasets vertically based on column position rather than column name.
UNION ALL
The ALL keyword keeps the duplicate rows. It is also efficient than the UNION operator as it does not execute the step of removing duplicates.
proc sql; create table newData as select * from datasetA UNION ALL select * from datasetB; quit;
You may have noticed that UNION ALL returned 11 rows in the concatenated dataset, whereas the UNION operator returned 10 rows after removing duplicates.
UNION CORR
The CORR keyword tells SAS to append data based on column name rather than column position.
proc sql; create table newData as select * from datasetA UNION CORR select * from datasetB; quit;
You may have noticed that UNION CORR returned only one column as column 'Product' was the only common column in both the datasets.
OUTER UNION CORR
The OUTER UNION CORR combines data vertically based on column names. It returns all columns from both datasets that includes matched and non-matched columns.
proc sql; create table newData as select * from datasetA OUTER UNION CORR select * from datasetB; quit;
OUTER UNION CORR returned all the columns from both the datasets in the combined dataset.
Share Share Tweet