Suppose you have two data sets (tables), oldfile and newfile. You want to compare these two datasets and want to see the updated rows and common rows in both the tables.
I . Creating two data sets (tables) - oldfile and newfile
The rows that are highlighted in red are updated rows that do not exist in Old file.
The EXCEPT operator returns rows from the first query that are not part of the second query. It returns updated rows that are not found in the old file.
The INTERSECT operator returns common rows in both the tables.
Data Step Merge : Comparing two datasets
We can compare two datasets with data step merge statement. First we need to sort both the datasets by all the variables and then merge by _all_.
I . Creating two data sets (tables) - oldfile and newfile
data oldfile;II . Comparing two tables - Old File and New File
input id First$;
cards;
5463 Olsen
6574 Hogan
7896 Bridge
4352 Anson
5674 Leach
7902 Wilson
9786 Fabes
;
data newfile;
input id First$;
cards;
5463 Olsen
6574 Hogan
7896 Bridge
4352 Anson
5671 Leach
7900 Wilson
9786 Sampo
2112 Ramav
;
Comparing two datasets |
The EXCEPT operator returns rows from the first query that are not part of the second query. It returns updated rows that are not found in the old file.
/* Comparing two tables - Updated data*/Common Rows in both the tables
proc sql;
title "Updated Rows";
select * from newfile
except
select * from oldfile;
quit;
The INTERSECT operator returns common rows in both the tables.
proc sql;
title "Common Rows";
select * from newfile
intersect
select * from oldfile;
quit;
Data Step Merge : Comparing two datasets
We can compare two datasets with data step merge statement. First we need to sort both the datasets by all the variables and then merge by _all_.
proc sort data = oldfile;Updated Rows
by _all_;
run;
proc sort data = newfile;
by _all_;
run;
data merged;Common Rows
merge oldfile(in=a) newfile(in=b);
by _all_;
if a = 0;
run;
data merged;Related Article : Combine Tables Vertically with PROC SQL
merge oldfile(in=a) newfile(in=b);
by _all_;
if a and b;
run;
I also want to compare two datasets but want to output those attributes for which the values are different only.
ReplyDeletee.g i have two datasets
ReplyDeleteOne:-
a b c
1 2 3
2 3 4
3 4 5
Two:-
1 3 3
2 3 4
3 4 5
so in the output i want to show that b attribute in the 1st row have different value. which wil help me to identify the pattern of wrong output and will help me to work only on those attributes for which my output is not matching with the master data.
data One;
Deleteinput
a b c;
cards;
1 2 3
2 3 4
3 4 5
run;
data Two;
input
_a _b _c;
cards;
1 3 3
2 3 4
3 4 5
run;
data aj;
set one ;
set two;
run;
data dj(drop= a c _a _c);
set a;
where b ne _b;
run;
please send your suggestions as i need the solution very urgently. i am also working on that but not able to capture the erroneous attribute. Please email @pawan.pvs@gmail.com if someone finds something.
ReplyDeleteThanks a lot in advance!!!
Can you please post the results of the query as you have been doing in other sections? Thanks
ReplyDelete