In this tutorial, we will cover how to use PROC COMPARE in SAS, along with examples.
Introduction: PROC COMPARE
PROC COMPARE in SAS is used to compare the contents and structure of two datasets. It returns a summary of both the similarities and differences found between two datasets.
Syntax of PROC COMPARE
The syntax of PROC COMPARE is as follows:
proc compare base = data1 compare = data2; run;
This compares the datasets data1 and data2 and displays the differences between them. By default, PROC COMPARE compares all variables that have the same name in both datasets.
Let's compare two built-in SAS datasets: sashelp.class and sashelp.classfit.
proc compare base = sashelp.class compare = sashelp.classfit; run;
The Data Set Summary section compares the structure of the two datasets and reports the following for each:
- Dataset Creation Dates
- Dataset Modification Dates
- Number of Variables
- Number of Observations
- Labels
The Variable Summary section shows how many variables are common to both datasets and how many are in one dataset but not in the other.
The Observation Summary section shows how many observations the two datasets have in common and, of those, how many have some compared variables unequal and how many have all compared variables equal.
The Values Comparison Summary section shows how many of the compared variables have all values exactly equal and how many contain some unequal values.
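To see these sections report something on data small enough to check by eye, here is a minimal sketch. The datasets work.scores_old and work.scores_new and their contents are hypothetical, made up purely for illustration. The score variable differs in the second observation and the grade variable exists only in the comparison dataset, so the Variable Summary, Observation Summary and Values Comparison Summary each have something to report.

```sas
/* Hypothetical base dataset: three students */
data work.scores_old;
    input name $ age score;
    datalines;
Alice 13 85
Bob 14 90
Carol 12 78
;
run;

/* Hypothetical comparison dataset: Bob's score is changed
   and an extra variable (grade) is added */
data work.scores_new;
    input name $ age score grade $;
    datalines;
Alice 13 85 B
Bob 14 92 A
Carol 12 78 C
;
run;

/* No ID statement is used, so observations are matched by position */
proc compare base = work.scores_old compare = work.scores_new;
run;
```

Because no ID statement is given, the first observation of one dataset is compared with the first observation of the other, and so on. This is why both hypothetical datasets list the students in the same order.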
How to Compare Specific Variables
You can use the VAR statement in PROC COMPARE to restrict the comparison to specific variables. Note that the initial summary of the datasets and variables remains largely unchanged, so focus on the Value Comparison Results. In the code below, we compare only the "name" variable in both datasets.
proc compare base = sashelp.class compare = sashelp.classfit; var name; run;
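The VAR statement also accepts a list of variables. The sketch below assumes that age, height and weight exist in both sashelp.class and sashelp.classfit, as they do in the standard SASHELP library. If your installation differs, substitute variables that are present in both datasets.

```sas
/* Compare only the listed variables; other common variables are ignored */
proc compare base = sashelp.class compare = sashelp.classfit;
    var age height weight;
run;
```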
How to Compare Only Structure of Datasets
The NOVALUES option in PROC COMPARE suppresses the report of value comparison results. In short, the output concentrates on the similarities and differences in the datasets' structure and variables rather than on individual values. The LISTVAR option lists the variables that are in one dataset but not in the other.
proc compare base = sashelp.class compare = sashelp.classfit novalues listvar; run;