SAS Macro : Imputing Missing Data

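When building a predictive model, it is important to handle missing data, because many modeling procedures drop any observation that has a missing value in one of the model variables. There are several ways to treat missing data.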

Common options for imputing missing values:
  1. Fill missing values with the mean of a continuous variable (real numeric values) when it has NO outliers.
  2. Fill missing values with the median of a continuous variable (real numeric values) when it has outliers.
  3. Fill missing values with the median of an ordinal categorical variable.
  4. Fill missing values with the mode of a nominal categorical variable.
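
As a rough illustration of options 1 and 2, the sketch below compares the mean and median on a small made-up data set and then fills the gaps with PROC STDIZE (part of SAS/STAT). The data set names work.demo, work.demo_median and work.demo_mean and the variables x and y are hypothetical, chosen only for this example.

```sas
/* Hypothetical toy data: x contains one extreme value (500), y does not */
data work.demo;
  input id x y;
  datalines;
1 10 5
2 12 .
3 . 7
4 11 6
5 500 8
;
run;

/* Count missing values and compare mean with median for each variable */
proc means data=work.demo n nmiss mean median;
  var x y;
run;

/* x has an outlier: replace only its missing values (REPONLY) with the median */
proc stdize data=work.demo out=work.demo_median reponly method=median;
  var x;
run;

/* y has no extreme values: replace only its missing values with the mean */
proc stdize data=work.demo out=work.demo_mean reponly method=mean;
  var y;
run;
```

The REPONLY option tells PROC STDIZE to replace missing values without standardizing the non-missing ones. In the PROC MEANS output, the mean of x is pulled far above its median by the single value 500, which is why the median is the safer fill for that variable.
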
SAS Macro

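The following macro fills in missing data with the mean, median or mode of each variable passed to it and saves the result in a new data set. Because it relies on PROC UNIVARIATE, it works on numeric variables only, so categorical variables must be numerically coded.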

*****************************************************************/;
************* Imputing Missing Data **************************/;
*****************************************************************/;
*Input : Specify your input dataset name (raw data).
*Stats : Specify mean, median or mode for replacing missing data.
*Vars : Specify the numeric variables in which missing values exist.
- Multiple variables should be separated by a space.
- The list of variables can be referred as var1-var25.
- For all numeric variables, use _numeric_ keyword.
*Output : Specify dataset where you want output file to be saved.
/****************************************************************/;

%macro replace (input= prac.file1,stats=median,vars=Q1-Q5,output=replaced);


* Generate analysis results ;
proc univariate data=&input noprint;
var &vars;
output out=dummy &stats= &vars;
run;

* Convert to vertical ;
proc transpose data=dummy out=dummy;
run;

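* Replace missing with analysis results ;
data &output;
set &input;
array vars &vars ;
* Row i of dummy holds the statistic for the i-th variable ;
do i = 1 to dim(vars);
set dummy(keep=col1) point=i;
vars(i) = coalesce(vars(i), col1);
end;
* Remove the helper variables from the output data set ;
drop i col1;
run;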

%mend;

Options mprint nosymbolgen;
%replace (input= readin1,stats= mode,vars= dbp scl,output=replaced);
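
To see what each step of the macro produces, the sketch below runs the same three steps by hand on a small made-up data set. The names work.toy, work.stats, work.stats_long and work.toy_filled, and the variables q1 and q2, are hypothetical and used only for illustration.

```sas
/* Hypothetical toy data: q1 and q2 each have one missing value */
data work.toy;
  input q1 q2;
  datalines;
1 10
. 20
3 .
5 40
;
run;

/* Step 1: one-row data set holding the median of each variable */
proc univariate data=work.toy noprint;
  var q1 q2;
  output out=work.stats median=q1 q2;
run;

/* Step 2: one row per variable, with the statistic stored in COL1 */
proc transpose data=work.stats out=work.stats_long;
run;

/* Step 3: read row i of the statistics for the i-th array element */
data work.toy_filled;
  set work.toy;
  array v q1 q2;
  do i = 1 to dim(v);
    set work.stats_long(keep=col1) point=i;
    v(i) = coalesce(v(i), col1);
  end;
  drop i col1;
run;

/* Check: NMISS should be 0 for both variables */
proc means data=work.toy_filled n nmiss;
  var q1 q2;
run;
```

Here work.stats_long has two rows, with COL1 equal to 3 for q1 and 20 for q2, so the missing q1 becomes 3 and the missing q2 becomes 20. The lookup depends on the rows of the transposed data set being in the same order as the variables in the array.
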
About Author:

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has over 8 years of experience in data science. During his tenure, he has worked with global clients in various domains like Banking, Insurance, Telecom and Human Resource.
