SAS Macro : Imputing Missing Data

Deepanshu Bhalla
Many modeling procedures drop observations that contain missing values, so it is important to treat missing data before building a predictive model. There are several ways to do this.

Common options for imputing missing values:
  1. Fill missing values with the mean of a continuous variable (real numeric values) when it has no outliers.
  2. Fill missing values with the median of a continuous variable (real numeric values) when it has outliers.
  3. Fill missing values with the median of an ordinal categorical variable.
  4. Fill missing values with the mode of a nominal categorical variable.
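To choose between the mean and the median for a continuous variable, it can help to look at the missing count and the spread first. A minimal sketch, assuming the dataset readin1 and the variables dbp and scl that appear in the macro call later in this post:

```sas
* Count missing values and compare the mean with the median ;
proc means data=readin1 n nmiss mean median min max;
var dbp scl;
run;
```

A mean that sits far from the median, or an extreme minimum or maximum, suggests outliers. In that case the median is usually the safer fill value.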
SAS Macro

The following macro fills in missing values with the mean, median or mode of each variable passed to it and saves the result in a new data set. Because it relies on PROC UNIVARIATE, the variables must be numeric.

/*****************************************************************
 Imputing Missing Data
 Input  : Specify your input dataset name (raw data).
 Stats  : Specify mean, median or mode for replacing missing data.
 Vars   : Specify the variables in which missing values exist.
          - Multiple variables should be separated by a space.
          - A list of variables can be referred to as var1-var25.
          - For all numeric variables, use the _numeric_ keyword.
 Output : Specify the dataset where you want the output to be saved.
*****************************************************************/

%macro replace (input= prac.file1,stats=median,vars=Q1-Q5,output=replaced);


* Generate analysis results ;
proc univariate data=&input noprint;
var &vars;
output out=dummy &stats= &vars;
run;

* Convert to vertical ;
proc transpose data=dummy out=dummy;
run;

* Replace missing values with the computed statistic ;
data &output;
set &input;
array vars &vars ;
do i = 1 to dim(vars);
* Row i of DUMMY holds the statistic for variable i of the array ;
set dummy(keep=col1) point=i ;
vars(i) = coalesce(vars(i), col1);
end;
drop col1 ;
run;

%mend replace;
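To try the macro without real data, a small test dataset can be created first. The values below are invented purely for illustration. Only the names readin1, dbp and scl come from the macro call that follows.

```sas
* Hypothetical test data: values are invented, a period marks a missing value ;
data readin1;
input id dbp scl;
datalines;
1 80 200
2 . 220
3 90 .
4 80 220
5 . 220
6 90 200
7 80 .
;
run;
```

In this toy data the most frequent non-missing value is 80 for dbp and 220 for scl, so a call with stats=mode should fill the gaps with those values.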

Options mprint nosymbolgen;
%replace (input= readin1,stats= mode,vars= dbp scl,output=replaced);
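After the macro runs, a quick check can confirm that the imputation worked. A minimal sketch, assuming the output dataset replaced and the variables dbp and scl from the call above:

```sas
* Check that no missing values remain in the imputed variables ;
proc means data=replaced n nmiss;
var dbp scl;
run;
```

If NMISS is still above zero for a variable, the requested statistic itself may have been missing, for example when every value of that variable is missing in the input.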
About Author:
Deepanshu Bhalla

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has over 10 years of experience in data science. During his tenure, he worked with global clients in various domains like Banking, Insurance, Private Equity, Telecom and HR.
