SAS Macro : Imputing Missing Data


When building a predictive model, it is important to impute missing data. There are several ways to treat missing data.

The following options are commonly used to impute missing values:
  1. Fill missing values with the mean of a continuous (real-valued numeric) variable that has no outliers.
  2. Fill missing values with the median of a continuous (real-valued numeric) variable that has outliers.
  3. Fill missing values with the median of an ordinal categorical variable.
  4. Fill missing values with the mode of a nominal categorical variable.
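
Before picking one of these options, it can help to see how many values are missing and whether the mean and median of a variable differ much. The following is only a minimal sketch: the dataset name readin1 and the variables dbp and scl are assumed here for illustration (they match the names used in the example macro call in this post), so substitute your own.

```sas
* Count missing values and compare the mean with the median ;
* readin1, dbp and scl are assumed names, replace with your own ;
proc means data=readin1 n nmiss mean median min max;
var dbp scl;
run;
```

A large gap between the mean and the median, or extreme minimum and maximum values, hints at skew or outliers, in which case the median is usually the safer replacement.
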
SAS Macro

The following macro replaces missing values with the mean, median or mode of each variable listed in the macro call and saves the result to a new dataset.

*****************************************************************/;
************* Imputing Missing Data **************************/;
*****************************************************************/;
*Input : Specify your input dataset name (raw data).
*Stats : Specify mean, median or mode for replacing missing data.
*Vars : Specify the variables in which missing values exist.
- Multiple variables should be separated by a space.
- A numbered range of variables can be written as var1-var25.
- For all numeric variables, use the _numeric_ keyword.
*Output : Specify the dataset where you want the output to be saved.
/****************************************************************/;

%macro replace (input= prac.file1,stats=median,vars=Q1-Q5,output=replaced);


* Generate analysis results ;
proc univariate data=&input noprint;
var &vars;
output out=dummy &stats= &vars;
run;

* Convert to vertical ;
proc transpose data=dummy out=dummy;
run;

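* Replace missing with analysis results ;
data &output;
set &input;
array vars &vars ;
do i =1 to dim(vars);
* Row i of dummy holds the statistic for the i-th variable ;
set dummy(keep=col1) point= i ;
* Keep the original value if present, else use the statistic ;
vars(i)=coalesce(vars(i),col1);
end;
* Drop the helper variables so they do not appear in the output ;
drop i col1 ;
run;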

%mend;

Options mprint nosymbolgen;
%replace (input= readin1,stats= mode,vars= dbp scl,output=replaced);
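
To check the result, one option is to run the macro on a small test dataset and confirm that no missing values remain. The sketch below assumes the %replace macro above has already been compiled in the current session. The dataset readin1 and its values are made up purely for illustration.

```sas
* Hypothetical test data, the values are made up ;
data readin1;
input id dbp scl;
datalines;
1 80 200
2 . 220
3 90 .
4 80 220
5 85 .
;
run;

* Impute with the median and save the result as work.replaced ;
%replace (input=readin1, stats=median, vars=dbp scl, output=replaced);

* NMISS should now be 0 for both imputed variables ;
proc means data=replaced n nmiss;
var dbp scl;
run;
```

The input dataset is left unchanged, so the same PROC MEANS step run on readin1 gives a before-and-after comparison.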