Missing Value Imputation with the mice Package in R

Deepanshu Bhalla
In R, the mice package can impute missing values in mixed data, that is, data that contains continuous, binary and categorical variables.

Imputation Methods by Variable Type
  1. For Continuous Data - Predictive mean matching, Bayesian linear regression, Linear regression ignoring model error, Unconditional mean imputation, etc.
  2. For Binary Data - Logistic regression, Logistic regression with bootstrap
  3. For Categorical Data (more than 2 categories) - Polytomous logistic regression, Proportional odds model, etc.
  4. For Mixed Data (works for both continuous and categorical variables) - CART, Random Forest, Sample (random sample from the observed values)
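Each method in the list above is selected by a short name passed to the method argument of mice. The sketch below is an illustration only: the data frame dat and its columns a, b and c are made up, all three columns are continuous, and the names "norm" (Bayesian linear regression), "mean" (unconditional mean imputation) and "cart" are the method names as documented by the package.

```r
library(mice)

# Hypothetical continuous data: three correlated columns with some values removed.
set.seed(42)
n <- 100
x <- rnorm(n)
dat <- data.frame(a = x + rnorm(n), b = 2 * x + rnorm(n), c = -x + rnorm(n))
dat$a[1:10]  <- NA
dat$b[11:20] <- NA
dat$c[21:30] <- NA

# One method name per column, in column order.
meth <- c(a = "norm", b = "mean", c = "cart")

# m and maxit are kept small so the sketch runs quickly.
imp <- mice(dat, method = meth, m = 2, maxit = 3, printFlag = FALSE, seed = 42)

imp$method               # the method actually used for each column
filled <- complete(imp)  # first completed dataset
colSums(is.na(filled))   # remaining missing values per column
```

Passing a single name, such as method = "cart", applies that method to every incomplete column instead.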
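library(mice)
# Introduce missing values into two columns of the built-in anscombe data
anscombe <- within(anscombe, {
  y1[1:3] <- NA
  y4[3:5] <- NA
})
imp = mice(anscombe)    # multiple imputation with default settings
imp1 = complete(imp)    # extract a completed dataset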
Important Points:
  1. By default, the "mice" function creates five imputed datasets (m = 5).
  2. The "complete" function returns the data with the missing values filled in. By default, it uses the first of the imputed datasets.
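The two points above can be checked directly on the fitted object. This is a minimal sketch, assuming the same anscombe setup as in the example; the object names dat, first, third and long are only illustrative, and the seed is set so that the run is repeatable.

```r
library(mice)

# Same setup as the example: blank out a few values in the anscombe data.
dat <- within(anscombe, {
  y1[1:3] <- NA
  y4[3:5] <- NA
})

imp <- mice(dat, printFlag = FALSE, seed = 123)

imp$m                          # number of imputed datasets created

first <- complete(imp)         # same as complete(imp, 1)
third <- complete(imp, 3)      # the third imputed dataset
long  <- complete(imp, "long") # all imputed datasets stacked, one below the other

nrow(long) == imp$m * nrow(dat)
```

The stacked "long" form is convenient when you want to compare the imputed values across datasets rather than rely on a single one.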

Custom mice function
imp = mice(anscombe, m=1)    # create a single imputed dataset instead of five
imp1 = complete(imp, 1)      # return that dataset
Default settings in the mice package

If the method option is not specified (as in the examples above), mice checks the type of each variable and applies the default imputation method for that type:
  1. Predictive mean matching (continuous data)
  2. Logistic regression imputation (binary data, factor with 2 levels)
  3. Polytomous regression imputation for unordered categorical data (factor with more than 2 levels)
  4. Proportional odds model (ordered factor with more than 2 levels)
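To see which default would be picked for each column without running an imputation, the make.method helper can be used. This sketch assumes mice 3.0 or later (where make.method is available); the data frame dat and its column names are invented for illustration.

```r
library(mice)

# Hypothetical mixed data: one column of each type covered by the defaults.
set.seed(1)
n <- 60
dat <- data.frame(
  income = rnorm(n),
  smoker = factor(sample(c("no", "yes"), n, replace = TRUE),
                  levels = c("no", "yes")),
  region = factor(sample(c("north", "south", "west"), n, replace = TRUE),
                  levels = c("north", "south", "west")),
  grade  = factor(sample(c("low", "mid", "high"), n, replace = TRUE),
                  levels = c("low", "mid", "high"), ordered = TRUE)
)

# Every column needs at least one missing value, otherwise no method is assigned.
dat$income[1:5]   <- NA
dat$smoker[6:10]  <- NA
dat$region[11:15] <- NA
dat$grade[16:20]  <- NA

# Default method chosen for each column, based on its type.
meth <- make.method(dat)
print(meth)
```

The returned vector can be edited by name (for example meth["income"] <- "cart") and then passed to mice through the method argument.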

CART : Imputation Algorithm
imp = mice(anscombe, meth = "cart", minbucket = 5)
imp1 = complete(imp)
Random Forest : Imputation Algorithm

Simulations by Shah (Feb 13, 2014) suggested that the quality of the imputation for 10 and 100 trees was identical, so mice 2.22 changed the default number of trees from ntree = 100 to ntree = 10.
imp = mice(anscombe, meth = "rf", ntree = 10)
imp1 = complete(imp)
Important Note: The minbucket and ntree arguments can be omitted from the code above. The package then uses its default values, which are the same values shown here (minbucket = 5 and ntree = 10).
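If you want to confirm the defaults in the version of mice you have installed, one option is to inspect the arguments of the underlying imputation functions. This is a small sketch that assumes mice.impute.cart and mice.impute.rf are exported by your version of the package, as they are in current releases.

```r
library(mice)

# Default minimum number of observations in a terminal node for CART.
cart_minbucket <- formals(mice.impute.cart)$minbucket
cart_minbucket

# Default number of trees for random forest imputation.
rf_ntree <- formals(mice.impute.rf)$ntree
rf_ntree
```

With these defaults in place, mice(anscombe, meth = "cart") and mice(anscombe, meth = "rf") request the same settings as the longer calls above.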