Correcting Multicollinearity with R

Suppose you want to remove a multicollinearity problem from your regression model in R. As a rule of thumb used in this post, any variable with a variance inflation factor (VIF) above 2.5 is treated as collinear with the other predictors. The R script below drops the variable with the largest VIF, refits the model, and repeats until no remaining variable has a VIF above 2.5.
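To make the VIF itself concrete, here is a minimal sketch of how it can be computed by hand: regress each predictor on the other predictors and take 1 / (1 - R^2). The built-in mtcars data, the chosen columns, and the helper name manual_vif are assumptions for illustration only and are not part of the script in this post. For numeric predictors the values should agree with what vif() from the car package reports.

```r
# Illustration only: mtcars and these predictors are stand-ins, not the post's data
predictors <- c("disp", "hp", "wt", "drat")

# VIF for one predictor: regress it on the other predictors, then use 1 / (1 - R^2)
manual_vif <- function(target, predictors, data) {
  others <- setdiff(predictors, target)
  aux <- lm(reformulate(others, response = target), data = data)
  1 / (1 - summary(aux)$r.squared)
}

# One VIF per predictor, named after the predictor
vifs <- sapply(predictors, manual_vif, predictors = predictors, data = mtcars)
print(round(vifs, 2))
```

A VIF of 1 means the predictor is not linearly explained by the others at all; the larger the value, the more of its variance the other predictors account for.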

# Reading data from a saved R object (.rds file)
mydata = readRDS("logistic.rds")

# Checking number of rows and columns in data
dim(mydata)

# Loading required packages
library(car)
library(plyr)

# Set dependent variable as numeric
mydata$Ins = as.numeric(mydata$Ins)

# Fit a linear model to the data
fit=lm(Ins ~ AcctAge + DDA + DDABal + CashBk, data=mydata)

# Calculating VIF for each independent variable
vif(fit)

# Set a VIF threshold. All the variables with a VIF higher than the threshold
# are dropped from the model
threshold=2.5

# Sequentially drop the variable with the largest VIF until
# no variable has a VIF above the threshold
drop=TRUE

# aftervif collects one row of VIFs per iteration
aftervif=data.frame()
while(drop==TRUE) {
  vfit=vif(fit)
  aftervif=rbind.fill(aftervif,as.data.frame(t(vfit)))
  # vif() needs at least two predictors, so stop once only two are left
  if(max(vfit)>threshold && length(vfit)>2) {
    fit=update(fit,as.formula(paste(".","~",".","-",names(which.max(vfit)))))
  } else {
    drop=FALSE
  }
}

# Model after removing the high-VIF variables
print(fit)

# How the variables were removed sequentially
t_aftervif= as.data.frame(t(aftervif))
edit(t_aftervif)

# Final (retained) variables with their VIFs
vfit_d= as.data.frame(vfit)

# Exporting variables
write.csv (vfit_d, "C:\\Users\\Deepanshu Bhalla\\Desktop\\VIF.csv")
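The loop above can also be wrapped in a reusable function. The following is only a sketch under stated assumptions: it uses base R alone, computing each VIF as 1 / (1 - R^2) rather than calling vif() from car, it assumes all predictors are numeric, and the names vif_one and drop_high_vif plus the mtcars data are hypothetical stand-ins. Unlike the script, it can drop down to a single predictor, because it never needs a VIF for a one-variable model.

```r
# Hypothetical helpers; mtcars stands in for the post's data
vif_one <- function(target, predictors, data) {
  others <- setdiff(predictors, target)
  aux <- lm(reformulate(others, response = target), data = data)
  1 / (1 - summary(aux)$r.squared)
}

drop_high_vif <- function(predictors, data, threshold = 2.5) {
  dropped <- character(0)
  # VIF is undefined for a single predictor, so stop when one is left
  while (length(predictors) > 1) {
    vifs <- sapply(predictors, vif_one, predictors = predictors, data = data)
    if (max(vifs) <= threshold) break
    worst <- names(which.max(vifs))   # predictor with the largest VIF
    dropped <- c(dropped, worst)
    predictors <- setdiff(predictors, worst)
  }
  list(kept = predictors, dropped = dropped)
}

result <- drop_high_vif(c("disp", "hp", "wt", "drat", "qsec"), mtcars)

# Refit the model of interest on the retained predictors only
fit_kept <- lm(reformulate(result$kept, response = "mpg"), data = mtcars)
print(result)
```

Returning both the kept and the dropped names keeps the order of removal visible, much like the aftervif table does in the script.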

About Author:

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has close to 7 years of experience in data science and predictive modeling. During his tenure, he has worked with global clients in various domains like retail and commercial banking, Telecom, HR and Automotive.

