Predicting Transformed Dependent Variable

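In linear regression models, we often transform the dependent variable to address heteroscedasticity, non-normality of errors, or a non-linear relationship between the dependent and independent variables.

A model fitted this way predicts on the transformed scale. To get the 'real' predicted value in the original units, we need to perform a 'back transformation', i.e. apply the inverse of the transformation to the prediction.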

Natural Log (base e) Transformation - The back transformation is to raise e to the power of the number. If the mean of your base-e log-transformed data is 2.65, the back-transformed value is exp(2.65) = 14.154. Note that this is the geometric mean of the original data, not its arithmetic mean.
Log base 10 Transformation - The back transformation is to raise 10 to the power of the number. If the mean of your base-10 log-transformed data is 2.65, the back-transformed value is 10^(2.65) = 446.68 (again a geometric mean).
Square Root Transformation - The back transformation is to square the number.
Inverse (Reciprocal) Transformation - The back transformation is to take the reciprocal of the number (1/x).
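
The four back transformations listed above can be sketched in a few lines of R. The input values below are made up purely for illustration (2.65 is reused from the text; 4 and 0.25 are assumptions), and the variable names are hypothetical.

```r
# Illustrative values on each transformed scale (assumed, not from real data)
log_e_value <- 2.65   # a value on the natural-log scale
log10_value <- 2.65   # a value on the log base 10 scale
sqrt_value  <- 4      # a value on the square-root scale
recip_value <- 0.25   # a value on the reciprocal scale

# Apply the inverse of each transformation
back_ln    <- exp(log_e_value)   # inverse of log()
back_log10 <- 10^log10_value     # inverse of log10()
back_sqrt  <- sqrt_value^2       # inverse of sqrt()
back_recip <- 1 / recip_value    # inverse of 1/x

print(round(c(ln = back_ln, log10 = back_log10,
              sqrt = back_sqrt, reciprocal = back_recip), 3))
```

Each line simply undoes the corresponding transformation, so the same expressions can be applied to a whole vector of predictions.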

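R Code: Back Transformation

# Build a linear regression model on the natural log of mpg
# (mtcars ships with base R, so no extra packages are needed here)
fit <- lm(log(mpg) ~ ., data = mtcars)

# Predictions are on the log scale
pred <- predict(fit, mtcars)

# Compare actual and predicted values, both on the log scale
head(cbind(actual_log = log(mtcars$mpg), pred_log = pred))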

Natural Log Transformed Data - Back Transformation

head(cbind(actual=mtcars$mpg,pred=exp(pred)))
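
The same idea carries over to rows the model has not seen: predict on the new data first, then back-transform. The sketch below is only an illustration under assumptions that are not in the original post: the predictors (wt and hp), the split (first 24 rows for fitting, last 8 as "new" data), and the names train, new_data, fit_train and pred_mpg are all hypothetical choices.

```r
# Assumed split: first 24 rows for fitting, last 8 rows treated as new data
train    <- mtcars[1:24, ]
new_data <- mtcars[25:32, ]

# Fit on the log scale using two predictors (an arbitrary choice for this sketch)
fit_train <- lm(log(mpg) ~ wt + hp, data = train)

# Predict on the new rows (log scale), then back-transform with exp()
pred_log_scale <- predict(fit_train, newdata = new_data)
pred_mpg       <- exp(pred_log_scale)

# Actual vs back-transformed predictions, both in miles per gallon
print(cbind(actual = new_data$mpg, pred = round(pred_mpg, 2)))
```

Because exp() is applied after predict(), the back-transformed values trace a curve rather than a straight line when plotted against a predictor on the original scale.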

Comparison of Linear Model and Log Transformed Model

It is not appropriate to compare a linear model with a model fitted to a transformed dependent variable using their reported fit statistics (RMSE and R2). The transformation changes the units of the dependent variable, so the two sets of statistics are measured on different scales and are not comparable.

Comparing R-squared values in this case is like comparing two individuals, A and B, where A eats 65% of a carrot cake and B eats 70% of a strawberry cake. The comparison does not make sense because there are two different cakes.

One way to compare them: perform the back-transformation, then calculate RMSE and R2 manually from the 'real' actual values and the back-transformed predicted values, so that both models are scored in the original units.
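
A minimal sketch of that manual calculation follows, assuming the mtcars models used earlier. The helper functions rmse and r2 are hypothetical names defined here, not functions from a package, and R2 is computed as 1 - SSE/SST, which is one common definition. Both models are scored in-sample on the same rows.

```r
# Model 1: untransformed dependent variable
fit_lin  <- lm(mpg ~ ., data = mtcars)
pred_lin <- predict(fit_lin, mtcars)

# Model 2: log-transformed dependent variable, predictions back-transformed to mpg
fit_log  <- lm(log(mpg) ~ ., data = mtcars)
pred_log <- exp(predict(fit_log, mtcars))

# Hypothetical helpers: both work on values in the original units
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))
r2   <- function(actual, predicted) {
  1 - sum((actual - predicted)^2) / sum((actual - mean(actual))^2)
}

# Score both models against the same actual mpg values
metrics <- data.frame(
  model = c("linear", "log (back-transformed)"),
  RMSE  = c(rmse(mtcars$mpg, pred_lin), rmse(mtcars$mpg, pred_log)),
  R2    = c(r2(mtcars$mpg, pred_lin), r2(mtcars$mpg, pred_log))
)
print(metrics)
```

For the untransformed model, the manual R2 should match the value reported by summary(fit_lin), which is a quick check that the helper is doing what is intended.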

About Author:
Deepanshu Bhalla

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has over 10 years of experience in data science. During his tenure, he worked with global clients in various domains like Banking, Insurance, Private Equity, Telecom and HR.


1 Response to "Predicting Transformed Dependent Variable"

Michelle, October 30, 2020 at 12:31 PM
This isn't helpful at all. I need to generate a backtransformed regression line, which means that I need to apply the transformed regression to a new data set, then backtransform it, then draw a new regression line through the backtransformed data. I can't believe I disabled ad blocker for this.