In linear regression models, we often transform the dependent variable to address heteroscedasticity, non-normality of errors, and a non-linear relationship between the dependent and independent variables.

To calculate the **'real'** predicted value, we need to perform **'back transformation'**.

**Natural Log (base e) Transformation -** The back transformation is to raise e to the power of the number. If the mean of your base-e log-transformed data is 2.65, the back-transformed mean is exp(2.65) = 14.154.

**Log base 10 Transformation -** The back transformation is to raise 10 to the power of the number. If the mean of your base-10 log-transformed data is 2.65, the back-transformed mean is 10^(2.65) = 446.68.

**Square Root Transformation -** The back transformation is to square the number.

**Inverse (Reciprocal) Transformation -** The back transformation is to take the reciprocal of the number (1 divided by the number).
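The four back transformations can be checked quickly in R. The snippet below is only a sketch: the vector `x` and the variable names are hypothetical and chosen for illustration, while 2.65 is the example mean used in the text.

```r
# Example mean on the transformed scale (2.65, as in the text)
m <- 2.65

back_ln    <- exp(m)   # natural log -> raise e to the power (about 14.154)
back_log10 <- 10^m     # log base 10 -> raise 10 to the power (about 446.68)
print(c(natural_log = back_ln, log10 = back_log10))

# Hypothetical illustrative data: transform, then back-transform
x <- c(4, 9, 16, 25)

round_trip <- c(
  natural_log = isTRUE(all.equal(exp(log(x)), x)),   # exp() undoes log()
  log10       = isTRUE(all.equal(10^log10(x), x)),   # 10^ undoes log10()
  square_root = isTRUE(all.equal(sqrt(x)^2, x)),     # squaring undoes sqrt()
  reciprocal  = isTRUE(all.equal(1 / (1 / x), x))    # reciprocal undoes itself
)
print(round_trip)
```

Each entry of `round_trip` should be `TRUE` when the back transformation recovers the original values.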

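**R Code : Back Transformation**

    # Build a linear regression model on the natural log of mpg
    # (mtcars ships with base R, so no extra packages are needed)
    fit <- lm(log(mpg) ~ ., data = mtcars)

    # Predictions are on the log scale
    pred <- predict(fit, mtcars)

    # Compare actual and predicted values, both on the log scale
    head(cbind(actual_log = log(mtcars$mpg), pred_log = pred))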

**Natural Log Transformed Data - Back Transformation**

    # Exponentiate the log-scale predictions to return to the original mpg units
    head(cbind(actual = mtcars$mpg, pred = exp(pred)))

**Comparison of Linear Model and Log Transformed Model**
It is not appropriate to compare a linear model with a transformed linear model on fit statistics such as RMSE and R2, because transforming the dependent variable changes its units. RMSE is then expressed in different units, and R2 measures the share of variance explained in a different variable, so the statistics are not comparable.

Comparing R-squared values in this case is like comparing two individuals, A and B, where A eats 65% of a carrot cake and B eats 70% of a strawberry cake. The comparison does not make sense because there are two different cakes.

**One way :** Perform the back transformation, then calculate RMSE and R2 manually from the 'real' actual values and the 'back-transformed' predicted values.
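A minimal sketch of this approach is shown here, assuming the built-in `mtcars` data and in-sample predictions. The helper functions `rmse()` and `r2()` and the object names are hypothetical, written for illustration rather than taken from any package.

```r
# Fit the untransformed and the log-transformed model on mtcars
fit_lin <- lm(mpg ~ ., data = mtcars)
fit_log <- lm(log(mpg) ~ ., data = mtcars)

actual   <- mtcars$mpg
pred_lin <- predict(fit_lin, mtcars)        # already in mpg units
pred_log <- exp(predict(fit_log, mtcars))   # back-transformed to mpg units

# Hypothetical helpers: both work on the original (mpg) scale
rmse <- function(actual, pred) sqrt(mean((actual - pred)^2))
r2   <- function(actual, pred) {
  1 - sum((actual - pred)^2) / sum((actual - mean(actual))^2)
}

# Both rows are now measured against the same 'cake': actual mpg
comparison <- data.frame(
  model = c("linear", "log (back-transformed)"),
  RMSE  = c(rmse(actual, pred_lin), rmse(actual, pred_log)),
  R2    = c(r2(actual, pred_lin),   r2(actual, pred_log))
)
print(comparison)
```

Because both sets of predictions are expressed in mpg, the two rows can be compared directly. The figures here are in-sample; the same helpers could be applied to predictions on a hold-out set.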