Checking Homoscedasticity with SAS

In a linear regression model, the residuals should have homogeneous variance (homoscedasticity). In other words, the variance of the residuals should be approximately equal across all predicted values of the dependent variable.
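An informal way to look at this is to plot the residuals against the predicted values: under homoscedasticity the vertical spread stays roughly constant across the plot. The following is a minimal sketch, assuming the bhalla.GLMSELECT data set and the api00 model used in the PROC MODEL example further down this post. The output data set work.diag and the variable names pred and resid are hypothetical.

```sas
/* Fit the model and save predicted values and residuals.
   work.diag, pred and resid are illustrative names. */
proc reg data=bhalla.GLMSELECT;
   model api00 = yr_rnd mealcat some_col;
   output out=work.diag p=pred r=resid;
run;
quit;

/* Residuals vs. predicted values: a funnel or fan shape
   suggests heteroscedasticity. */
proc sgplot data=work.diag;
   scatter x=pred y=resid;
   refline 0 / axis=y;
run;
```

A plot like this is only a visual aid; the formal tests listed under "How to check Homoscedasticity" give a p-value.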


Example: the variation in income increases with years of work experience.

Income at 4 years of work experience: 30, 40, 60. Relative to the lowest value, the absolute differences are 10 and 30, the relative differences are 33% and 100%, and the log differences are 0.29 and 0.69.

Income at 8 years of work experience: 90, 120, 180. The absolute differences grow to 30 and 90, while the relative differences (33% and 100%) and the log differences (0.29 and 0.69) stay the same.

Note: A log transformation of the dependent variable often makes the variance constant.
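The income arithmetic above can be reproduced with a short DATA step. This is only a sketch: the data set names income_demo and income_diff and the variable names are hypothetical, and the six income values are the ones quoted in the example.

```sas
/* Incomes from the example, already sorted by experience */
data income_demo;
   input experience income;
   datalines;
4 30
4 40
4 60
8 90
8 120
8 180
;
run;

data income_diff;
   set income_demo;
   by experience;
   retain base;
   /* the first (lowest) income in each group is the reference */
   if first.experience then base = income;
   abs_diff = income - base;             /* grows with experience   */
   rel_diff = abs_diff / base;           /* same in both groups     */
   log_diff = log(income) - log(base);   /* LOG is the natural log  */
run;

proc print data=income_diff noobs;
   format rel_diff percent8.0 log_diff 5.2;
run;
```

The absolute differences triple between the two groups, while the log differences are identical, which is why modelling log(income) tends to even out the spread.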

Consequences of Heteroscedasticity
The OLS coefficient estimates remain unbiased and consistent, but they are inefficient: they are no longer the Best Linear Unbiased Estimators (BLUE). In addition, the usual standard errors are biased, so the hypothesis tests (t-test and F-test) are no longer valid.

How to check Homoscedasticity

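1. White Test - The squared residuals are regressed on the regressors, their squares and cross products. The resulting statistic is asymptotically distributed as chi-square with k-1 degrees of freedom, where k is the number of regressors in this auxiliary regression, including the constant term.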

2. Breusch-Pagan test

3. Lagrange multiplier (LM) test

With PROC AUTOREG (LM Test and Supports CLASS Statement)
proc autoreg data= bhalla.GLMSELECT;
model crime = yr_rnd mealcat some_col / archtest;
output out=r r=yresid;
run;
Note: Check the p-values of the Q statistics and LM tests. A p-value greater than 0.05 means the null hypothesis of homoscedasticity is not rejected.

With PROC MODEL (White and Breusch-Pagan Tests)
proc model data= bhalla.GLMSELECT;
parms a1 b1 b2 b3;
api00 = a1 + b1*yr_rnd + b2*mealcat + b3*some_col;
fit api00 / white pagan=(1 yr_rnd mealcat some_col)
out=resid1 outresid;
run;
quit;
If the p-values of the White test and the Breusch-Pagan test are greater than 0.05, the assumption of homogeneity of variance of the residuals is met (homoscedasticity).

Note: PROC AUTOREG supports the CLASS statement, so categorical predictors can be specified directly.
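White's test can also be requested from PROC REG through the SPEC option of the MODEL statement, which avoids writing out the equation as PROC MODEL requires. A minimal sketch, assuming the same bhalla.GLMSELECT data set and predictors as above:

```sas
/* SPEC requests the test of first and second moment
   specification (White's test) after the OLS fit. */
proc reg data=bhalla.GLMSELECT;
   model api00 = yr_rnd mealcat some_col / spec;
run;
quit;
```

As with the other tests, a p-value greater than 0.05 means the null hypothesis of homoscedasticity is not rejected.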

Remedy : 

1. Box-Cox transformations of the dependent variable

Box-Cox transformations are used to find a potentially nonlinear transformation of the dependent variable.
proc transreg data=bhalla.GLMSELECT;
MODEL BOXCOX(api00) = IDENTITY(yr_rnd mealcat some_col);
run;
Note: Categorical variables can be specified with the CLASS transformation, e.g. CLASS(mealcat), instead of IDENTITY.
Check the lambda value reported by PROC TRANSREG against the table below.

Transformation         Best Lambda
Square                 1.5 to 2.5
None                   0.75 to 1.5
Square-root            0.25 to 0.75
Natural log            -0.25 to 0.25
Inverse square-root    -0.75 to -0.25
Reciprocal             -1.5 to -0.75
Inverse square         -2.5 to -1.5
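Once a transformation has been read off the table, it is applied in a DATA step and the model is refitted. The sketch below assumes, purely for illustration, that the reported lambda fell between -0.25 and 0.25 (natural log); this is not an actual result for this data. The names work.api_log and log_api00 are hypothetical.

```sas
/* Apply the chosen transformation to the dependent variable */
data work.api_log;
   set bhalla.GLMSELECT;
   log_api00 = log(api00);   /* natural log; requires api00 > 0 */
run;

/* Refit the regression on the transformed response */
proc reg data=work.api_log;
   model log_api00 = yr_rnd mealcat some_col;
run;
quit;
```

The heteroscedasticity tests can then be rerun on the refitted model to see whether the transformation helped.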

2. Weighted Least Squares
If transforming the dependent variable does not solve the problem, we can use weighted least squares.
How to construct weights:

  1. Compute the absolute and squared residuals from the ordinary least squares fit
  2. Regress the absolute and squared residuals on the independent variables to get the estimated standard deviation and variance
  3. Compute the weights as the reciprocal of the squared estimated standard deviation, or as the reciprocal of the estimated variance.

SAS Code (Source)

proc reg data=Prob7870.Blood_pr;
   model Y=X;
   output out=WORK.PRED r=residual;
run;

** compute the absolute and squared residuals**;
data work.resid;
  set work.pred;
  absresid = abs(residual);
  sqresid = residual**2;
run;

** regress the absolute and squared residuals on X**;
proc reg data=work.resid;
    model absresid=X;
    output out=WORK.s_weights p=s_hat;
    model sqresid=X;
    output out=WORK.v_weights p=v_hat;
run;

** compute the weights using the estimated standard deviations**;
data work.s_weights;
set work.s_weights;
s_weight = 1/(s_hat**2);
label s_weight = "weights using absolute residuals";
run;

** compute the weights using the estimated variances**;
data work.v_weights;
set work.v_weights;
v_weight = 1/v_hat;
label v_weight = "weights using squared residuals";
run;

** Do the weighted least squares using the weights from the estimated standard deviation**;
proc reg data=work.s_weights;
weight s_weight;
model Y = X;
run;

** Do the weighted least squares using the weights from the estimated variances**;
proc reg data=work.v_weights;
weight v_weight;
model Y = X;
run;
quit;

About Author:

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has close to 7 years of experience in data science and predictive modeling. During his tenure, he has worked with global clients in various domains like retail and commercial banking, Telecom, HR and Automotive.
