How to standardize variables in SAS

In this tutorial we will cover how to standardize variables in SAS using PROC STDIZE.

Standardization

Standardization refers to subtracting a variable's mean from each value and dividing the result by the variable's standard deviation. The purpose of standardization is to transform numerical variables to a common scale, making them easier to compare. Standardization removes the original units of measurement and centers the data around zero, with a standard deviation of one.
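Written as a formula, the definition above looks like the following. This is a sketch using the sample standard deviation (divisor n - 1); the symbols and the numbers in the worked line are illustrative and are not taken from sashelp.class.

```latex
z_i = \frac{x_i - \bar{x}}{s},
\qquad
s = \sqrt{\frac{1}{n-1}\sum_{j=1}^{n} \left(x_j - \bar{x}\right)^2}

% Worked example with hypothetical values: x_i = 70, \bar{x} = 62, s = 5
z_i = \frac{70 - 62}{5} = 1.6
```

A standardized value of 1.6 means the observation lies 1.6 standard deviations above the mean, whatever the original unit was.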

How to Use PROC STDIZE to Standardize Data

With METHOD=STD, PROC STDIZE standardizes variables using the sample mean and the sample standard deviation. The example below uses the sashelp.class dataset and standardizes the Height and Weight variables. The standardized values are written to the output dataset readin, where they replace the original values of Height and Weight.

proc stdize data=sashelp.class out=readin method=std;
var Height Weight;
run;

How to Validate Standardization

To confirm that standardization has been applied correctly to variables, you can calculate the mean and standard deviation of the standardized variables. Verify that the mean of the standardized variables is approximately zero and the standard deviation is approximately one.

PROC MEANS is a SAS procedure that calculates summary statistics, such as the mean and standard deviation, for one or more variables in a dataset. Here we run it on the output dataset readin generated by PROC STDIZE.

/* MAXDEC=2 is the documented option for limiting displayed decimal places */
proc means data=readin Mean StdDev maxdec=2;
var Height Weight;
run;

[Image: Standardize variables in SAS]

In the PROC MEANS output, both standardized variables have a mean of 0 and a standard deviation of 1.
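As a further check, the standardized values can be recomputed by hand and compared with the PROC STDIZE output. The following is a minimal sketch, not part of the original example: the table name manual_z and the column names Height_z and Weight_z are made up for illustration. It relies on PROC SQL remerging the summary statistics with each row, which SAS reports as a note in the log.

```sas
/* Recompute z-scores manually: (value - mean) / sample standard deviation. */
/* manual_z, Height_z and Weight_z are illustrative names.                  */
proc sql;
  create table manual_z as
  select Name,
         (Height - mean(Height)) / std(Height) as Height_z,
         (Weight - mean(Weight)) / std(Weight) as Weight_z
  from sashelp.class;
quit;

/* Print a few rows to compare against Height and Weight in readin */
proc print data=manual_z(obs=5);
run;
```

The values should agree with those in readin up to floating-point rounding, since STD in PROC SQL also uses the sample (n - 1) standard deviation.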

Standardization by Group

To apply standardization by group, we can use the BY statement in PROC STDIZE. In this case, the variable "Sex" serves as the grouping variable with two distinct values: M (male) and F (female). Each value is then standardized using the mean and standard deviation of its own group.

Make sure to sort the data by the grouping variable before using the BY statement in PROC STDIZE. You can sort the data using PROC SORT. Sorting is not necessary if the data is already ordered by the grouping variable.

proc sort data=sashelp.class out=students;
by Sex;
run;

proc stdize data=students out=readin method=std;
var Height Weight;
by Sex;
run;

proc means data=readin Mean StdDev maxdec=2;
class Sex;
var Height Weight;
run;

[Image: PROC STDIZE by group]

Within each group, both Height and Weight have a mean of 0 and a standard deviation of 1.
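The by-group result can also be cross-checked by hand. The sketch below is an illustration rather than part of the original example; manual_z_by_sex, Height_z and Weight_z are hypothetical names. With GROUP BY, PROC SQL remerges each group's mean and standard deviation back onto that group's rows, so no prior sorting is needed here.

```sas
/* Group-wise z-scores: statistics are computed separately for each Sex. */
/* manual_z_by_sex, Height_z and Weight_z are illustrative names.        */
proc sql;
  create table manual_z_by_sex as
  select Name, Sex,
         (Height - mean(Height)) / std(Height) as Height_z,
         (Weight - mean(Weight)) / std(Weight) as Weight_z
  from sashelp.class
  group by Sex;
quit;
```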

About Author:
Deepanshu Bhalla

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has over 10 years of experience in data science. During his tenure, he worked with global clients in various domains like Banking, Insurance, Private Equity, Telecom and HR.
