SAS : Proc Tabulate Explained

Deepanshu Bhalla 11 Comments
What does Proc Tabulate do?

Proc Tabulate is mainly used to create a professional looking table.

Syntax of PROC TABULATE

The basic syntax of PROC TABULATE is as follows:

proc tabulate data = mydata;
class categorical_variable;
var analysis_variable;
table mycolumn;
run;

VAR : The Var statement tells SAS that these variables are analysis variables. They must be numeric. They are used to create summary statistics.

CLASS : The Class statement tells SAS that these variables are categorical variables.

TABLE : The Table statement tells SAS which variables are row expressions, which are column expressions.

Examples : Table Salary; - If there are no commas in the TABLE statement, SAS assumes you are only defining the column expression. Table Gender, Salary; - If there is one comma, then it is a row expression before comma and column expression after comma.
Create a Sample Dataset

The following SAS program creates a dataset which will be used to explain examples in this tutorial.

Data test;
Input T1 T2 T3 T4 T5 Age BU;
Cards;
1 5 2 3 4 3 3
4 5 2 1 2 1 3
3 4 4 3 2 3 2
4 3 2 5 3 3 3
1 2 4 2 1 2 2
;
Run;

The following SAS code uses the PROC TABULATE procedure to create a table using the variable "T1" from the dataset "test".

Proc tabulate data = test;
Var T1;
Table T1;  
Run;

The output is shown in the image below -

SAS : PROC TABULATE

By default, it calculates SUM for variables.

How to add different Statistics Options

The asterisk * is used to add statistical keywords.

Suppose you want to calculate COUNT for T1 variable :

Proc tabulate data = test;
Var T1;
Table T1 * N;
Run;
PROC TABULATE COUNT

Suppose you want to calculate both COUNT and SUM for T1 variable :

Proc tabulate data = test;
Var T1;
Table T1 * (N SUM);
Run;
Statistics Options in PROC TABULATE
Statistics Options in PROC TABULATE
Cross Tab

In the TABLE statement, row expression is specified first then followed by comma and then column expression.

Proc Tabulate Data = test;
Class Age;
Var T1;
Table Age, T1 * (N COLPCTN);
Run;

In this SAS code, we generated a table that shows the frequency count and column percentage of variable T1 across different categories of Age in the dataset test. It helps to understand the distribution of variable T1 within each age group.

Transposed Format
Proc Tabulate Data = test;
Class Age;
Var T1;
Table T1, Age * (N ROWPCTN);
Run;
Label Variables and Statistics

The "=" equal sign is an operator most commonly used for formatting.

Proc Tabulate Data = test;
Class Age;
Var T1;
Table Age, T1 = "Group I" * (N = "Count" COLPCTN ="%");
Run;

The above SAS program can be written as follows :

Proc Tabulate Data = test;
Class Age;
Var T1;
Keylabel N="Count" COLPCTN="%";
Table Age, T1 = "Group I" * (N COLPCTN); 
Run;

The Keylabel statement will change the label of keywords. Both the above programs produce same output.

Shift the row header up

In order to hide variable or labels, you leave the label specification blank (i.e. =‘ ‘ ).

Proc Tabulate Data = test;
Class Age;
Var T1;
Table Age=" ", T1= "Group I"* (N="Count" COLPCTN="%") / box="Age";
Run;
Adding total rows and columns

The ALL keyword is used to generate a sum total for rows or columns.

Proc Tabulate Data = test;
Class Age;
Var T1;
Table Age ALL= "Grand Total", T1= "Group I"* (N="Count" COLPCTN="%");
Run;
Two level data cuts
Proc tabulate data = test;
Class Age BU;
Var T1;
Table T1="Group I",(Age * BU="Business Unit") * (N="Count" ROWPCTN="%");
Run;

In order to hide variable or statistic labels, you leave the label specification blank (i.e. =‘ ‘ ).

Proc tabulate data = test;
Class Age BU;
Var T1;
Table T1="Group I",(Age=" " * BU="Business Unit") * (N="Count" ROWPCTN="%");
Run;
Cell Formats

You can use the asterisk * to associate the format modifier F= to the summary statistic.

Proc tabulate data = test;
Class Age BU;
Var T1;
Keylabel N="Count" ROWPCTN="%"; 
Table T1="Group I",(Age * BU="Business Unit") * (N ROWPCTN * F=6.0);
Run;
Format Statement in Proc Tabulate

The FORMAT statement within the PROC TABULATE procedure is used to apply custom formats to variables in your tables. It shows data in a more readable way.

Proc format;
value agefmt
1 = 'Under 18'
2 = '18 - 25'
3 = 'Over 25';
Run;
Proc format;
value bufmt
1 = 'Analytics'
2 = 'Technology'  
3 = 'Others';
Run;
Proc tabulate data = test;
Format Age agefmt. BU bufmt.;
Class Age BU;
Var T1;  
Keylabel N="Count" ROWPCTN="%";
Table T1="Group I",(Age * BU="Business Unit") * (N ROWPCTN * F=6.0);
Run;
Three Dimensional Table
Data test;
length BU $18.;
Input Location$ BU$ Gender$ Income;
Cards;
Delhi Analytics Male 5000
Mumbai Tech Female 45000
Delhi Analytics Male 37000
Chennai Tech Male 33000
Delhi Tech Male 5000
Chennai Analytics Male 15000
Mumbai Analytics Female 440000
Delhi Analytics Female 5000
Mumbai Tech Male 45000
Delhi Analytics Female 37000
Chennai Tech Female 33000
Delhi Tech Female 5000
Chennai Analytics Male 15000
; 
Run;
Proc tabulate data = test F=6.0;
Class Location BU Gender;
Var Income;
Table Location=" "*BU=" ", Gender * Income=" "*(N="Count" ROWPCTN="%") / Box="Location BU";
Run;
Save output in a dataset

The OUT= option within the PROC TABULATE procedure is used to save output to a new dataset.

Proc tabulate data = test out=test1;
Format Age agefmt. BU bufmt.;
Class Age BU;
Var T1;
Keylabel N="Count" ROWPCTN="%";
Table T1="Group I",(Age * BU="Business Unit") * (N ROWPCTN * F=6.0);
Run; 
Related Posts
Spread the Word!
Share
About Author:
Deepanshu Bhalla

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has over 10 years of experience in data science. During his tenure, he worked with global clients in various domains like Banking, Insurance, Private Equity, Telecom and HR.

11 Responses to "SAS : Proc Tabulate Explained"
  1. You are awesome buddy... very useful

    ReplyDelete
  2. Great!!!! very easy to understand

    ReplyDelete
  3. Keep up the good work, bro.

    ReplyDelete
  4. sir ,i have a doubt Iam attending the interview on next month ,what type of questions will ask for 1 year experience data analytists, i was learning sas,r,python pllllllllllllz tell me some guidence for us

    ReplyDelete
  5. I find listendata to be an extraordinary platform when it comes to- understanding any SAS/SQL related stuffs. The examples are presented beautifully. The progression of the complexity is very well maintained, which helps a user to understand the basic logic and relevant implementations first and progressively the more complex ones. Thank you for making life simpler :)

    ReplyDelete
  6. easy to understand, crisp and to the point, thanks a lot

    ReplyDelete
  7. thank you! great explanation

    ReplyDelete
  8. Thank you for this detailed and simple explanation of the proc tabulate procedure

    ReplyDelete
Next → ← Prev