SAS : Proc Tabulate Explained

Live Online Training : SAS Programming with 50+ Case Studies

- Explain Programming Concepts in Simple English
- Live Projects
- Scenario Based Questions
- Job Placement Assistance
- Get 10% off till Sept 25, 2017
- Batch starts from October 8, 2017

What does Proc Tabulate do?
Proc Tabulate is mainly used to create a professional looking table.

Terminologies
VAR : The Var statement tells SAS that these variables are analysis variables. They must be numeric. They are used to create summary statistics.

CLASS : The Class statement tells SAS that these variables are categorical variables.

TABLE : The Table statement tells SAS which variables are row expressions, which are column expressions.
Table Salary; -  If there are no commas in the TABLE statement, SAS assumes you are only defining the column expression. 
Table Gender, Salary; -  If there is one comma, then it is a row expression before comma and column expression after comma.

Create a dataset
Data test;
Input T1 T2 T3 T4 T5 Age BU;
Cards;
1 5 2 3 4 3 3
4 5 2 1 2 1 3
3 4 4 3 2 3 2
4 3 2 5 3 3 3
1 2 4 2 1 2 2
;
Run;
Simple Table
Proc tabulate data = test;
Var T1;                                                                                          
Table T1;          
Run;
The output is shown in the image below -


By default, it calculates SUM for variables.

How to add different Statistics Options

The asterisk * is used to add statistical keywords.

Suppose you want to calculate COUNT for T1 variable :
Proc tabulate data = test;
Var T1;
Table T1 * N;
Run;

Suppose you want to calculate both COUNT and SUM for T1 variable :
Proc tabulate data = test;
Var T1;
Table T1 * (N SUM);
Run;

Statistics Options


Cross Tab

In the TABLE statement, row expression is specified first then followed by comma and then column expression.
Proc Tabulate Data = test;
Class Age;
Var T1;
Table Age, T1 * (N COLPCTN);
Run;

Transposed Format
Proc Tabulate Data = test;
Class Age;
Var T1;
Table T1, Age * (N ROWPCTN);
Run;

Label Variables and Statistics

The "=" equal sign is an operator most commonly used for formatting.
Proc Tabulate Data = test;
Class Age;
Var T1;
Table Age, T1 = "Group I" * (N="Count" COLPCTN="%");
Run;
The above SAS program can be written as follows :
Proc Tabulate Data = test;
Class Age;
Var T1;                                                                              
Keylabel N="Count" COLPCTN="%";
Table Age, T1 = "Group I" * (N COLPCTN);
Run; 
The Keylabel statement will change the label of keywords. Both the above programs produce same output.



Shift the row header up

In order to hide variable or labels, you leave the label specification blank (i.e. =‘ ‘ ).
Proc Tabulate Data = test;
Class Age;
Var T1;
Table Age=" ", T1 = "Group I" * (N="Count" COLPCTN="%") / box="Age";
Run;


Adding total rows and columns 

The ALL keyword is used to generate a sum total for rows or columns.
Proc Tabulate Data = test;
Class Age;
Var T1;
Table Age ALL = "Grand Total" , T1 = "Group I" * (N="Count" COLPCTN="%");
Run;

Two level data cuts
Proc tabulate data = test;
Class Age BU;
Var T1;
Table T1="Group I",(Age * BU="Business Unit") * (N="Count" ROWPCTN="%");
Run;

In order to hide variable or statistic labels, you leave the label specification blank (i.e. =‘ ‘ ). 
Proc tabulate data = test;
Class Age BU;
Var T1;
Table T1="Group I",(Age=" " * BU="Business Unit") * (N="Count" ROWPCTN="%");
Run;

Cell Formats

You can use the asterisk *  to associate the format modifier “F=” to the summary statistic.
Proc tabulate data = test;
Class Age BU;
Var T1;
Keylabel N="Count" ROWPCTN="%";
Table T1="Group I",(Age * BU="Business Unit") * (N ROWPCTN * F=6.0);
Run;

Format Statement with Proc Tabulate
Proc format;                                                                                                                          
value agefmt                                                                                                                          
1 = 'Under 18'                                                                                                                      
2 = '18 - 25'                                                                                                                        
3 = 'Over 25';                                                                                                                      
Run;                                                                                                                                  
                                                                                                                                     
Proc format;                                                                                                                          
value bufmt                                                                                                                          
1 = 'Analytics'                                                                                                                      
2 = 'Technology'                                                                                                                    
3 = 'Others';                                                                                                                        
Run;
                                                                                 
Proc tabulate data = test;
Format Age agefmt. BU bufmt.;
Class Age BU;
Var T1;                                                                                                                    
Keylabel N="Count" ROWPCTN="%";
Table T1="Group I",(Age * BU="Business Unit") * (N ROWPCTN * F=6.0);
Run; 


Three Dimensional Table
Data test;
length BU $18.;
Input Location$ BU$ Gender$ Income;
Cards;
Delhi Analytics Male 5000
Mumbai Tech Female 45000
Delhi Analytics Male 37000
Chennai Tech Male 33000
Delhi Tech Male 5000
Chennai Analytics Male 15000
Mumbai Analytics Female 440000
Delhi Analytics Female 5000
Mumbai Tech Male 45000
Delhi Analytics Female 37000
Chennai Tech Female 33000
Delhi Tech Female 5000
Chennai Analytics Male 15000
;                                                                                                                                    
Run;                                                                         
Proc tabulate data = test F=6.0;
Class Location BU Gender;
Var Income;
Table Location=" "*BU=" ", Gender * Income=" "*(N="Count" ROWPCTN="%") / Box="Location BU";
Run;


Save output in a dataset
Proc tabulate data = test out=test1;
Format Age agefmt. BU bufmt.;
Class Age BU;
Var T1;
Keylabel N="Count" ROWPCTN="%";
Table T1="Group I",(Age * BU="Business Unit") * (N ROWPCTN * F=6.0);
Run;

SAS Tutorials : 100 Free SAS Tutorials

About Author:

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has close to 7 years of experience in data science and predictive modeling. During his tenure, he has worked with global clients in various domains like retail and commercial banking, Telecom, HR and Automotive.


While I love having friends who agree, I only learn from those who don't.

Let's Get Connected: Email | LinkedIn

Get Free Email Updates :
*Please confirm your email address by clicking on the link sent to your Email*

Related Posts:

3 Responses to "SAS : Proc Tabulate Explained"

  1. You are awesome buddy... very useful

    ReplyDelete
  2. Great!!!! very easy to understand

    ReplyDelete
  3. Keep up the good work, bro.

    ReplyDelete

Next → ← Prev