SAS : Proc Tabulate Explained

What does Proc Tabulate do?
Proc Tabulate is mainly used to create a professional looking table.

Terminologies
VAR : The Var statement tells SAS that these variables are analysis variables. They must be numeric. They are used to create summary statistics.

CLASS : The Class statement tells SAS that these variables are categorical variables.

TABLE : The Table statement tells SAS which variables are row expressions, which are column expressions.
Table Salary; -  If there are no commas in the TABLE statement, SAS assumes you are only defining the column expression. 
Table Gender, Salary; -  If there is one comma, then it is a row expression before comma and column expression after comma.

Create a dataset
Data test;
Input T1 T2 T3 T4 T5 Age BU;
Cards;
1 5 2 3 4 3 3
4 5 2 1 2 1 3
3 4 4 3 2 3 2
4 3 2 5 3 3 3
1 2 4 2 1 2 2
;
Run;
Simple Table
Proc tabulate data = test;
Var T1;                                                                                          
Table T1;          
Run;
The output is shown in the image below -


By default, it calculates SUM for variables.

How to add different Statistics Options

The asterisk * is used to add statistical keywords.

Suppose you want to calculate COUNT for T1 variable :
Proc tabulate data = test;
Var T1;
Table T1 * N;
Run;

Suppose you want to calculate both COUNT and SUM for T1 variable :
Proc tabulate data = test;
Var T1;
Table T1 * (N SUM);
Run;

Statistics Options


Cross Tab

In the TABLE statement, row expression is specified first then followed by comma and then column expression.
Proc Tabulate Data = test;
Class Age;
Var T1;
Table Age, T1 * (N COLPCTN);
Run;

Transposed Format
Proc Tabulate Data = test;
Class Age;
Var T1;
Table T1, Age * (N ROWPCTN);
Run;

Label Variables and Statistics

The "=" equal sign is an operator most commonly used for formatting.
Proc Tabulate Data = test;
Class Age;
Var T1;
Table Age, T1 = "Group I" * (N="Count" COLPCTN="%");
Run;
The above SAS program can be written as follows :
Proc Tabulate Data = test;
Class Age;
Var T1;                                                                              
Keylabel N="Count" COLPCTN="%";
Table Age, T1 = "Group I" * (N COLPCTN);
Run; 
The Keylabel statement will change the label of keywords. Both the above programs produce same output.



Shift the row header up

In order to hide variable or labels, you leave the label specification blank (i.e. =‘ ‘ ).
Proc Tabulate Data = test;
Class Age;
Var T1;
Table Age=" ", T1 = "Group I" * (N="Count" COLPCTN="%") / box="Age";
Run;


Adding total rows and columns 

The ALL keyword is used to generate a sum total for rows or columns.
Proc Tabulate Data = test;
Class Age;
Var T1;
Table Age ALL = "Grand Total" , T1 = "Group I" * (N="Count" COLPCTN="%");
Run;

Two level data cuts
Proc tabulate data = test;
Class Age BU;
Var T1;
Table T1="Group I",(Age * BU="Business Unit") * (N="Count" ROWPCTN="%");
Run;

In order to hide variable or statistic labels, you leave the label specification blank (i.e. =‘ ‘ ). 
Proc tabulate data = test;
Class Age BU;
Var T1;
Table T1="Group I",(Age=" " * BU="Business Unit") * (N="Count" ROWPCTN="%");
Run;

Cell Formats

You can use the asterisk *  to associate the format modifier “F=” to the summary statistic.
Proc tabulate data = test;
Class Age BU;
Var T1;
Keylabel N="Count" ROWPCTN="%";
Table T1="Group I",(Age * BU="Business Unit") * (N ROWPCTN * F=6.0);
Run;

Format Statement with Proc Tabulate
Proc format;                                                                                                                          
value agefmt                                                                                                                          
1 = 'Under 18'                                                                                                                      
2 = '18 - 25'                                                                                                                        
3 = 'Over 25';                                                                                                                      
Run;                                                                                                                                  
                                                                                                                                     
Proc format;                                                                                                                          
value bufmt                                                                                                                          
1 = 'Analytics'                                                                                                                      
2 = 'Technology'                                                                                                                    
3 = 'Others';                                                                                                                        
Run;
                                                                                 
Proc tabulate data = test;
Format Age agefmt. BU bufmt.;
Class Age BU;
Var T1;                                                                                                                    
Keylabel N="Count" ROWPCTN="%";
Table T1="Group I",(Age * BU="Business Unit") * (N ROWPCTN * F=6.0);
Run; 


Three Dimensional Table
Data test;
length BU $18.;
Input Location$ BU$ Gender$ Income;
Cards;
Delhi Analytics Male 5000
Mumbai Tech Female 45000
Delhi Analytics Male 37000
Chennai Tech Male 33000
Delhi Tech Male 5000
Chennai Analytics Male 15000
Mumbai Analytics Female 440000
Delhi Analytics Female 5000
Mumbai Tech Male 45000
Delhi Analytics Female 37000
Chennai Tech Female 33000
Delhi Tech Female 5000
Chennai Analytics Male 15000
;                                                                                                                                    
Run;                                                                         
Proc tabulate data = test F=6.0;
Class Location BU Gender;
Var Income;
Table Location=" "*BU=" ", Gender * Income=" "*(N="Count" ROWPCTN="%") / Box="Location BU";
Run;


Save output in a dataset
Proc tabulate data = test out=test1;
Format Age agefmt. BU bufmt.;
Class Age BU;
Var T1;
Keylabel N="Count" ROWPCTN="%";
Table T1="Group I",(Age * BU="Business Unit") * (N ROWPCTN * F=6.0);
Run;
ListenData Logo
Spread the Word!
Share
Related Posts
About Author:
Deepanshu Bhalla

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has over 10 years of experience in data science. During his tenure, he has worked with global clients in various domains like Banking, Insurance, Private Equity, Telecom and Human Resource.

11 Responses to "SAS : Proc Tabulate Explained"
  1. You are awesome buddy... very useful

    ReplyDelete
  2. Great!!!! very easy to understand

    ReplyDelete
  3. Keep up the good work, bro.

    ReplyDelete
  4. sir ,i have a doubt Iam attending the interview on next month ,what type of questions will ask for 1 year experience data analytists, i was learning sas,r,python pllllllllllllz tell me some guidence for us

    ReplyDelete
  5. I find listendata to be an extraordinary platform when it comes to- understanding any SAS/SQL related stuffs. The examples are presented beautifully. The progression of the complexity is very well maintained, which helps a user to understand the basic logic and relevant implementations first and progressively the more complex ones. Thank you for making life simpler :)

    ReplyDelete
  6. easy to understand, crisp and to the point, thanks a lot

    ReplyDelete
  7. thank you! great explanation

    ReplyDelete
  8. Thank you for this detailed and simple explanation of the proc tabulate procedure

    ReplyDelete

Next → ← Prev
Looks like you are using an ad blocker!

To continue reading you need to turnoff adblocker and refresh the page. We rely on advertising to help fund our site. Please whitelist us if you enjoy our content.