This tutorial explains how to calculate rank for one or more numeric variables with **PROC RANK**. In SAS, there are multiple ways to calculate rank, either overall or by a grouping variable. In a data step, it can be done with the RETAIN statement, but **PROC RANK** makes the task easier.

**Create Sample Data**

data temp;

input ID Gender $ Score;

cards;

1 M 33

2 M 94

3 M 66

4 M 46

5 F 92

6 F 95

7 F 18

8 F 11

;

run;

**Compute rank of numeric variable - "Score"**

proc rank data= temp out = result;

var Score;

ranks ranking;

run;

**Notes :**

- The **OUT=** option names the dataset that stores the output of the rank procedure.
- The **VAR** statement specifies the numeric variable(s) for which you want to calculate rank.
- The **RANKS** statement names the new rank variable.
- By default, rank is calculated in ascending order (the smallest value gets rank 1).
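To check the result, one option is to print the output dataset. The sketch below assumes the `temp` and `result` datasets created above. The expected ranks in the comment follow from sorting the eight scores in ascending order.

```sas
/* Print the ranked output to inspect the new variable */
proc print data=result noobs;
  var ID Gender Score ranking;
run;

/* Expected ranking (ascending, no ties in Score):
   Score 11 -> 1, 18 -> 2, 33 -> 3, 46 -> 4,
         66 -> 5, 92 -> 6, 94 -> 7, 95 -> 8 */
```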

**Reverse order of ranking (Descending)**

Suppose you need to assign rank 1 to the largest value of a variable and the last rank to the lowest value. The **descending** keyword tells SAS to rank the values from largest to smallest.

proc rank data= temp descending out = result;

var Score;

ranks ranking;

run;

**Percentile Ranking (Quartile Rank)**

Suppose you need to split the variable into **four** parts. You can use the **groups option** in PROC RANK, which tells SAS to assign only 4 rank values (0 through 3) to the variable.

proc rank data= temp descending groups = 4 out = result;

var Score;

ranks ranking;

run;

**Note :**

GROUPS=4 gives quartile ranks, GROUPS=10 gives decile ranks, and GROUPS=100 gives percentile ranks.
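If quartiles labelled 1 to 4 are preferred over PROC RANK's 0-based group values, a minimal sketch is to add 1 in a follow-up data step. The dataset name `result_q` and the variable `quartile` are hypothetical names chosen for illustration.

```sas
/* Quartile groups on the sample data; group values run from 0 to 3 */
proc rank data=temp descending groups=4 out=result;
  var Score;
  ranks ranking;
run;

/* Shift the group values so they run from 1 to 4 */
data result_q;
  set result;
  quartile = ranking + 1;
run;

/* Expected for the 8 sample scores (descending):
   95, 94 -> 1;  92, 66 -> 2;  46, 33 -> 3;  18, 11 -> 4 */
```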

**Ranking within BY group (Gender)**

Suppose you need to calculate rank by a grouping variable. To accomplish this task, you can use the **by statement** in PROC RANK. The data must be sorted by the grouping variable before using the BY statement.

proc sort data = temp;

by gender;

run;

proc rank data= temp descending out = result;

var Score;

ranks ranking;

by Gender;

run;
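The introduction mentions that ranks can also be computed in a data step with the RETAIN statement. The block below is a rough sketch of that idea, not a replacement for PROC RANK: it assumes there are no tied scores within a group (true for `temp`), and the dataset names `temp_sorted` and `result_ds` are hypothetical.

```sas
/* Sort so the highest score comes first within each gender */
proc sort data=temp out=temp_sorted;
  by Gender descending Score;
run;

data result_ds;
  set temp_sorted;
  by Gender;
  retain ranking;                    /* carry the counter across rows   */
  if first.Gender then ranking = 0;  /* restart at each new gender      */
  ranking = ranking + 1;             /* next row gets the next rank     */
run;

/* Expected: F: 95 -> 1, 92 -> 2, 18 -> 3, 11 -> 4
             M: 94 -> 1, 66 -> 2, 46 -> 3, 33 -> 4 */
```

With tied scores this counter would give consecutive ranks to equal values, which is where the TIES= option of PROC RANK becomes the simpler choice.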

**How to compute rank for same values**

Let's create a sample dataset in which the variable Score has tied values **(33 appearing twice).**

data temp2;

input ID Gender $ Score;

cards;

1 M 33

2 M 33

3 M 66

4 M 46

;

run;

Specify the option **TIES =** HIGH | LOW | MEAN | DENSE in PROC RANK to control how tied values are ranked.

proc rank data= temp2 ties = dense out = result;

var Score;

ranks rank_dense;

run;

- **LOW -** assigns the smallest of the corresponding ranks.
- **HIGH -** assigns the largest of the corresponding ranks.
- **MEAN -** assigns the mean of the corresponding ranks **(Default Option).**
- **DENSE -** assigns the smallest of the corresponding ranks and gives the next distinct value the next consecutive rank, so the sequence has no gaps.
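To compare the four methods side by side, one approach is to run PROC RANK once per TIES= value and pass each output to the next step, so that every method adds one column. This is a sketch on the `temp2` dataset above; the intermediate dataset names (`t_low`, `t_high`, `t_mean`, `ties_cmp`) and rank variable names are hypothetical.

```sas
/* Each step keeps the earlier columns and adds one rank variable */
proc rank data=temp2 ties=low out=t_low;
  var Score;
  ranks rank_low;
run;

proc rank data=t_low ties=high out=t_high;
  var Score;
  ranks rank_high;
run;

proc rank data=t_high ties=mean out=t_mean;
  var Score;
  ranks rank_mean;
run;

proc rank data=t_mean ties=dense out=ties_cmp;
  var Score;
  ranks rank_dense;
run;

proc print data=ties_cmp noobs;
run;

/* Expected ranks (ascending order):
   Score  rank_low  rank_high  rank_mean  rank_dense
     33       1         2         1.5         1
     33       1         2         1.5         1
     66       4         4         4.0         3
     46       3         3         3.0         2      */
```

The two tied scores of 33 occupy positions 1 and 2, which is why LOW gives 1, HIGH gives 2, and MEAN gives 1.5. Only DENSE continues with 2 for the next value (46) instead of 3.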

**Figure: comparison of the TIES= options** (SAS: Handle Ties in PROC RANK)
