This tutorial explains how to calculate rank for one or more numeric variables with **PROC RANK**. In SAS, there are multiple ways to calculate rank, either overall or by a grouping variable. In a data step, it can be done with the RETAIN statement, but PROC RANK makes the task much easier.

**Create Sample Data**

data temp;

input ID Gender $ Score;

cards;

1 M 33

2 M 94

3 M 66

4 M 46

5 F 92

6 F 95

7 F 18

8 F 11

;

run;

**Compute rank of numeric variable - "Score"**

proc rank data= temp out = result;

var Score;

ranks ranking;

run;

**Notes :**

- The **OUT=** option stores the output of the rank procedure in a dataset.
- The **VAR** statement specifies the numeric variable(s) for which you want to calculate rank.
- The **RANKS** statement names the rank variable.
- By default, rank is calculated in ascending order, so the smallest value gets rank 1.
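
For comparison, here is a minimal sketch of the data step approach with RETAIN mentioned in the introduction. The names temp_sorted, result_ds and rank_ds are made up for this illustration. It assumes there are no tied values: a plain counter gives tied scores different ranks, so ties would need extra logic.

```sas
/* Sort ascending so the smallest Score comes first */
proc sort data=temp out=temp_sorted;
    by Score;
run;

/* rank_ds is a hypothetical name; RETAIN carries its value from row to row */
data result_ds;
    set temp_sorted;
    retain rank_ds 0;
    rank_ds = rank_ds + 1;   /* 1 for the smallest Score, 2 for the next, ... */
run;
```

On the sample data this should give the same ranks as the PROC RANK step above, with noticeably more code.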

**Reverse order of ranking (Descending)**

Suppose you need to assign rank 1 to the largest value of a variable and the last rank to the smallest value. The **descending** option tells SAS to rank the values from largest to smallest.

proc rank data= temp descending out = result;

var Score;

ranks ranking;

run;

**Percentile Ranking (Quartile Rank)**

Suppose you need to split the variable into **four** parts. You can use the **GROUPS=** option in PROC RANK. It tells SAS to assign only 4 ranks to the variable.

proc rank data= temp descending groups = 4 out = result;

var Score;

ranks ranking;

run;

**Note :**

Use GROUPS=4 for quartile ranks, GROUPS=10 for decile ranks, and GROUPS=100 for percentile ranks. Group values start at 0, so GROUPS=4 produces the ranks 0 to 3.
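
To see which scores land in which quartile, the sketch below ranks the sample data and prints it ordered by group. The names result_q and quartile are chosen for this illustration only. The expected values in the comments assume the group formula given in the SAS documentation, FLOOR(rank * k / (n + 1)), with k = 4 groups and n = 8 nonmissing scores.

```sas
proc rank data=temp descending groups=4 out=result_q;
    var Score;
    ranks quartile;          /* hypothetical name for the group variable */
run;

/* Order the rows by group, highest score first within each group */
proc sort data=result_q;
    by quartile descending Score;
run;

proc print data=result_q noobs;
    var ID Score quartile;
run;

/* Expected grouping under the formula above:
   quartile 0: 95, 94
   quartile 1: 92, 66
   quartile 2: 46, 33
   quartile 3: 18, 11  */
```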

**Ranking within BY group (Gender)**

Suppose you need to calculate rank by a grouping variable. To accomplish this task, you can use the **BY statement** in PROC RANK. The data must be sorted by the grouping variable before the BY statement is used.

proc sort data = temp;

by gender;

run;

proc rank data= temp descending out = result;

var Score;

ranks ranking;

by Gender;

run;
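
A quick way to check the within-group ranks is to print the output by Gender. This is only a sketch: result_by is a name made up for this illustration, and it assumes temp has already been sorted by Gender as shown above.

```sas
proc rank data=temp descending out=result_by;
    by Gender;
    var Score;
    ranks ranking;
run;

/* Print one block per Gender value */
proc print data=result_by noobs;
    by Gender;
    var ID Score ranking;
run;

/* Expected ranks on the sample data:
   F: 95 -> 1, 92 -> 2, 18 -> 3, 11 -> 4
   M: 94 -> 1, 66 -> 2, 46 -> 3, 33 -> 4  */
```

Ranking restarts at 1 for each value of Gender, because PROC RANK processes every BY group separately.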

**How to compute rank for same values**

Let's create a sample dataset in which the variable Score has tied values **(33 appears twice).**

data temp2;

input ID Gender $ Score;

cards;

1 M 33

2 M 33

3 M 66

4 M 46

;

run;

Specify the option **TIES =** HIGH | LOW | MEAN | DENSE in PROC RANK.

proc rank data= temp2 ties = dense out = result;

var Score;

ranks rank_dense;

run;

- **LOW -** assigns the smallest of the corresponding ranks.
- **HIGH -** assigns the largest of the corresponding ranks.
- **MEAN -** assigns the mean of the corresponding ranks **(Default Option).**
- **DENSE -** assigns the smallest of the corresponding ranks and gives the next distinct value the next consecutive rank, so the sequence has no gaps.
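
As a rough sketch of how the four TIES= values differ on temp2, the steps below rank Score once per method and collect the results in one dataset. The dataset names (r_low, r_high, r_mean, r_all) and rank variable names are made up for this illustration. Each step reads the previous output, which works because the OUT= dataset keeps all input variables and adds the new rank variable.

```sas
proc rank data=temp2 ties=low out=r_low;
    var Score;
    ranks rank_low;
run;

proc rank data=r_low ties=high out=r_high;
    var Score;
    ranks rank_high;
run;

proc rank data=r_high ties=mean out=r_mean;
    var Score;
    ranks rank_mean;
run;

proc rank data=r_mean ties=dense out=r_all;
    var Score;
    ranks rank_dense;
run;

proc print data=r_all noobs;
run;

/* Expected ranks (ascending order):
   Score  rank_low  rank_high  rank_mean  rank_dense
     33       1         2         1.5         1
     33       1         2         1.5         1
     66       4         4         4           3
     46       3         3         3           2       */
```

The two tied scores occupy positions 1 and 2, so LOW gives both rank 1, HIGH gives both rank 2, and MEAN gives both 1.5. Only DENSE continues with rank 2 for the next value (46); the other methods continue with 3.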

**See the comparison between these options in the image below -**

[Image: SAS : Handle Ties in PROC RANK]
