This tutorial demonstrates how to find a variable with the maximum or minimum value for each row (observation) in SAS. It's pretty straightforward to calculate max or min value but a little problematic to identify the variable name against the value.

Suppose you are asked to identify top 3 variables across rows. You can use largest function in SAS.

*Let's create Sample Data*data readin;

input y1-y6;

cards;

11 55 59 35 25 87

12 79 73 74 86 29

13 80 95 77 25 74

;

run;

**Find Variable Name Containing Max Value in a row**data out;

set readin;

array values y1-y6;

largest = max(of values[*]);

index = whichn(largest, of values[*]);

name = vname(values[index]);

proc print;

run;

Output |

**Explanation :**

**array values y1-y6 :**Lists all the variables for calculating max value**max() function**calculates maximum value of all the variables listed in step 1 across rows**whichn() function**returns the column index number of the matching value. In this case, it is searching the maximum value across rows and returns the column position in the listed variables. For example, it returns 6 in row 1 as 87 is the maximum value in row 1 and it is placed at 6th column of y1-y6**vname() function**returns the variable name. In this case, it calculates variable name of the largest value.

**To Find Variable Name containing minimum value**

Use min() function instead of max() in the code above and the remaining code would be same.

**Find Top 3 Variables**Suppose you are asked to identify top 3 variables across rows. You can use largest function in SAS.

**LARGEST Function : Syntax**largest(k, variables)

**k :**kth value you want (2 for the second largest value)data want;

set readin;

array values[*] y1-y6;

array large[3];

array names[3] $32;

do i = 1 to dim(large);

large[i] = largest(i,of values[*]);

index = whichn(large[i],of values[*]);

names[i] = vname(values[index]);

end;

drop i index;

proc print;

run;

SAS : LARGEST values |

**Explanation :**

**array values[*] y1-y6**- Specify all the variables from which you want to calculate top 3 variables**array large[3] -**Top 3 large values**array names[3] $32 -**Names of top 3 variables**do i = 1 to dim(large) -**3 iterations for calculating first, second and third largest values**large[i] = largest(i,of values[*]) -**largest value when i =1, second largest when i =2 and so on.**index = whichn(large[i],of values[*])**: Column index for kth largest values**names[i] = vname(values[index])**: Extract the variable name of the largest using largest index**drop i index; :**Dropping Irrelevant Variables

**Find Bottom 3 Variables**

Refer the code above and change largest() function to smallest() function.

I found this very useful

ReplyDeleteYour post is interesting - thank you for publishing it. Unfortunately, the algorithm is not very robust. Let's suppose that some of your values are missing or that two or more values are identical within an observation. You'll see it is not working properly. I tried to find a correction but unfortunately, I cannot.

ReplyDeleteBest regards,

Jean Hardy

SAS Consultant and trainer

Canada