SAS INDEX Function: Learn with Examples

Deepanshu Bhalla Add Comment

This article explains how to use the INDEX function in SAS, along with examples.

What does the INDEX Function do?

The INDEX function in SAS is used to return the position of the first occurrence of a substring within a character string.

The syntax of the INDEX Function is as follows -

INDEX(string, substring)
  • string: The string within which you want to find the substring.
  • substring: The substring you want to find within the string.

Let's generate a sample dataset for demonstration purposes.

data mydata;
input names $30.;
Raj Gates
Allen Lee
Dave Sandy
William Gates
Jon Jedi

In the example below we are using the INDEX function to find the position of the first occurrence of the string "Gates" in each observation of the variable "names".

data readin;
set mydata;
position = index(names, "Gates");
proc print;
Index Function in SAS

In the above SAS Program, we have created a new variable named position which stores the position of the first occurrence of the string "Gates". The INDEX function returns 0 when the variable does not contain value of "Gates".

How to Handle Case Sensitivity?

The INDEX function is case-sensitive which means it treats "gates", "Gates" and "GATES" as different substrings.
data readin;
set mydata;
position = index(names, "gates");
proc print;
Wrong Output
Index Function: Case-sensitive

The INDEX function returned 0 because it couldn't find 'gates' in the 'names' variable. 'Gates' does exist in the variable, but the function differentiates between uppercase 'G' and lowercase 'g'.

To fix this issue, we can convert the variable to lowercase using the LOWCASE function. This will result in "Gates" becoming "gates".

data readin;
set mydata;
position = index(lowcase(names), "gates");
proc print;
Correct Output
Case-sensitive Index Function

How to Handle Leading Spaces?

Let's say you have a variable that contains leading spaces. See the sample dataset below. Leading spaces cause a change in the position of a substring within the longer string.

data mydata;
input names $char15.;
peter smith
  peter doe
 peter johnson

To fix this, we can use STRIP function to remove leading (and trailing) spaces. Compare the values of two variables named "position" and "position2". The variable "position2" leverages the STRIP function whereas the variable "position" is without the STRIP function.

data readin;
set mydata;
position  = index(names, "peter");
position2 = index(strip(names), "peter");
proc print;
handle leading spaces in STRIP function

How to Filter Data using INDEX Function?

In this example we are using a dataset named CARS from SASHELP library. We want to select only those models that contain "convertible" in their names.

We added a WHERE statement to check if the substring "convertible" exists in the variable. If the result of the INDEX function is greater than 0, it means the substring is found and the record is included in the filtered dataset.

data readin;
where index(lowcase(model), 'convertible') > 0;
Related Posts
Spread the Word!
About Author:
Deepanshu Bhalla

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has over 10 years of experience in data science. During his tenure, he worked with global clients in various domains like Banking, Insurance, Private Equity, Telecom and HR.

Post Comment 0 Response to "SAS INDEX Function: Learn with Examples"
Next → ← Prev