Extract last 4 characters / digits of value in SAS

This tutorial explains how to extract the last n characters or digits of a variable in SAS. We will cover several cases in which we pull the last 4 characters or digits from a column. In MS Excel, this is easy with the RIGHT() function, but SAS has no built-in function that does the same thing. The RIGHT() function in SAS does something else: it right-aligns a character value.

Example 1

Suppose you have a product ID in which the last 4 characters refer to a product category, and you are asked to pull the product category information.
data example;
productID = "AB123ZX253";
run;

data example;
set example;
referenceid = substr(productID,length(productID)-3,4);
run;

In the output, referenceid contains X253, the last 4 characters of productID.

The SUBSTR() function returns a substring from a character variable.
= SUBSTR(character-variable, beginning-position, number-of-characters-to-pull)
The LENGTH() function returns the length of a character value, not counting trailing blanks. In this case, it is 10 characters long. The SUBSTR() call is therefore evaluated as follows -
= SUBSTR(productID, 10-3, 4)
= SUBSTR(productID, 7, 4)
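The same calculation can be written for any n, since the start position is always length - n + 1. The following is a minimal sketch, not part of the original example: the macro variable n and the dataset name example_n are assumptions introduced for illustration.

```sas
/* Hypothetical sketch: n and example_n are illustrative names */
%let n = 4;

data example_n;
    productID = "AB123ZX253";
    /* start position = length - n + 1, here 10 - 4 + 1 = 7 */
    referenceid = substr(productID, length(productID) - &n + 1, &n);
    put referenceid=;   /* writes referenceid=X253 to the log */
run;
```

Changing the value of n to 2 would pull the last 2 characters instead, with no other change to the step.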

Example 2 : Handle missing while extracting

Suppose you have multiple product IDs, some of which are missing.
data example2;
input productID $10.;

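When you apply the SUBSTR() function to a missing value, SAS writes the note 'Invalid second argument' to the log. This happens because LENGTH() returns 1 for a blank value, so the computed start position (1-3 = -2) is not valid for SUBSTR().

To work around this problem, we can use SUBSTRN(), which accepts a zero or negative start position and therefore handles missing cases while extracting.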
data example2;
set example2;
referenceid = substrn(productID,length(productID)-3,4);
run;
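To see how SUBSTRN() behaves on missing and short values, here is a small self-contained sketch. The dataset name example2_sketch and the three sample IDs are assumptions made up for illustration; they are not the data used in the original example.

```sas
/* Hypothetical sample data: a full ID, a missing ID and a short ID */
data example2_sketch;
    input productID $10.;
    datalines;
AB123ZX253
.
XY9
;
run;

data example2_sketch;
    set example2_sketch;
    /* SUBSTRN() tolerates a start position of zero or less */
    referenceid = substrn(productID, length(productID) - 3, 4);
    put productID= referenceid=;
run;
```

Under these assumptions, the first row should give X253, the missing row should give a blank value without the 'Invalid second argument' note, and the short ID XY9 should come back whole because fewer than 4 characters are available.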
Example 3 : Get last N digits from a numeric variable

Suppose the product ID variable is numeric.
data example3;
input productID;

The SUBSTR() / SUBSTRN() functions are meant for character values. To use them with a numeric variable, we first convert it to a character variable with the PUT() function.
data example3;
set example3;
x = put(productID, 10.);
referenceid = input(substrn(x,length(x)-3,4),8.);
drop x;
run;

proc print noobs;
run;
Since SUBSTRN() returns a character value, we need to convert it back to numeric with the INPUT() function.
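The PUT() / SUBSTRN() / INPUT() chain can be tried end to end with the sketch below. The dataset name example3_sketch and the three numeric IDs are assumptions chosen for illustration, not the original data.

```sas
/* Hypothetical sample data: numeric product IDs of different lengths */
data example3_sketch;
    input productID;
    datalines;
123456
1021
987654321
;
run;

data example3_sketch;
    set example3_sketch;
    x = put(productID, 10.);   /* numeric to character, right-aligned in 10 characters */
    referenceid = input(substrn(x, length(x) - 3, 4), 8.);   /* back to numeric */
    drop x;
run;

proc print data=example3_sketch noobs;
run;
```

With these values, referenceid is 3456, 1021 and 4321. Because the result is numeric, leading zeros are not kept: an ID such as 120045 would give 45 rather than 0045.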

Another Method -
data example3;
set example3;
referenceid = input(substrn(productID,INT(LOG10(productID))+1-3,4),8.);
run;

proc print noobs;
run;
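
How it works -

Step 1 : LOG10(1021) returns 3.009
Step 2 : INT(LOG10(1021)) makes it 3
Step 3 : Adding 1 to the result of step 2 makes it 4, which is the number of digits in 1021.
Step 4 : Subtracting 3 from the number of digits gives the start position (here 1), and SUBSTRN() pulls 4 characters from there. SUBSTRN() accepts a numeric first argument and converts it to character itself, so PUT() is not needed in this method.

This method assumes productID is a positive whole number, since LOG10() is not defined for zero or negative values.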
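The steps above can be traced in the log with a short DATA _NULL_ step. This is only a sketch: the intermediate variable names (log_value, int_value, n_digits, start_pos) are assumptions added for illustration.

```sas
/* Hypothetical trace of the LOG10 method for productID = 1021 */
data _null_;
    productID = 1021;
    log_value = log10(productID);    /* Step 1: about 3.009 */
    int_value = int(log_value);      /* Step 2: 3 */
    n_digits  = int_value + 1;       /* Step 3: 4 digits */
    start_pos = n_digits - 3;        /* start position of the last 4 digits: 1 */
    referenceid = input(substrn(productID, start_pos, 4), 8.);
    put log_value= int_value= n_digits= start_pos= referenceid=;
run;
```

For 1021 the log should show n_digits=4, start_pos=1 and referenceid=1021, since the number has exactly 4 digits.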

About Author:

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has over 7 years of experience in data science and predictive modeling. During his tenure, he has worked with global clients in various domains like banking, Telecom, HR and Health Insurance.
