Extract last 4 characters / digits of value in SAS

This tutorial explains how to extract last n characters or numbers of a variable in SAS. In this tutorial, we will cover several cases in which we pull last 4 character or numeric values from a column. In MS Excel, it is easily possible with RIGHT() function but there is no-inbuilt function to do it in SAS. The RIGHT() function of SAS is used for something else i.e. it right aligns string or character value.

Example 1

Suppose you have a product ID in which last 4 characters refers to a product category so you are asked to pull product category information.
data example;
productID = "AB123ZX253";
run;

data example;
set example;
referenceid = substr(productID,length(productID)-3,4);
run;

The output is shown in the image below -
Get last N Characters
Explanation

The SUBSTR() function returns sub-string from a character variable.
= SUBSTR(character-variable, beginning-position, number-of-characters-to-pull)
The LENGTH() function returns the length of a character variable. In this case, it is 10 characters long. The calculated SUBSTR() function would work like below -
= SUBSTR(productID, 10-3, 4)
= SUBSTR(productID, 7, 4)

Example 2 : Handle missing while extracting

Suppose you have multiple product IDs in which some of them are missing.
data example2;
input productID $10.;
cards;
AB123ZX253

AB123ZX254
AB123ZX255
;
run;
When you apply SUBSTR() function in case of missing cases, it returns a note in log 'Invalid second argument'. See the image below

SUBSTR : Missing Cases
To workaround this problem, we can use SUBSTRN() which handles missing cases while extracting.
data example2;
set example2;
referenceid = substrn(productID,length(productID)-3,4);
run;
Example 3 : Get last N digits from a numeric variable

Suppose product ID variable is numeric.
data example3;
input productID;
cards;
12345253

12354234
12354235
;
run;
The SUBSTR() / SUBSTRN() function works only for character variables. In order to make it run for numeric variables, we first need to convert numeric variables to character variables with PUT() function.
data example3;
set example3;
x = put(productID, 10.);
referenceid = input(substrn(x,length(x)-3,4),8.);
drop x;
proc print noobs;
run;
Since SUBSTRN() returns character variable, we need to convert it to numeric with INPUT() function.

Another Method -
data example3;
set example3;
referenceid = input(substrn(productID,INT(LOG10(productID))+1-3,4),8.);
proc print noobs;
run;
How it works -

Step 1 : LOG10(1021) returns 3.009
Step 2 : INT(LOG10(1021)) makes it 3
Step 3 : Adding 1 to step2 makes it 4, which is result of number of digits in 1021.
ListenData Logo
Spread the Word!
Share
Related Posts
About Author:
Deepanshu Bhalla

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has over 10 years of experience in data science. During his tenure, he has worked with global clients in various domains like Banking, Insurance, Private Equity, Telecom and Human Resource.

22 Responses to "Extract last 4 characters / digits of value in SAS"
  1. This was very useful. Thanks so much!

    ReplyDelete
  2. but ..when numeric variable when used in character function ..implicit conversion from numeric to character happen ....why convert explicitly with put statement.

    ReplyDelete
  3. How can we find last position of two digit number

    ReplyDelete
  4. could anyone explain the meaning of -3.

    ReplyDelete
    Replies
    1. Second parameter in SUBSTR() refers to starting position. In the first example, it returns 7 after subtracting 3 from productID. End result would return 7th, 8th, 9th and 10th character. In total it returns last 4 characters.

      Delete
    2. -3 becoz we want last 4 digit so when we are doing lenght(productID) it gives 10 and 10-3 gives 4 that's why there is -3.

      Delete
    3. It gives the result from right to left

      Delete
    4. what if you have different lengths of a variable? in that case , how will you extract last 4 characters

      Delete
    5. If we want last 4 characters, then we would want the substr()'s second argument be the fourth character's position from right to left, which is the length of the string - 3; If we want the last 5, then it should be the length - 4.

      Delete
  5. Can I use substr (, -1,4) for getting last 4 characters?

    ReplyDelete
  6. what if the input variables has different length and we have to extract some characters from the string of different length

    ReplyDelete
  7. Can anyone answer how can I output the date in YYYY-MM-DD format if I have diferent tipe of dates?
    Example
    17.Jan/2013
    13-11-14
    3/12/16
    January 16 2019
    16 feb 2020

    ReplyDelete
    Replies
    1. haha i think recently u attend interview for hcl screening round

      Delete
    2. brother u got answer or not

      please let us know

      Delete
    3. data have;
      input char_date : $20. fmt : $20.;
      num_char=inputn(char_date,fmt);
      format num_char date9.;
      cards;
      12-08-2015 DDMMYY12.
      12/8/2016 MMDDYY12.
      05/25/2015 MMDDYY12.
      ;
      run;
      proc print;run;

      Delete
  8. how do i get the position for a numeric value?

    ReplyDelete
  9. I HAVE NEED FRIST FOUR STRINGS OF FRIST SPACE

    ReplyDelete
  10. data ab;
    input var1 var2 var3 var4 var5 var6 var7 var8 var9 var10;
    cards;
    1 2 3 4 5 6 7 8 9 10
    run;
    how can we convert the above 10 numeric varianbles into character variables by using arrays?

    ReplyDelete
  11. data ink2;
    input joindate;
    informat joindate :anydtdte.;

    cards;
    17.Jan/2013
    13-11-14
    3/12/16
    january 16 2019
    16feb2020
    ;
    proc print;
    run;

    can anyone answer this query am facing error

    ReplyDelete
    Replies
    1. data ink2;
      input joindate:anydtdte;

      cards;
      17.Jan/2013
      13-11-14
      3/12/16
      january 16 2019
      16feb2020
      ;
      proc print;
      run;

      Delete
  12. pradeep.malegave123@yahoo.com in this mail id how can i extract my last name. ie malegave. Please give solution in base sas.

    ReplyDelete

Next → ← Prev
Looks like you are using an ad blocker!

To continue reading you need to turnoff adblocker and refresh the page. We rely on advertising to help fund our site. Please whitelist us if you enjoy our content.