SAS : Read Character Variable of Varying Length

This tutorial demonstrates how to read or import data with a character variable of varying length. This situation usually comes up when a dataset contains company names, or both the first and last names of a person.

Example I

In the following example, the variable "Name" has varying length, i.e. not all observations of this variable have the same length.

Figure: Example dataset (Read Messy Data)

Method I : Use COLON Modifier

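We can use the colon modifier (:) to tell SAS to read the variable "Name" until it reaches a space or other delimiter. The $30. informat defines Name as a character variable with a maximum length of 30. Without it, list input gives a character variable the default length of 8, so longer names would be truncated.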
data example1;
input ID Name :$30. Score;
cards;
1 DeepanshuBhalla 22
2 AttaPat 21
3 XonxiangnamSamnuelnarayan 33
;
proc print noobs;
run;
The colon modifier is also used to read numeric data that contains special characters such as commas, for example 1,000. In that case it is paired with a numeric informat such as COMMA10.
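As a minimal sketch of that numeric case, the step below reads values with embedded commas by pairing the colon modifier with the COMMA10. informat. The dataset name sales, the variable Amount and the data values are hypothetical and only serve as an illustration.

```sas
/* Assumption: sales, Amount and the values below are made-up examples */
data sales;
    /* :comma10. reads up to the next space and strips the commas */
    input ID Amount :comma10.;
    cards;
1 1,000
2 25,500
3 300
;
run;

proc print data=sales noobs;
run;
```

Amount is stored as an ordinary numeric variable, so it can be used in calculations like any other number.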


Method II : Use LENGTH statement prior to INPUT Statement

In the following program, we use a LENGTH statement before the INPUT statement to set the length of the variable to 30. Because the LENGTH statement is the first place Name is mentioned, Name becomes the first variable in the dataset. Use only $ instead of $30. after "Name" in the INPUT statement.
data example2;
length Name $30;
input ID Name $ Score;
cards;
1 DeepanshuBhalla 22
2 AttaPat 21
3 XonxiangnamSamnuelnarayan 33
;
proc print noobs;
run;
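Output: the order of the variables changes. Name appears first in the dataset, followed by ID and Score, because the LENGTH statement defines Name before the INPUT statement defines the other variables.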
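If the original column order matters, one possible workaround is to declare every variable in the LENGTH statement in the order you want. The sketch below assumes the same data as above; the dataset name example2b is hypothetical.

```sas
/* Assumption: example2b is a hypothetical dataset name */
data example2b;
    /* Declaring ID first keeps the order ID, Name, Score */
    length ID 8 Name $30 Score 8;
    input ID Name $ Score;
    cards;
1 DeepanshuBhalla 22
2 AttaPat 21
3 XonxiangnamSamnuelnarayan 33
;
run;

proc print data=example2b noobs;
run;
```

Numeric variables get a length of 8 here, which is the SAS default, so only the position of the variables changes.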

Method III : Use Ampersand (&) and Put Extra Space

We can use the ampersand (&) to tell SAS to read the variable until it finds two or more consecutive spaces, which then act as the delimiter. This technique is very useful when the variable contains two or more words, for example when an observation is "Deepanshu Bhalla" rather than "DeepanshuBhalla".

Note : There are 2 spaces before 22, 21 and 33.
data example3;
input ID Name & $30. Score;
cards;
1 DeepanshuBhalla  22
2 AttaPat  21
3 XonxiangnamSamnuelnarayan  33
;
proc print noobs;
run;

Example II : When a variable contains more than 1 word

In this case, we have a space between First Name and Last Name and we want to store both the first and last names in a single variable.

Figure: Example 2 (Read Messy Data)

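In this case, the following methods do not work.

  1. The colon modifier (:) does not work for a variable with multiple words, because it stops reading at the first space.
  2. A LENGTH statement before the INPUT statement does not work here either, because list input still stops at the first space.

Using the ampersand (&) and adding an ADDITIONAL space before the next value works.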
data example4;
input ID Name & $30. Score;
cards;
1 Deepanshu Bhalla  22
2 Atta Pat  21
3 Xonxiangnam Samnuelnarayan  33
;
proc print noobs;
run;

This trick also works when reading data from an external file, provided the file has at least two spaces between the name and the score.
data temp;
infile "C:\Users\Deepanshu\Desktop\file1.txt";
input ID Name & $30. Score;
proc print noobs;
run;
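The path above is specific to the author's machine. As a self-contained sketch, the steps below write three sample records to a temporary file and read them back with the same INPUT statement. The fileref demofile and the dataset name temp2 are hypothetical names chosen for illustration.

```sas
/* Assumption: demofile is a hypothetical fileref for a temporary file */
filename demofile temp;

/* Write sample records; note the two spaces before each score */
data _null_;
    file demofile;
    put "1 Deepanshu Bhalla  22";
    put "2 Atta Pat  21";
    put "3 Xonxiangnam Samnuelnarayan  33";
run;

/* Read the file back; & lets Name contain single embedded spaces */
data temp2;
    infile demofile;
    input ID Name & $30. Score;
run;

proc print data=temp2 noobs;
run;
```

To read a real file instead, replace the fileref in the INFILE statement with a quoted path, as in the example above.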


About Author:

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has close to 7 years of experience in data science and predictive modeling. During his tenure, he has worked with global clients in various domains like retail and commercial banking, Telecom, HR and Automotive.



20 Responses to "SAS : Read Character Variable of Varying Length"

  1. Nice tips to resolve time consuming issues for SAS beginners

  2. Hi,

    Is it possible to get all SAS tutorials as a PDF file? :)

  3. really learnt a new thing here.....
    appreciating your efforts
    thanx

  4. I appreciate your efforts in explaining this...
    Thanks.

  5. data read;
    input cc spent;
    cards;
    cc spend
    1 100
    1 200
    1 550
    1 100
    1 200
    1 550
    1 100
    2 200
    2 550
    2 200
    2 200
    2 550
    2 200
    2 900
    3 750
    3 550
    3 1300
    3 1900
    3 750
    ;
    run;

    this code is giving error could you please tell me why?

    Replies
    1. Hi, you have defined cc and spent as numeric variables, but the first line after cards contains character values ("cc", "spend").

  6. In this scenario, the name variable has a space between the first name and the last name. How can we read the data? Normally we use & but it's not working. Can you tell me?
    Ex
    Data student;
    Input studid studname$ rank;
    Cards;
    101 Rajkumar varma 20
    102 Rajesh 23
    103 Manojkumar p 19
    104 saravanakumar prudhvi 21
    Run;
    Can you explain how to read data like this?

    Replies
    1. Please use below code. It should work.
      Data student;
      Input studid studname & $30. rank;
      Cards;
      101 Rajkumar varma 20
      102 Rajesh 23
      103 Manojkumar p 19
      104 saravanakumar prudhvi 21
      ;
      proc print noobs;
      Run;

    2. Just don't forget to put 2 spaces before numbers 20, 23, 19, 21.

    3. Since the name contains a space, you have to use & to read it.

  7. In this scenario, the name variable has a space between the first name and the last name. How can we read the data? Normally we use & but it's not working. Can you tell me?
    Ex
    Data student;
    Input studid studname$ rank;
    Cards;
    101 Rajkumar varma 20
    102 Rajesh 23
    103 Manojkumar p 19
    104 saravanakumar prudhvi 21
    Run;
    Can you explain how to read data like this?

    Replies
    1. This comment has been removed by the author.

    2. Use the code below.
      Data student;
      Input studid studname & $21. rank;
      Cards;
      101 Rajkumar varma  20
      102 Rajesh  23
      103 Manojkumar p  19
      104 saravanakumar prudhvi  21
      ;
      Run;
      I have put a double space between studname and rank.


  8. I need the output of lastname of this type of data:


    data ss;
    input name$ 40.;
    cards;
    Shanmugam ram anand
    vadi vel raja kumar
    ram jaya
    ravi
    SERVICIOS PROTEXA CONSTRUCTION
    ;
    run;
    proc print;
    run;




    i need output as follows
    output:
    anand
    kumar
    jaya
    construction

    please suggest the sas programme to get this output.

    Replies
    1. data sub;
      set ss;
      e = scan(name, -1, ' ');
      name=e;
      keep name;
      run;

  9. This comment has been removed by the author.

  10. In Example 1 where there are no spaces between first and last name, we can also use truncover to avoid the problem of SAS reading next variable when the variable length is less than passed in input statement. Correct me if I am wrong.

