Dropping variables from a data set in SAS

This post explains how to drop variables from a data set in SAS, along with a few tricks for doing so. SAS offers two ways to drop variables:
  • DROP = data set option
  • DROP statement
Let's start with creating a data set :
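The original data-creation step is not shown here, so the following is only a minimal sketch of what outdata might look like. The variable names (gender, age, obs1, obs2, obs3) come from the examples in this post; the values are made up for illustration.

```sas
/* Hypothetical sample data: variable names match the examples in this post, */
/* values are invented for illustration only.                                */
data outdata;
    input gender $ age obs1 obs2 obs3;
    datalines;
F 34 10 20 30
M 41 15 25 35
F 29 12 . 18
M 52 8 16 24
;
run;
```

The period in the third row is a missing value, which the SUM function used later simply ignores.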


The main differences between the two are as follows:
I.  Scenario: Create a new variable from existing data and then drop the variables that are no longer needed

With the DROP statement, SAS drops the variables only when it writes the output data set, so they remain available for calculations throughout the DATA step.
data readin;
set outdata;
totalsum = sum(obs1,obs2,obs3);
drop obs1 obs2 obs3;
run;
In the above example, we ask SAS to sum the values of obs1, obs2 and obs3 into a new variable totalsum and then drop the original variables obs1, obs2 and obs3.
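When the variables to drop share a numbered suffix, a variable list can shorten the code. The sketch below assumes the same outdata data set as above and uses a numbered range list (obs1-obs3) in both the SUM function and the DROP statement.

```sas
data readin;
    set outdata;
    /* OF is required here: sum(obs1-obs3) without OF would subtract obs3 from obs1 */
    totalsum = sum(of obs1-obs3);
    /* Numbered range list: same as drop obs1 obs2 obs3; */
    drop obs1-obs3;
run;
```

A name prefix list such as drop obs: ; would also work, but it drops every variable whose name starts with obs, so it is safer only when that is really the intent.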

Consequence of using the DROP = option on the SET statement

data readin;
set outdata (drop = obs1 obs2 obs3);
totalsum = sum(obs1,obs2,obs3);
run;

Because DROP = is applied as outdata is read, obs1, obs2 and obs3 never reach the DATA step. The references to them in the SUM function create new, uninitialized variables (SAS writes a note about this to the log), so totalsum contains only missing values.
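If you prefer the data set option, one way to keep the calculation working is to attach DROP = to the output data set in the DATA statement instead. The sketch below assumes the same outdata data set. It drops age on input, since age is not needed in the step, and drops obs1, obs2 and obs3 only on output.

```sas
/* DROP = on the DATA statement applies when readin is written */
data readin (drop = obs1 obs2 obs3);
    /* DROP = on the SET statement applies when outdata is read */
    set outdata (drop = age);
    /* obs1, obs2 and obs3 are still available here */
    totalsum = sum(obs1, obs2, obs3);
run;
```

Dropping unneeded variables on input keeps them out of the DATA step entirely, while dropping on output behaves like the DROP statement.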


II. The DROP statement can appear anywhere in a DATA step, whereas the DROP = option must be written in parentheses directly after a data set name, for example on the SET statement or the DATA statement.

DROP statement

data readin;
set outdata;
if gender = 'F';
drop age;
run;
OR
data readin;
set outdata;
drop age;
if gender = 'F';
run;

DROP = option

data readin;
set outdata (drop = age);
if  gender = 'F';
run;

III. Scenario : Dropping variables while printing

The DROP statement can be used only in DATA steps, whereas the DROP = option can be used in both DATA steps and PROC steps (for example, when printing).
proc print data = outdata (drop = age);
where gender = 'F';
run;
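The same data set option works in other procedures wherever a data set is named. As a sketch, assuming the same outdata data set and a hypothetical output name sorted, PROC SORT can drop a variable from the data set it writes:

```sas
/* "sorted" is a hypothetical output data set name */
proc sort data = outdata out = sorted (drop = age);
    by gender;
run;
```

Here age is dropped only from sorted, and outdata itself is left unchanged.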


About Author:

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has over 7 years of experience in data science and predictive modeling. During his tenure, he has worked with global clients in various domains like banking, Telecom, HR and Health Insurance.

15 Responses to "Dropping variables from a data set in SAS"
  1. DROP = option


    data readin;

    set outdata (drop = age height);
    if gender = 'F';

    run;


    If it drops age and height first, how will the IF condition be satisfied? I'm a little confused and would appreciate your help.
    Replies
    1. With DROP = on the SET statement, age and height are not read into the program data vector (PDV) at all, so they never reach the output data set.
      The IF condition only uses gender, which is not dropped, so it still works. As long as the variable used in the condition is kept, you can drop any other variable.
  2. We tell SAS to read outdata, drop age and height, and keep the rows where gender is 'F'.
    I don't see any condition that would not be satisfied.
  3. In that case age and height of only females will come.

  4. Hello, do the KEEP statement and KEEP = option behave exactly like their DROP counterparts?
    Thanks
  5. Hi,
    Just one correction. Created data set does not contain any height variable.
    Please update accordingly.
    Thanks for your great work.

  6. Hi,
    We can use drop statement in data step also right?

    data readin (drop = obs1 obs2 obs3);
    set outdata ;
    totalsum = sum(obs1,obs2,obs3);
    run;

  7. Thanks for detailed tutorial
    Really helpful

  8. I am getting an error after running the code below. Could anyone please advise?
    Proc print data = mylib. import (Keep=Region Plan Actual) (obs=10);
    Run;

  9. As per my knowledge, DROP option can be used after data statement too.

  10. After dropping variables from a data set, can we still alter those variables?
    totalsum = sum(obs1,obs2,obs3);
    drop obs1 obs2 obs3;
    /*Now is this possible!*/newsum=2*totalsum

    Replies
    1. yes,
      totalsum = sum(obs1,obs2,obs3);
      drop obs1 obs2 obs3;
      newsum=2*totalsum;
      Here DROP is used as a statement, so the variables stay in the PDV for the whole DATA step and can still be used or modified. They are only left out of the output data set.
