SAS : Delete empty rows in SAS

Deepanshu Bhalla 16 Comments
Suppose you want to delete empty rows from a dataset in SAS. It generally happens when we import data from external sources such as excel / csv files. It loads additional rows that are totally blank. Sometimes blank observations also affect the desired output so it's necessary to check missing cases and treat them.

Sample Dataset
The sample dataset looks like below. In the dataset, we have four variables - 1 character and 3 numeric. It would be used further in the example to demonstrate how to remove empty rows.
NameScore1Score2Score3
Sam776866
Deepanshu50
89




Shane557889
Roger509786




Priya886893

Create a SAS dataset

The following program creates a sample dataset in SAS.
data outdata;
length Name $12.;
input Name $ Score1 Score2 Score3 ;
infile datalines missover;
datalines;
Sam 77 68 66
Deepanshu 50 . 89
Shane 55 78 89
Roger 50 97 86
Priya 88 68 93
;
run;

Method I : Remove rows where all variables having missing / blank values
options missing = ' ';
data readin;
   set outdata;
   if missing(cats(of _all_)) then delete;
run;

Notes :
1. The MISSING= system option is used to display the missing values as a single space rather than as the default period (.) options missing = ' ';

2. The CATS function concatenates the values. It also removes leading and trailing blanks. cats(of _all_) - Concatenate all the variables

3.  missing(cats(of _all_)) - Identifies all the rows in which missing values exist in all the variables.
Output

In the output dataset, we have 5 rows. One row is deleted from the original dataset.
SAS : Delete empty rows from a dataset

Method 2 

In this program, NMISS function checks the numeric of missing numeric values and CMISS checks the number of missing character values. In this code, we are telling SAS to delete records wherein both the character and numeric values are missing.
data readin;
set outdata;
if nmiss( of _numeric_ ) and cmiss(of _character_) then delete ;
run;

Method 3

In the code below, we are using COALESCE and COALESCEC functions to return non-empty rows. These functions return first non-missing value. If all values are missing, it returns missing. Later we are checking whether these functions return missing or not.
data readin;
set outdata;
if missing(coalescec(of _character_)) and missing(coalesce(of _numeric_)) then delete;
run;

Example : Delete rows where any variable has missing values
data readin;
set outdata;
if nmiss(of _numeric_) OR  cmiss(of _character_) > 0 then delete;
run;
In this case, we are using OR operator to check if any of the variable has missing values. It returns 4 observations. Check out the output below -

Output

Practice Questions - Try it yourself

Q1. Remove records wherein all numeric columns having empty rows.
Q2. Remove records wherein all character variables having empty rows.

Post your answer in the comment box below -
Related Posts
Spread the Word!
Share
About Author:
Deepanshu Bhalla

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has over 10 years of experience in data science. During his tenure, he worked with global clients in various domains like Banking, Insurance, Private Equity, Telecom and HR.

16 Responses to "SAS : Delete empty rows in SAS"
  1. Because Name is your 'key', your code could be written more succinctly as:
    data readin;
    set outdata;
    where not missing( Name);
    run;

    But, if you got records with a Name, but no Scores, you could write it as:
    data readin;
    set outdata;
    where n( of Score1-Score3) > 0 ;
    run;

    Cheers, :)

    ReplyDelete
  2. But if i want to delete rows only when missing values exist in all the variables ;)

    ReplyDelete
  3. Anytime you change the Options, please do take care to change it back so that later code doesn't change its behavior.

    ReplyDelete
  4. options missing = ' ';
    data readin;
    set outdata;
    if missing(cats(of _all_)) then delete;
    run;
    when we run this code i am getting an error that _all_ is uninitialise

    ReplyDelete
    Replies
    1. It is working fine at my end. Which version of SAS you are using?

      Delete
  5. my question--

    data abc;
    input name$ sex$ age;
    datalines;
    tushar m 26
    dhiru m 27
    twinkle f 20
    ankita f 22
    goldi f 24
    nikhil m 25
    ;

    i want to arrange observations by sex alternatively like 1st m 2nd f 3rd m 4th f
    5th m 6th f.
    how i do it ????
    plz tell me ans.

    ReplyDelete
    Replies
    1. This comment has been removed by the author.

      Delete
    2. proc sort data = abc;
      by sex age;
      run;

      data abc1;
      set abc;
      retain counter;
      if first.sex then counter = 1;else counter = counter + 1;
      by sex;
      ranking = strip(input(counter,z9.) || sex);
      drop counter;
      run;

      Delete
    3. This comment has been removed by the author.

      Delete
    4. proc sort data = abc;
      by sex age;
      run;

      data abc1;
      set abc;
      by sex;
      retain counter;
      if first.sex then counter = 1;else counter = counter + 1;
      run;

      proc sort data = abc1;
      by descending sex ;
      run;

      proc sort data = abc1;
      by counter;
      run;

      @ashwani

      Delete
  6. To delete blank rows in any dataset-
    data a;
    set outdata;
    if cmiss(of _all_) = 4 then delete; // 4 is number of variables in data
    run;

    ReplyDelete
  7. options missing = ' ';
    data readin;
    set outdata;
    if missing(cats(of _numeric_)) then delete;
    run;

    this is to delete rows where all numeric columns have missing values.

    please comment if the answer is right or wrong

    ReplyDelete
  8. What can I do if want to delete all columns from dataset having blank values

    ReplyDelete
  9. I want to know how to reduce the length of particular column based on the maximum length of the value in that particular column. e.g if i assigned column a as 200 length but i have maximum length of the value present in it as 20 .

    ReplyDelete
    Replies
    1. The below query will give the max length of the variable data.
      proc sql; select max(length(forename)) from have; quit;

      Delete
  10. data readin;
    set outdata;
    if missing(coalescec(of _char_)) then delete;
    run;

    data readin1;
    set outdata;
    if missing(coalesce(of _numeric_)) then delete;
    run;

    ReplyDelete
Next → ← Prev