This tutorial explains how to use retain statement in SAS. In SAS, it's a very easy and useful way to retain values with RETAIN statement.
Create Sample Data
The following program creates a sample data for demonstration -
The RETAIN statement simply copies retaining values by telling the SAS not to reset the variables to missing at the beginning of each iteration of the DATA step. If you would not use retain statement then SAS would return missing at the beginning of each iteration.
Generate Serial Number
Suppose you need to generate a serial number (or row index number) with data step.
Output Dataset
Cumulative Score
Suppose you need to calculate cumulative score. In financial data, we generally need to calculate cumulative score year to date.
Generate Serial Number by Group
Suppose you have a grouping variable say "region" and you need to generate a row index number by region.
Number of Unique Observations
The number of unique rows by a group can easily be calculated with PROC FREQ and PROC MEANS. The following program explains how we can calculate number of observations in a categorical variable with Data Step.
Suppose you have more than 1 grouping variable
Create Sample Data
The following program creates a sample data for demonstration -
data abcd;Uses of RETAIN Statement
input x y;
cards;
1 25
1 28
1 27
2 23
2 35
2 34
3 25
3 29
;
run;
The RETAIN statement simply copies retaining values by telling the SAS not to reset the variables to missing at the beginning of each iteration of the DATA step. If you would not use retain statement then SAS would return missing at the beginning of each iteration.
The retain statement keeps the value once assigned.
Generate Serial Number
Suppose you need to generate a serial number (or row index number) with data step.
data aaa;We can retain implicitly by using the +1 notation.
set abcd;
retain z 0;
z = z + 1;
run;
data aaa;
set abcd;
z + 1;
run;
Output Dataset
![]() |
Output Data Set |
Suppose you need to calculate cumulative score. In financial data, we generally need to calculate cumulative score year to date.
data aaa;
set abcd;
retain z 0;
z = z + y;
run;
![]() |
Output Data Set |
Suppose you have a grouping variable say "region" and you need to generate a row index number by region.
proc sort data = abcd;
by x;
run;
data aaa;
set abcd;
retain z;
if first.x then z = 1;
else z = z + 1;
by x;
run;
![]() |
SAS : Retain Statement |
Cumulative Score by Group
Suppose you need to calculate cumulative sale by product categories.
Suppose you need to calculate cumulative sale by product categories.
data aaa1;
set aaa;
retain z1;
if first.x then z1 = y;
else z1 = z1 + y;
by x;
run;
The number of unique rows by a group can easily be calculated with PROC FREQ and PROC MEANS. The following program explains how we can calculate number of observations in a categorical variable with Data Step.
data aaa2;
set abcd (drop = y);
retain z;
if first.x then z = 1;
else z = z + 1;
by x;
if last.x then output;
run;
![]() |
Unique Count |
Suppose you have more than 1 grouping variable
data temp;
input ID ID1 Score;
cards;
1 1 25
1 1 26
1 2 27
1 2 29
2 1 28
2 1 29
2 2 31
;
run;
data temp2;When you have more than 1 grouping variable, we can use multiple FIRST. statements with OR operator to generate serial numbers.
set temp;
by ID ID1;
if first.ID or first.ID1 then N = 1;
else N+1;
proc print;
run;
Hi,
ReplyDeleteThank you for making it so easy to understand took me just 1 min to understand the coding...great work brother.
Also can you please help me how can i prepare better for my base sas exam as it due in dec.
Check out this link - Base SAS Certification Questions and Answers
DeleteCan you explain...how to find out the middlest observation in a data set?????
DeletePlease....
Hi,
ReplyDeletethank you for your clear information.
please can you give me the answer for this question.
how can we find out cumulative totals for salary on each by group of gender?
Check the code below -
Deletedata abcd;
input Gender$ Salary;
cards;
M 25
M 26
M 27
F 23
F 24
F 25
;
run;
proc sort data = abcd;
by gender;
run;
data temp;
set abcd;
by gender;
retain tot ;
if first.gender then tot = salary;
else tot = tot + salary;
if last.gender;
drop salary;
run;
Hope it helps!
data ds;
Deleteset sashelp.class;
run;
proc sort data=ds;
by sex;
run;
data ds;
set ds;
by sex;
retain cum_height 0 cum_weight 0;
if first.sex then cum_height=height;
else cum_height=cum_height+height;
if first.sex then cum_weight=weight;
else cum_weight=cum_weight+weight;
run;
i think this may help you
This is not what i want.
ReplyDeletePlease post your sample data with desired output. It's hard to guess the requirement. Thanks!
DeleteHi Deepanshu,
ReplyDeleteCould you please share Advance sas certification questions and answers as well..
Thanks!!
Can you tell me
ReplyDeletehow to create data lines in missing values?
seems the sort is not a must.
ReplyDeleteHello
ReplyDeleteI am doing simpler by considering only one date to sort between every two index equal 1 for each patient
thanks for any suggestion to get the last colomn (end_period =minmum of dates) automatlically
Obs ENROLID index Switching_date end_periode
1 1110002 1 06/20/2013
2 1110002
3 1110002 06/20/2013
4 1110002 06/27/2013
5 1110002 07/22/2013
6 1110002
7 1110002 08/27/2013
8 1110002 09/12/2013
9 1110002
10 1110002 1 02/18/2014
11 1110002 02/18/2014
12 1110002
13 1110002 04/23/2014
14 1110002 04/23/2014
15 1110002
16 1110002 06/13/2016
17 1304303 1 11/23/2012
18 1304303 11/23/2012
19 1304303
20 1304303 1 01/07/2015 07/22/2013
21 1304303 1 07/22/2013 03/03/2018
22 1304303 03/03/2018
23 1304303 06/13/2018
24 1304303 08/25/2018