4 ways to calculate LAG and LEAD in SAS

Deepanshu Bhalla Add Comment

This tutorial demonstrates various ways to calculate lag and lead in SAS. In time series data, it is generally required to calculate lag and lead of one or more measured variables.

Time series or longitudinal data are considered one of the most challenging data manipulation tasks. This tutorial covers various data manipulation tricks to make it easy.

What is Lag?

Looking back some number of periods or rows.

Lag of first order - Looking back 1 observation (last value). It is denoted by lag1_value in the image shown below.

Lag of second order - Looking back 2 observations. It is denoted by lag2_value in the image shown below.

Calculate Lag in SAS
Calculate Lag in SAS
Create a Sample Data

The program below creates a sample data for demonstration.

data example;
input id value;
cards;
1 25
1 30
1 35
1 40
2 25
2 30
2 37
;
run;
LAG Function

In SAS, the LAG function is used to compare the current value to its predecessors. If you want to calculate lag of second order, use LAG2 function. Similarly, you can use LAG3 function for measuring lag of third order.

data temp;
set example;
lag1_value = lag(value);
lag2_value = lag2(value);
run;
Calculating LAG by ID Variable

Suppose you are asked to calculate LAG of first order by a grouping variable. In this case, there is no SAS function directly available to accomplish this task. We need to apply some tricks to make it happen.

Method I : LAG and FIRST.

The program below uses BY statement and FIRST. to measure lag of first order by ID. First, we are calculating lag and then make the value missing against the first record in a group i.e. ID.

data temp;
set example;
by id;
lag_value = lag(value);
if first.id then lag_value = .;
run;
Calculate Lag by Group in SAS
Lag by Group
Method 2 : POINT=, NOBS and _N_

The POINT= Option is used to point to a specified row number. The _N_ system variable is used to generate a serial number which are used as an index. The NOBSoption is used to calculate the number of rows in a variable.

data example;
set example;
lagid = lag(id);
run;

data lags;
if _n_ <= k then do;
set example point = _n_;
if lagid = id then do;
N = _N_ - 1;
set example point = N;
lag_value = value;
end;
end;
else lag_value = .;
set example nobs = k;
by id;
drop lagid;
run;
Method 3 : PROC SQL and FIRST.

In the program below, we are first creating a sequence of values and later we are joining two tables taking primary key as current value from the first table with predecessor value from the second table.

data example;
set example;
by id;
if first.id then N = 1;
else N + 1;
run;
proc sql;
select a.* , b.value as lag_value
from example a left join example b
on a.id = b.id and a.N = b.N + 1;
quit;
Method 4 : PROC EXPAND

PROC EXPAND is one of the most useful procedure of SAS ETS. It has multiple functions such as creating lag, lead and moving average variables by group(s), aggregation of previous k rows etc.

proc expand data= example out= t2;
convert value = lag_value /transformout = (lag 1);
by id;
run;

It requires license of SAS ETS package.

What is Lead?

Looking ahead some number of rows or periods.

Lead Value of First Order - Looking ahead one observation of a variable (next row).

Lead Value of Second Order -Looking ahead two observations of a variable (next to next row).

Lead in SAS
Lead Value
Calculating Lead in SAS

In SAS, there is no direct function for calculating LEAD. I feel the same there should be a SAS function for it :-) It can be calculated with the following logic :

Create a sequence of numbers and then sort the sequence by descending order. Then we calculate lag of the variable for which we need to calculate lead. At last, we sort the data by sequence ID.

data temp;
set example;
x + 1;
run;
proc sort data = temp;
by descending x;
run;

data temp2;
set temp;
lead_value = lag(value);
run;

proc sort data = temp2;
by x;
run;
Method 2 : POINT=, _N_ and NOBS

The NOBS option counts the number of rows in a variable. The _N_+ 1 creates a sequence of numbers start from 2 to (number of records+ 1).The POINT= points to a row when the sequence of numbers are less than or equal to number of rows.

data temp;
_N_+1;
if _N_ <= k then do;
set example point=_N_;
lead_value = value;
end;
else lead_value = .;
set example nobs=k;
proc print noobs;
run;
Calculating Lead by ID Variable

In this section, we will cover how to calculate lead by group.

Method I : FIRST. and PROC SORT

In this case, we are applying the same trick that we have used above to calculate lead. The only difference is we need to incorporate grouping variable to calculate lead by a grouping variable (ID).

data temp;
set example;
by id;
if first.id then x =1;
else x + 1;
run;

proc sort data = temp;
by id descending x;
run;

data temp2;
set temp;
by id descending x;
l = lag(x);
lead_val = lag(value);
if first.id then do;
l = .;
lead_val = .;
end;
run;

proc sort data = temp2;
by id x;
run;
SAS : Lead by Group
Lead Variables
Method II : FIRST. and PROC SQL

In the program below, we are performing self joins on next value with the current value.

data example;
set example;
by id;
if first.id then N = 1;
else N + 1;
run;

proc sql;
select a.* , b.value as lag_value
from example a left join example b
on a.id = b.id and a.N + 1 = b.N;
quit;
Method III : POINT= and NOBS

In the program below, we have used the same POINT= and NOBStrick that we have used in the above article.

data example;
set example;
lagid = lag(id);
run; 
data leads;
_n_ + 1;
if _n_ <= n then do;
set example point=_n_;
if lagid = id then do;
y = value;
end;
end;
else y = .;
set example nobs = n;
by id;
drop lagid;
proc print noobs;
run;
Method IV : PROC EXPAND

PROC EXPAND can be used to calculate leads by a grouping variable. It can also calculate moving averages.

proc expand data= example out= t2;
convert value = lead_value /transformout = (lead 1);
by id;
run;

The above method works only if you have a license of SAS ETS package.

Related Posts
Spread the Word!
Share
About Author:
Deepanshu Bhalla

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has over 10 years of experience in data science. During his tenure, he worked with global clients in various domains like Banking, Insurance, Private Equity, Telecom and HR.

Post Comment 0 Response to "4 ways to calculate LAG and LEAD in SAS"
Next → ← Prev