4 ways to calculate LAG and LEAD in SAS

Live Online Training : SAS Programming with 50+ Case Studies

- Explain Programming Concepts in Simple English
- Live Projects
- Scenario Based Questions
- Job Placement Assistance
- Get 10% off till Sept 25, 2017
- Batch starts from October 8, 2017

This tutorial demonstrates multiple ways to calculate lag and lead in SAS. In time series data, it is generally required to calculate lag and lead of one or more measured variables. Time series or longitudinal data are considered one of the most challenging data manipulation tasks. This tutorial covers various data manipulation tricks to make it easy.

What is Lag ?
Looking back some number of periods or rows.
Lag of first order - Looking back 1 observation (last value). It is denoted by lag1_value in the image shown below.
Lag of second order - Looking back 2 observations. It is denoted by lag2_value in the image shown below.
Calculate Lag in SAS

Create a Sample Data

The program below creates a sample data for demonstration.
data example;
input id value;
cards;
1 25
1 30
1 35
1 40
2 25
2 30
2 37
;
run;
 LAG Function

In SAS, the LAG function is used to compare the current value to its predecessors. If you want to calculate lag of second order, use LAG2 function. Similarly, you can use LAG3 function for measuring lag of third order.
data temp;
set example;
lag1_value = lag(value);
lag2_value = lag2(value);
run;

Calculating LAG by ID Variable

Suppose you are asked to calculate LAG of first order by a grouping variable. In this case, there is no SAS function directly available to accomplish this task. We need to apply some tricks to make it happen.

Method I : LAG and FIRST.

The program below uses BY statement and FIRST. to measure lag of first order by ID. First, we are calculating lag and then make the value missing against the first record in a group i.e. ID.
data temp;
set example;
by id;
lag_value = lag(value);
if first.id then lag_value = .;
run;
Lag by Group

Method 2 : POINT=, NOBS and _N_

The POINT= Option is used to point to a specified row number. The _N_ system variable is used to generate a serial number which are used as an index. The NOBS option is used to calculate the number of rows in a variable.
data example;
set example;
lagid = lag(id);
run;

data lags;
 if _n_ <= k then do;
 set example point = _n_;
 if lagid = id then do;
 N = _N_ - 1;
 set example point = N;
 lag_value = value;
 end;
 end;
 else lag_value = .;
 set example nobs = k;
 by id;
 drop lagid;
 run;

Method 3 : PROC SQL and FIRST.

In the program below, we are first creating a sequence of values and later we are joining two tables taking primary key as current value from the first table with predecessor value from the second table.
data example;
set example;
by id;
if first.id then N = 1;
else N + 1;
run;
proc sql;
select a.* , b.value as lag_value
from example a left join example b
on a.id = b.id and a.N = b.N + 1;
quit;

Method 4 : PROC EXPAND

PROC EXPAND is one of the most useful procedure of SAS ETS. It has multiple functions such as creating lag, lead and moving average variables by group(s), aggregation of previous k rows etc.
proc expand data= example out= t2;
convert value = lag_value /transformout = (lag 1);
by id;
run;
It requires license of SAS ETS package.

What is Lead?
Looking ahead some number of rows or periods.
Lead Value of First Order - Looking ahead one observation of  a variable (next row).
Lead Value of Second Order - Looking ahead two observations of  a variable (next to next row).

Lead Value

Calculating Lead in SAS

In SAS, there is no direct function for calculating LEAD. I feel the same there should be a SAS function for it :-) It can be calculated with the following logic :

Create a sequence of numbers and then sort the sequence by descending order. Then we calculate lag of the variable for which we need to calculate lead. At last, we sort the data by sequence ID.

data temp;
set example;
x + 1;
run;

proc sort data = temp;
by descending x;
run;

data temp2;
set temp;
lead_value = lag(value);
run;

proc sort data = temp2;
by x;
run;


Method 2 : POINT=, _N_ and NOBS

The NOBS option counts the number of rows in a variable. The _N_+ 1 creates a sequence of numbers start from 2 to (number of records + 1). The POINT= points to a row when the sequence of numbers are less than or equal to number of rows.
data temp;
_N_+1;
if _N_ <= k then do;
set example point=_N_;
lead_value = value;
end;
else lead_value = .;
set example nobs=k;
proc print noobs;
run;
Calculating Lead by ID Variable

Method I : FIRST. and PROC SORT

In this case, we are applying the same trick that we have used above to calculate lead. The only difference is we need to incorporate grouping variable to calculate lead by a grouping variable (ID).
data temp;
set example;
by id;
if first.id then x =1;
else x + 1;
run;

proc sort data = temp;
by id descending x;
run;

data temp2;
set temp;
by id descending x;
l = lag(x);
lead_val = lag(value);
if first.id then do;
l = .;
lead_val = .;
end;
run;

proc sort data = temp2;
by id x;
run;
Lead Variables

Method II : FIRST. and PROC SQL

In the program below, we are performing self joins on next value with the current value.
data example;
set example;
by id;
if first.id then N = 1;
else N + 1;
run;

proc sql;
select a.* , b.value as lag_value
from example a left join example b
on a.id = b.id and a.N + 1 = b.N;
quit;

Method III : POINT= and NOBS

In the program below, we have used the same POINT= and NOBS trick that we have used in the above article.
data example;
 set example;
 lagid = lag(id);
run;  
data leads;
 _n_ + 1;
 if _n_ <= n then do;
 set example point=_n_;
 if lagid = id then do;
 y = value;
 end;
 end;
 else y = .;
 set example nobs = n;
 by id;
 drop lagid;
 proc print noobs;
 run;

Method IV : PROC EXPAND

PROC EXPAND can be used to calculate leads by a grouping variable. It can also calculate moving averages.
proc expand data= example out= t2;
convert value = lead_value /transformout = (lead 1);
by id;
run;
The above method works only if you have a license of SAS ETS package.

SAS Tutorials : 100 Free SAS Tutorials

About Author:

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has close to 7 years of experience in data science and predictive modeling. During his tenure, he has worked with global clients in various domains like retail and commercial banking, Telecom, HR and Automotive.


While I love having friends who agree, I only learn from those who don't.

Let's Get Connected: Email | LinkedIn

Get Free Email Updates :
*Please confirm your email address by clicking on the link sent to your Email*

Related Posts:

0 Response to "4 ways to calculate LAG and LEAD in SAS"

Post a Comment

Next → ← Prev