This tutorial demonstrates multiple ways to calculate lag and lead in SAS. In time series data, it is generally required to calculate lag and lead of one or more measured variables. Time series or longitudinal data are considered one of the most challenging data manipulation tasks. This tutorial covers various data manipulation tricks to make it easy.

The program below creates a sample data for demonstration.

In SAS, the

Suppose you are asked to calculate LAG of first order by a grouping variable. In this case, there is no SAS function directly available to accomplish this task. We need to apply some tricks to make it happen.

The program below uses BY statement and FIRST. to measure lag of first order by ID. First, we are calculating lag and then make the value missing against the first record in a group i.e. ID.

The POINT= Option is used to point to a specified row number. The _N_ system variable is used to generate a serial number which are used as an index. The NOBS

In the program below, we are first creating a sequence of values and later we are joining two tables taking primary key as current value from the first table with predecessor value from the second table.

PROC EXPAND is one of the most useful procedure of SAS ETS. It has multiple functions such as creating lag, lead and moving average variables by group(s), aggregation of previous k rows etc.

In SAS, there is no direct function for calculating LEAD. I feel the same there should be a SAS function for it :-) It can be calculated with the following logic :

Create a sequence of numbers and then sort the sequence by descending order. Then we calculate lag of the variable for which we need to calculate lead. At last, we sort the data by sequence ID.

The NOBS option counts the number of rows in a variable. The _N_+ 1 creates a sequence of numbers start from 2 to (number of records + 1). The POINT= points to a row when the sequence of numbers are less than or equal to number of rows.

In this case, we are applying the same trick that we have used above to calculate lead. The only difference is we need to incorporate grouping variable to calculate lead by a grouping variable (ID).

In the program below, we are performing self joins on next value with the current value.

In the program below, we have used the same POINT= and NOBS trick that we have used in the above article.

**What is Lag ?**

Looking back some number of periods or rows.

**Lag of first order -**Looking back 1 observation (last value). It is denoted by lag1_value in the image shown below.**Lag of second order -**Looking back 2 observations. It is denoted by lag2_value in the image shown below.Calculate Lag in SAS |

**Create a Sample Data**The program below creates a sample data for demonstration.

data example;

input id value;

cards;

1 25

1 30

1 35

1 40

2 25

2 30

2 37

;

run;

**LAG Function**In SAS, the

**LAG function**is used to compare the current value to its predecessors. If you want to calculate lag of second order, use**LAG2 function.**Similarly, you can use**LAG3 function**for measuring lag of third order.data temp;

set example;

lag1_value = lag(value);

lag2_value = lag2(value);

run;

**Calculating LAG by ID Variable**Suppose you are asked to calculate LAG of first order by a grouping variable. In this case, there is no SAS function directly available to accomplish this task. We need to apply some tricks to make it happen.

**Method I : LAG and FIRST.**The program below uses BY statement and FIRST. to measure lag of first order by ID. First, we are calculating lag and then make the value missing against the first record in a group i.e. ID.

data temp;

set example;

by id;

lag_value = lag(value);

if first.id then lag_value = .;

run;

Lag by Group |

**Method 2 : POINT=, NOBS and _N_**The POINT= Option is used to point to a specified row number. The _N_ system variable is used to generate a serial number which are used as an index. The NOBS

**option is used to calculate the number of rows in a variable.**data example;

set example;

lagid = lag(id);

run;

data lags;

if _n_ <= k then do;

set example point = _n_;

if lagid = id then do;

N = _N_ - 1;

set example point = N;

lag_value = value;

end;

end;

else lag_value = .;

set example nobs = k;

by id;

drop lagid;

run;

**Method 3 : PROC SQL and FIRST.**

In the program below, we are first creating a sequence of values and later we are joining two tables taking primary key as current value from the first table with predecessor value from the second table.

data example;

set example;

by id;

if first.id then N = 1;

else N + 1;

run;

proc sql;

select a.* , b.value as lag_value

from example a left join example b

on a.id = b.id and a.N = b.N + 1;

quit;

**Method 4 : PROC EXPAND**

PROC EXPAND is one of the most useful procedure of SAS ETS. It has multiple functions such as creating lag, lead and moving average variables by group(s), aggregation of previous k rows etc.

proc expand data= example out= t2;

convert value = lag_value /transformout = (lag 1);

by id;

run;

**It requires license of SAS ETS package.**

**What is Lead?**

Looking ahead some number of rows or periods.

**Lead Value of First Order -**Looking ahead one observation of a variable (next row).

**Lead Value of Second Order -**Looking ahead two observations of a variable (next to next row).

Lead Value |

**Calculating Lead in SAS**

In SAS, there is no direct function for calculating LEAD. I feel the same there should be a SAS function for it :-) It can be calculated with the following logic :

Create a sequence of numbers and then sort the sequence by descending order. Then we calculate lag of the variable for which we need to calculate lead. At last, we sort the data by sequence ID.

data temp;

set example;

x + 1;

run;

proc sort data = temp;

by descending x;

run;

data temp2;

set temp;

lead_value = lag(value);

run;

proc sort data = temp2;

by x;

run;

**Method 2 : POINT=, _N_ and NOBS**

The NOBS option counts the number of rows in a variable. The _N_+ 1 creates a sequence of numbers start from 2 to (number of records + 1). The POINT= points to a row when the sequence of numbers are less than or equal to number of rows.

data temp;

_N_+1;

if _N_ <= k then do;

set example point=_N_;

lead_value = value;

end;

else lead_value = .;

set example nobs=k;

proc print noobs;

run;

**Calculating Lead by ID Variable**

**Method I : FIRST. and PROC SORT**

In this case, we are applying the same trick that we have used above to calculate lead. The only difference is we need to incorporate grouping variable to calculate lead by a grouping variable (ID).

data temp;

set example;

by id;

if first.id then x =1;

else x + 1;

run;

proc sort data = temp;

by id descending x;

run;

data temp2;

set temp;

by id descending x;

l = lag(x);

lead_val = lag(value);

if first.id then do;

l = .;

lead_val = .;

end;

run;

proc sort data = temp2;

by id x;

run;

Lead Variables |

**Method II : FIRST. and PROC SQL**

In the program below, we are performing self joins on next value with the current value.

data example;

set example;

by id;

if first.id then N = 1;

else N + 1;

run;

proc sql;

select a.* , b.value as lag_value

from example a left join example b

on a.id = b.id and a.N + 1 = b.N;

quit;

**Method III : POINT= and NOBS**

In the program below, we have used the same POINT= and NOBS trick that we have used in the above article.

data example;

set example;

lagid = lag(id);

run;

data leads;

_n_ + 1;

if _n_ <= n then do;

set example point=_n_;

if lagid = id then do;

y = value;

end;

end;

else y = .;

set example nobs = n;

by id;

drop lagid;

proc print noobs;

run;

**Method IV : PROC EXPAND**

PROC EXPAND can be used to calculate leads by a grouping variable. It can also calculate moving averages.

proc expand data= example out= t2;

convert value = lead_value /transformout = (lead 1);

by id;

run;

*The above method works only if you have a license of SAS ETS package.*
## Post a Comment