Understand Gain and Lift Charts

Gain and lift charts are used to evaluate the performance of a classification model. They measure how much better one can expect to do with the predictive model than without a model. They are very popular metrics in marketing analytics, but they are not restricted to marketing: they can also be used in other domains such as risk modeling and supply chain analytics. They also help to find the best predictive model among multiple challenger models. In this tutorial, we will see how gain and lift metrics are calculated, along with their interpretation.

Gain / Lift Analysis

Randomly split the data into two samples: a 70% training sample and a 30% validation sample.
Score the validation sample (compute the predicted probability) using the response model under consideration.
Rank the scored file in descending order of estimated probability.
Split the ranked file into 10 sections (deciles).
Count the number of observations in each decile.
Count the number of actual events in each decile.
Calculate the cumulative number of actual events up to each decile.
Calculate the cumulative percentage of actual events up to each decile. This is called the gain score.
Divide the gain score by the cumulative % of data used up to that decile to get the cumulative lift. For example, in the second decile, divide the gain score by 20.
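
The ranking and decile steps above can be sketched in plain Python. This is a minimal illustration, not the article's Excel template or R function: the function name `gain_lift_table`, its parameters, and the toy data are all hypothetical, and it assumes the validation sample has already been scored.

```python
def gain_lift_table(actuals, scores, n_bins=10):
    """Build a cumulative gain/lift table from 0/1 outcomes and model scores.

    `actuals` holds 1 for an event and 0 otherwise; `scores` holds the
    predicted probabilities for the same records (hypothetical inputs).
    """
    if len(actuals) != len(scores):
        raise ValueError("actuals and scores must have the same length")
    n = len(actuals)
    if n < n_bins:
        raise ValueError("need at least one record per bin")
    total_events = sum(actuals)
    if total_events == 0:
        raise ValueError("at least one event is needed to compute gain")

    # Rank records by predicted probability, highest first.
    ranked = sorted(zip(scores, actuals), key=lambda pair: pair[0], reverse=True)

    rows = []
    cum_events = 0
    for b in range(1, n_bins + 1):
        # Integer boundaries keep bin sizes within one record of each other.
        start = (b - 1) * n // n_bins
        end = b * n // n_bins
        events = sum(actual for _, actual in ranked[start:end])
        cum_events += events
        gain = 100.0 * cum_events / total_events  # cumulative % of events captured
        pct_data = 100.0 * end / n                # cumulative % of records used
        rows.append({
            "decile": b,
            "observations": end - start,
            "events": events,
            "cum_events": cum_events,
            "gain": gain,
            "cum_lift": gain / pct_data,
        })
    return rows


# Toy validation sample (made up for illustration): 20 records, 5 events.
scores = [1 - i / 20 for i in range(20)]
actuals = [1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

table = gain_lift_table(actuals, scores)
for row in table:
    print(row["decile"], row["events"], row["cum_events"],
          round(row["gain"], 1), round(row["cum_lift"], 2))
```

In the last decile the gain always reaches 100% and the cumulative lift falls to 1, because by then the whole file has been selected.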

Gain and Lift Table

Gain

Gain at a given decile level is the ratio of the cumulative number of targets (events) up to that decile to the total number of targets (events) in the entire data set.
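
Written as a formula, this definition could look as follows. The symbols are assumptions introduced here for illustration: $e_i$ is the number of events in decile $i$ and $k$ is the decile of interest.

```latex
\text{Gain}_k \;=\; \frac{\sum_{i=1}^{k} e_i}{\sum_{i=1}^{10} e_i} \times 100\%
```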

Interpretation:

Gain is the % of targets (events) covered at a given decile level. For example, 80% of targets are covered in the top 20% of data ranked by the model. In the case of a propensity-to-buy model, we can say that we can identify and target 80% of the customers who are likely to buy the product by sending email to just 20% of total customers.

Lift

It measures how much better one can expect to do with the predictive model than without a model. It is the ratio of gain % to the random expectation % at a given decile level. The random expectation at the xth decile is (x × 10)%; for example, it is 20% at the second decile.
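
As a formula, with decile $k$ covering the top $10k\%$ of the ranked records (the notation is an assumption introduced here, not the article's):

```latex
\text{Cumulative Lift}_k \;=\; \frac{\text{Gain}_k\ (\%)}{10k\ (\%)}
```

At $k = 10$ the gain is 100% and the random expectation is also 100%, so the cumulative lift is 1.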

Interpretation:

The cumulative lift (Cum Lift) of 4.03 for the top two deciles means that when selecting 20% of the records based on the model, one can expect to find 4.03 times as many targets (events) as by randomly selecting 20% of the file without a model.
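
To make the arithmetic concrete, here is a small sketch that turns a cumulative lift into expected event counts. The helper `events_captured` and the total of 1,000 events are hypothetical; only the lift of 4.03 at 20% of the file comes from the text above.

```python
def events_captured(cum_lift, pct_selected, total_events):
    """Return (events expected at random, events expected with the model)."""
    # Random selection of p% of the file is expected to find p% of the events.
    random_expected = total_events * pct_selected / 100.0
    # The model finds cum_lift times as many events in the same share of the file.
    model_expected = cum_lift * random_expected
    return random_expected, model_expected


total_events = 1000  # assumed total, for illustration only
random_expected, model_expected = events_captured(4.03, 20, total_events)
gain_pct = 100.0 * model_expected / total_events

print(round(random_expected), round(model_expected), round(gain_pct, 1))
```

Under that assumed total, random selection would be expected to find about 200 events and the model about 806, which is a gain of 4.03 × 20% = 80.6%.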

Gain and Lift Chart

Excel Template : Gain and Lift Charts

R Function : Gain and Lift Charts

About Author:
Deepanshu Bhalla

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has over 10 years of experience in data science. During his tenure, he worked with global clients in various domains like Banking, Insurance, Private Equity, Telecom and HR.


14 Responses to "Understand Gain and Lift Charts"

Anonymous, March 20, 2015 at 9:10 AM
Very helpful, thank you. But I disagree with your interpretation, however. It looks like the cumulative lift for the first two deciles is actually 4.03x the random data, and not 2.35x, correct? Or am I misinterpreting something?
Anonymous, May 6, 2015 at 1:20 AM
Excellent overview. Thank you.
Anonymous, August 20, 2015 at 11:54 PM
Thank you Deepanshu for the detailed explanation.
Unknown, September 6, 2015 at 10:28 AM
Thanks alot Deepanshu.
Unknown, February 3, 2016 at 8:27 AM
Very well summarized. Thank you.
amrita, March 1, 2016 at 3:16 AM
Thanks for the post. Can you please also share how we can plot this in SAS?
Aveek, July 15, 2016 at 12:28 AM
Thank You Deepanshu, where would i be without you ...
Anonymous, December 17, 2016 at 7:01 AM
What is lift vs cumulative lift ?
Unknown, May 14, 2017 at 6:33 AM
Thanks Deepanshu. your article helped me to understand gain and lift charts clearly.
Anonymous, April 25, 2018 at 8:21 AM
is anyone of them multiplicative? or both are additive?
Unknown, September 10, 2020 at 10:03 AM
Thank you so much. it is really helpful.
Mohamed Niyaz, August 10, 2021 at 4:28 AM
This is really helpful, however I had a doubt, please correct me if I am wrong

The random model's positive expectation of xth decile should be calculated as x*10% right
Ex: 2nd decile, random expectation should be 20% of total positive responses

Please pardon if this doesn't seems correct