How to Check Performance of a Predictive Model

There are two main measures for assessing the performance of a predictive model:

Discrimination
Calibration

These measures are not restricted to logistic regression. They can be used for any classification technique such as decision tree, random forest, gradient boosting, support vector machine (SVM), etc. The two measures are explained below -

1. Discrimination

Discrimination refers to the ability of the model to distinguish between events and non-events.

Area under the ROC curve (AUC / C statistic)

The ROC curve plots the true positive rate (aka Sensitivity) against the false positive rate (aka 1-Specificity) across classification thresholds, and AUC is the area under that curve. Mathematically, it is calculated using the formula below -

AUC = (Concordant Percent + 0.5 * Tied Percent) / 100

Concordant : Percentage of pairs where the observation with the desired outcome (event) has a higher predicted probability than the observation without the outcome (non-event).

Discordant : Percentage of pairs where the observation with the desired outcome (event) has a lower predicted probability than the observation without the outcome (non-event).

Tied : Percentage of pairs where the observation with the desired outcome (event) has the same predicted probability as the observation without the outcome (non-event).
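The pair counting described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `concordance` and the sample labels and scores are made up, and the percentages are divided by 100 so that the AUC falls between 0 and 1.

```python
from itertools import product

def concordance(y_true, y_prob):
    """Return (concordant, discordant, tied) percentages over all event/non-event pairs."""
    events = [p for y, p in zip(y_true, y_prob) if y == 1]
    non_events = [p for y, p in zip(y_true, y_prob) if y == 0]
    n_pairs = len(events) * len(non_events)
    conc = disc = tied = 0
    # Compare every event with every non-event
    for p_event, p_non_event in product(events, non_events):
        if p_event > p_non_event:
            conc += 1
        elif p_event < p_non_event:
            disc += 1
        else:
            tied += 1
    return tuple(100.0 * count / n_pairs for count in (conc, disc, tied))

# Made-up labels (1 = event) and predicted probabilities, for illustration only
y_true = [1, 1, 1, 0, 0, 0, 0]
y_prob = [0.9, 0.6, 0.4, 0.6, 0.3, 0.2, 0.1]

conc_pct, disc_pct, tied_pct = concordance(y_true, y_prob)
auc = (conc_pct + 0.5 * tied_pct) / 100
print(f"Concordant={conc_pct:.2f}% Discordant={disc_pct:.2f}% Tied={tied_pct:.2f}% AUC={auc:.3f}")
```

The loop compares every event with every non-event, so it is only practical for small samples; it is meant to show the definition, not to be fast.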

Rules : AUC

If AUC >= 0.9, the model is considered to have outstanding discrimination. Caution : The model may be over-fitting.
If 0.8 <= AUC < 0.9, the model is considered to have excellent discrimination.
If 0.7 <= AUC < 0.8, the model is considered to have acceptable discrimination.
If AUC = 0.5, the model has no discrimination (random case).
If AUC < 0.5, the model is worse than random.


Gini (Somers' D)

It is a common measure for assessing the predictive power of a credit risk model. It measures the degree to which the model discriminates better than a model with random scores.

Somers' D = 2 * AUC - 1 or Somers' D = (Concordant Percent - Discordant Percent) / 100

As a rule of thumb, it should be greater than 0.4.
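The two formulas give the same number, which can be checked with a short Python sketch. The helper name `pair_counts` and the sample data are illustrative assumptions, not part of the original article.

```python
def pair_counts(y_true, y_prob):
    """Count concordant, discordant and tied event/non-event pairs."""
    events = [p for y, p in zip(y_true, y_prob) if y == 1]
    non_events = [p for y, p in zip(y_true, y_prob) if y == 0]
    conc = sum(pe > pn for pe in events for pn in non_events)
    disc = sum(pe < pn for pe in events for pn in non_events)
    tied = len(events) * len(non_events) - conc - disc
    return conc, disc, tied

# Made-up labels (1 = event) and predicted probabilities, for illustration only
y_true = [1, 1, 1, 0, 0, 0, 0]
y_prob = [0.9, 0.6, 0.4, 0.6, 0.3, 0.2, 0.1]

conc, disc, tied = pair_counts(y_true, y_prob)
n_pairs = conc + disc + tied

auc = (conc + 0.5 * tied) / n_pairs
gini_from_auc = 2 * auc - 1                 # Somers' D = 2 * AUC - 1
gini_from_pairs = (conc - disc) / n_pairs   # (Concordant - Discordant) as a proportion
print(gini_from_auc, gini_from_pairs)
```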

Kolmogorov-Smirnov Statistic (KS)

It is the maximum difference between the cumulative distribution of events and the cumulative distribution of non-events.

The maximum KS should occur within the top 3 deciles.
The KS statistic should be between 40 and 70.

[Figure: KS Statistics]

In the example shown in the figure, KS is at its maximum in the second decile and the KS score is 75.
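A decile-level KS calculation can be sketched in Python as follows. The event and non-event counts are invented for illustration (they are not the figures from the example above), and `ks_table` is a hypothetical helper name.

```python
def ks_table(events, non_events):
    """Return the KS value (in percentage points) for each decile.

    events / non_events: counts per decile, ordered from the highest-scored
    decile (decile 1) to the lowest-scored decile.
    """
    total_events = sum(events)
    total_non_events = sum(non_events)
    cum_events = cum_non_events = 0
    ks_by_decile = []
    for e, ne in zip(events, non_events):
        cum_events += e
        cum_non_events += ne
        cum_event_pct = 100.0 * cum_events / total_events
        cum_non_event_pct = 100.0 * cum_non_events / total_non_events
        ks_by_decile.append(abs(cum_event_pct - cum_non_event_pct))
    return ks_by_decile

# Made-up counts per decile (100 observations in each), for illustration only
events = [40, 25, 15, 8, 5, 3, 2, 1, 1, 0]
non_events = [60, 75, 85, 92, 95, 97, 98, 99, 99, 100]

ks_by_decile = ks_table(events, non_events)
ks = max(ks_by_decile)
ks_decile = ks_by_decile.index(ks) + 1  # 1-based decile where KS peaks
print(f"KS = {ks:.2f} at decile {ks_decile}")
```

With these made-up counts the maximum falls in the third decile, so both rules of thumb listed above are satisfied.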


Rank Ordering

It implies that the model should capture the highest number of events in the first decile, with the number of events then decreasing progressively in each subsequent decile. For example, decile 2 should not contain more events than decile 1.
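A rank-ordering check of this kind is easy to automate. The sketch below, with a hypothetical helper `rank_order_breaks` and made-up event counts, lists the deciles where the number of events rises instead of falling.

```python
def rank_order_breaks(events_by_decile):
    """Return the 1-based deciles whose event count exceeds that of the previous decile."""
    return [
        i + 1
        for i in range(1, len(events_by_decile))
        if events_by_decile[i] > events_by_decile[i - 1]
    ]

# Made-up event counts per decile (decile 1 = highest scores), for illustration only
good_model = [40, 25, 15, 8, 5, 3, 2, 1, 1, 0]
bad_model = [25, 40, 15, 8, 5, 3, 4, 1, 1, 0]

good_breaks = rank_order_breaks(good_model)
bad_breaks = rank_order_breaks(bad_model)
print("Breaks (good model):", good_breaks)
print("Breaks (bad model):", bad_breaks)
```

An empty list means the event counts never increase from one decile to the next.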

2. Calibration

It is a measure of how close the predicted probabilities are to the actual rate of events.

I. Hosmer and Lemeshow Test (HL)

It tests how closely the predicted probabilities agree with the actual events across groups of observations.

In the HL test, the null hypothesis states that the observed events and non-events are consistent with the predicted events and non-events. In other words, the model fits the data well.

Calculation

Calculate the estimated probability of the event for each observation
Split the data into 10 sections based on descending order of probability
Calculate the number of actual events and non-events in each section
Calculate the predicted probability of the event (Predicted Probability = 1) by averaging the probabilities in each section
Calculate the predicted probability of the non-event (Predicted Probability = 0) by subtracting Predicted Probability = 1 from 1
Calculate the expected frequency of events by multiplying the number of cases in each section by Predicted Probability = 1, and the expected frequency of non-events by multiplying by Predicted Probability = 0
Calculate the chi-square statistic from the observed (actual) and expected frequencies of events and non-events

[Figure: Hosmer Lemeshow Test]

Rule : If p-value > 0.05, the model fits the data well.
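The calculation steps above can be sketched in Python using only the standard library. Several things here are assumptions rather than part of the original article: the function names, the made-up data, the use of groups - 2 degrees of freedom (the usual choice for this test), and a closed-form chi-square tail probability that is valid only when the degrees of freedom are even (which holds for 10 sections).

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable; closed form valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2.0
    term = total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

def hosmer_lemeshow(y_true, y_prob, groups=10):
    """Return (chi-square, degrees of freedom, p-value) for the HL test."""
    # Sort by descending predicted probability, then cut into equal-sized sections
    pairs = sorted(zip(y_prob, y_true), key=lambda t: t[0], reverse=True)
    n = len(pairs)
    chi_square = 0.0
    for k in range(groups):
        section = pairs[k * n // groups:(k + 1) * n // groups]
        size = len(section)
        observed_events = sum(y for _, y in section)
        observed_non_events = size - observed_events
        mean_prob = sum(p for p, _ in section) / size      # Predicted Probability = 1
        expected_events = size * mean_prob
        expected_non_events = size * (1.0 - mean_prob)     # Predicted Probability = 0
        chi_square += (observed_events - expected_events) ** 2 / expected_events
        chi_square += (observed_non_events - expected_non_events) ** 2 / expected_non_events
    df = groups - 2  # assumption: the usual degrees of freedom for the HL test
    return chi_square, df, chi2_sf_even_df(chi_square, df)

# Made-up data: 10 sections of 10 observations each, for illustration only
section_probs = [0.95, 0.85, 0.75, 0.65, 0.55, 0.45, 0.35, 0.25, 0.15, 0.05]
section_events = [10, 8, 8, 6, 6, 4, 3, 3, 1, 1]
y_prob, y_true = [], []
for p, e in zip(section_probs, section_events):
    y_prob += [p] * 10
    y_true += [1] * e + [0] * (10 - e)

chi_square, df, p_value = hosmer_lemeshow(y_true, y_prob)
print(f"chi-square={chi_square:.4f}, df={df}, p-value={p_value:.4f}")
print("Model fits the data well" if p_value > 0.05 else "Evidence of poor fit")
```

Statistical packages may form the sections differently (for example when predicted probabilities are tied), so their results will not necessarily match this sketch exactly.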

II. Deviance and Residual Test

The null hypothesis states that the model fits the data well. In other words, the null hypothesis is that the fitted model is correct.

[Figure: Deviance and Residual Test]

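Since the p-value is greater than 0.05 for both tests, we can say the model fits the data well.

In SAS, these tests can be computed by specifying the options scale=none aggregate on the MODEL statement of PROC LOGISTIC. SAS reports them as the Deviance and Pearson goodness-of-fit statistics.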

III. Brier Score

The Brier score is an important measure of calibration. It is the mean squared difference between the predicted probability and the actual outcome.

The lower the Brier score for a set of predictions, the better the predictions are calibrated.

If the predicted probability is 1 and the event happens, then the Brier Score is 0, the best score achievable.
If the predicted probability is 1 and the event does not happen, then the Brier Score is 1, the worst score achievable.
If the predicted probability is 0.8 and the event happens, then the Brier Score is (0.8-1)^2 = 0.04.
If the predicted probability is 0.2 and the event happens, then the Brier Score is (0.2-1)^2 = 0.64.
If the predicted probability is 0.5, then the Brier Score is 0.25 regardless of whether the event happens, since (0.5-1)^2 = (0.5-0)^2 = 0.25.
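The cases above can be reproduced with a few lines of Python. This is a minimal sketch; `brier_score` is an illustrative helper name and the final set of predictions is made up.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and actual outcome (1 or 0)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

# Single predictions, matching the cases listed above
perfect = brier_score([1], [1.0])   # predicted 1, event happens
worst = brier_score([0], [1.0])     # predicted 1, event does not happen
likely = brier_score([1], [0.8])    # (0.8 - 1)^2
unlikely = brier_score([1], [0.2])  # (0.2 - 1)^2
print(perfect, worst, round(likely, 2), round(unlikely, 2))

# A made-up set of four predictions: the score is the average of the squared errors
overall = brier_score([1, 1, 0, 0], [0.8, 0.2, 0.5, 1.0])
print(round(overall, 4))
```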

When the fitstat option is specified on the SCORE statement of PROC LOGISTIC, SAS returns the Brier score and other fit statistics such as AUC, AIC, BIC, etc.

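/* Fit the model on the training data, then score the validation data */
proc logistic data=train;
   model y(event="1") = entry;
   /* fitstat requests fit statistics (including the Brier score) for the scored data */
   score data=valid out=valpred fitstat;
run;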

A complete assessment of model performance should take both discrimination and calibration into consideration. Discrimination is generally believed to be the more important of the two.


About Author:
Deepanshu Bhalla

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has over 10 years of experience in data science. During his tenure, he worked with global clients in various domains like Banking, Insurance, Private Equity, Telecom and HR.

