# Calculate Hosmer Lemeshow (HL) Test with Excel

Hosmer and Lemeshow Test (HL)

The Hosmer-Lemeshow (HL) test is a goodness-of-fit test for binary classification models such as logistic regression. It measures how close the predicted probabilities are to the actual rate of events within groups of observations.

In the HL test, the null hypothesis states that the observed events and non-events are consistent with the predicted events and non-events. In other words, the null hypothesis is that the model fits the data well.
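For reference, the statistic that the steps in this post build up is usually written as shown here. The symbols are notation introduced for this sketch, not names used in the spreadsheet: $G$ is the number of groups, $n_g$ the number of cases in group $g$, $\bar{p}_g$ the average predicted probability in that group, $O_{1g}$ and $O_{0g}$ the observed counts of events and non-events, and $E_{1g}$ and $E_{0g}$ the corresponding expected counts.

$$
E_{1g} = n_g \, \bar{p}_g, \qquad E_{0g} = n_g \, (1 - \bar{p}_g)
$$

$$
\chi^2_{HL} = \sum_{g=1}^{G} \left[ \frac{(O_{1g} - E_{1g})^2}{E_{1g}} + \frac{(O_{0g} - E_{0g})^2}{E_{0g}} \right]
$$

The statistic is compared against a chi-square distribution with $G - 2$ degrees of freedom.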

Steps to Calculate Hosmer Lemeshow Test
1. You need two variables: the dependent variable and the predicted probability. Fit a binary classification model (such as logistic regression) to your data and get the estimated probability for each observation.

2. Split the data into 10 groups based on the ascending order of the predicted probabilities.

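In Excel, let's say the probability values are in cells C3:C4692 (the formulas in the later steps also assume that the dependent variable, coded 1 for an event and 0 for a non-event, is in cells B3:B4692). Create a new variable called "Group", enter the following formula in cell D3, and then paste it down to the last observation.

`=ROUND(RANK(C3,$C$3:$C$4692)/COUNT($C$3:$C$4692)*10,0)+1`

Note that RANK ranks in descending order by default, so group 1 holds the highest probabilities, and that ROUND can produce group labels from 1 to 11, with the first and last groups about half the size of the others. If you want ten roughly equal groups in ascending order, one alternative is `=ROUNDUP(RANK(C3,$C$3:$C$4692,1)/COUNT($C$3:$C$4692)*10,0)`.

3. Calculate the number of actual events and non-events in each group.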

In Excel, list the unique values of column D (the group numbers) starting in cell F4.

In cell G4, enter the following formula to count the events in the group, and then paste it down.

`=COUNTIFS($D$3:$D$4692,F4,$B$3:$B$4692,1)`

In cell H4, enter the following formula to count the non-events in the group, and then paste it down.

`=COUNTIFS($D$3:$D$4692,F4,$B$3:$B$4692,0)`

4. Calculate the predicted probability of an event (Predicted Probability = 1) by averaging the probabilities in each group. In cell I4, enter the following formula and then paste it down.

`=AVERAGEIF($D$3:$D$4692,F4,$C$3:$C$4692)`

5. Calculate the predicted probability of a non-event (Predicted Probability = 0) by subtracting Predicted Probability = 1 from 1. In cell J4, enter the following formula and then paste it down.

`=1-I4`

6. Calculate the expected frequencies of events and non-events by multiplying the number of cases in each group by Predicted Probability = 1 and Predicted Probability = 0, respectively.

In cell K4, enter the following formula and then paste it down.

`=I4*SUM(G4:H4)`

In cell L4, enter the following formula and then paste it down.

`=J4*SUM(G4:H4)`

7. Calculate the chi-square statistic from the observed (actual) and expected frequencies of events and non-events.

(a) Calculate each group's contribution to the HL statistic: in cell M4, enter the following formula and then paste it down.

`=((G4-K4)^2/K4)+((H4-L4)^2/L4)`

(b) The chi-square (HL) statistic is the sum of these contributions, for example `=SUM(M4:M11)` when the groups occupy rows 4 to 11 (adjust the range to cover all of your group rows).

(c) The p-value for the chi-square statistic, with degrees of freedom equal to the number of groups minus 2, is calculated using the following formula:

`=CHIDIST(SUM(M4:M11),COUNT(F4:F11)-2)`
Rule: If the p-value > 0.05, we fail to reject the null hypothesis, so the model is considered to fit the data well.
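To cross-check the spreadsheet, the same calculation can be sketched outside Excel. The following is a minimal Python sketch using only the standard library, not a reference implementation. The function names `chi2_sf` and `hosmer_lemeshow` are hypothetical, and the grouping is an assumption that differs from the RANK formula above: observations are sorted by probability and cut into equal-sized slices, so results may differ slightly from the spreadsheet when groups are uneven or probabilities are tied. It also assumes every probability is strictly between 0 and 1.

```python
import math


def chi2_sf(x, df):
    """Upper-tail chi-square probability for integer df >= 1 (like CHIDIST)."""
    if x <= 0:
        return 1.0
    m = x / 2.0
    # Start from a closed form: Q(1, m) = exp(-m), Q(1/2, m) = erfc(sqrt(m)).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-m)
    else:
        a, q = 0.5, math.erfc(math.sqrt(m))
    # Recurrence: Q(a + 1, m) = Q(a, m) + m^a * exp(-m) / Gamma(a + 1).
    while a < df / 2.0:
        q += math.exp(a * math.log(m) - m - math.lgamma(a + 1.0))
        a += 1.0
    return q


def hosmer_lemeshow(y, p, groups=10):
    """Return (statistic, degrees of freedom, p-value).

    y: list of 0/1 outcomes; p: list of predicted probabilities in (0, 1).
    """
    n = len(y)
    # Sort observation indices by ascending predicted probability.
    order = sorted(range(n), key=lambda i: p[i])
    stat = 0.0
    used = 0
    for g in range(groups):
        idx = order[g * n // groups:(g + 1) * n // groups]
        if not idx:
            continue  # skip empty slices when n < groups
        size = len(idx)
        obs1 = sum(y[i] for i in idx)   # observed events
        obs0 = size - obs1              # observed non-events
        exp1 = sum(p[i] for i in idx)   # expected events = size * mean probability
        exp0 = size - exp1              # expected non-events
        stat += (obs1 - exp1) ** 2 / exp1 + (obs0 - exp0) ** 2 / exp0
        used += 1
    df = used - 2
    if df < 1:
        raise ValueError("need at least 3 non-empty groups")
    return stat, df, chi2_sf(stat, df)


if __name__ == "__main__":
    # Made-up illustrative data: 12 observations in 3 groups.
    y = [1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0]
    p = [0.25] * 4 + [0.5] * 4 + [0.75] * 4
    print(hosmer_lemeshow(y, p, groups=3))
```

As in the rule above, a returned p-value greater than 0.05 would be read as no evidence of poor fit.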