This tutorial explains how to compute confusion matrix in Excel, along with examples.

Please ensure you have the following two columns in Excel:

- Binary Dependent Variable (0 / 1)
- Predicted probabilities

Let's say you have data for dependent variable in cells **A2:A100** and probabilities in cells **B2:B100**.

The confusion matrix is shown in the table below for your reference.

Predicted Negative(0) | Predicted Positive(1) | |
---|---|---|

Actual Negative(0) | True Negative (TN) | False Positive (FP) |

Actual Positive(1) | False Negative (FN) | True Positive (TP) |

## Step 1 : Classify Predictions

The first step is to create a column to classify predictions. Let's say probability cutoff is 0.5 which means assigning 1 if the predicted probability is 0.5 or higher, otherwise 0. Enter the following formula in cell C2.

=IF(B2 >= 0.5, 1, 0)

Drag the formula down to the last row (i.e C100) to cover all rows.

## Step 2 : Count Positives and Negatives

Enter the following formula in cell D2.

=COUNTIFS(A2:A100, 1, C2:C100, 1)

Enter the following formula in cell E2.

=COUNTIFS(A2:A100,0, C2:C100, 0)

Enter the following formula in cell F2.

=COUNTIFS(A2:A100,1, C2:C100, 0)

Enter the following formula in cell G2.

=COUNTIFS(A2:A100,0, C2:C100, 1)

## Step 3 : Calculate Confusion Matrix

The True Positive Rate is also referred to as Recall in the confusion matrix. Enter the following formula in cell H2.

=D2 / (D2 + F2)

Enter the following formula in cell I2.

=G2 / (G2 + E2)

Enter the following formula in cell J2.

=E2/(E2+G2)

Enter the following formula in cell K2.

=F2 / (F2 + D2)

Enter the following formula in cell L2.

=(D2+E2)/COUNT(C2:C100)

Enter the following formula in cell M2.

=D2/(D2+G2)

F1 score is the harmonic mean of precision and recall. It performs well on a dataset having low event rate.

=2 * (H2 * M2) / (H2 + M2)

The following VBA code is a user-defined function that automates all the steps shown above to compute confusion matrix.

```
Function ConfusionMatrix(predictions As Range, actuals As Range, threshold As Double) As Variant
Dim n As Long
Dim tpr As Double, fpr As Double, tnr As Double, fnr As Double, accuracy As Double
Dim sorted_indices() As Long
Dim i As Long
' Check if the predictions and actuals ranges have the same number of elements
If predictions.Count <> actuals.Count Then
MsgBox "Predictions and actuals must have the same number of elements."
Exit Function
End If
n = predictions.Count
ReDim sorted_indices(1 To n)
' Sort predictions and actuals in descending order of predictions
sorted_indices = SortIndicesDescending(predictions)
Dim num_pos As Long, num_neg As Long
num_pos = WorksheetFunction.CountIf(actuals, 1)
num_neg = n - num_pos
If num_pos = 0 Or num_neg = 0 Then
MsgBox "There must be both positive and negative actual values."
Exit Function
End If
Dim tp_count As Long, fp_count As Long, tn_count As Long, fn_count As Long
tp_count = 0
fp_count = 0
tn_count = 0
fn_count = 0
' Calculate TP, FP, TN, FN for the specified threshold
For i = 1 To n
If predictions.Cells(sorted_indices(i)).Value >= threshold Then
If actuals.Cells(sorted_indices(i)).Value = 1 Then
tp_count = tp_count + 1
Else
fp_count = fp_count + 1
End If
Else
If actuals.Cells(sorted_indices(i)).Value = 1 Then
fn_count = fn_count + 1
Else
tn_count = tn_count + 1
End If
End If
Next i
tpr = Round(tp_count / num_pos, 4)
fpr = Round(fp_count / num_neg, 4)
fnr = Round(fn_count / num_pos, 4)
tnr = Round(tn_count / num_neg, 4)
accuracy = Round((tp_count + tn_count) / n, 4)
' Display TPR, FPR, TNR, and FNR in one message box
ConfusionMatrix = "Accuracy: " & accuracy & ", " & _
"True Positive Rate: " & tpr & ", " & _
"False Positive Rate: " & fpr & ", " & _
"True Negative Rate: " & tnr & ", " & _
"False Negative Rate: " & fnr
End Function
```

- In Excel, open the VBA editor by pressing
**Alt + F11**keyboard shortcut key. - Select
**Insert**>**Module**to create a module. - Paste the above VBA code in the module.
- Close the VBA editor by clicking the 'X' in the top-right corner.
- In any cell, enter this formula :
=ConfusionMatrix(B2:B100,A2:A100,0.5)

It follows this syntax -`=ConfusionMatrix(probability_range,dependent_variable_range,cutoff)`

Share Share Tweet