Free Excel Add-In for Logistic Regression

Deepanshu Bhalla 2 Comments
Free Excel Add-In for Building Logistic Regression

This tutorial explains how to train and test a logistic regression model in Excel with one click using an add-in.

This add-in is 100% free and can run on any version of MS Excel to build logistic regression models.
How to Install Add-in

When we download add-in from internet, Microsoft blocks them from running and shows the source of this file is untrusted. Please follow the steps below to make it trusted.

  1. Go to the folder where the downloaded add-in file is located.
  2. Right-click the file and choose Properties from the menu.
  3. At the bottom of the General Tab, select the Unblock checkbox under security option and then click OK.

Refer the following steps to install Logistic Regression add-in.

  1. Open Excel and click on the File tab in the ribbon.
  2. Click on Options and then select Add-ins from the left-hand menu.
  3. In the Manage drop-down menu at the bottom of the screen, select Excel Add-ins and click on the Go button.
  4. Click on the Browse button and locate the add-in file that you downloaded.
  5. Select the add-in file and click on the OK button.
  6. You should see the name of the add-in file called "LogisticRegression" in the Add-Ins dialog box. Check the box next to the add-in name to activate it.
  7. Once you are done with the above steps, a new tab called LogisticRegression should be visible in your Excel workbook.

Step 1 : Split Data into Training and Testing Sets

The first step of building a predictive model is to split data into training and validation data sets. In Excel, under the LogisticRegression tab, click on Split Data. It creates two sheets for training and testing data.

  • Select Data (including header) : Select the cells containing data for both dependent and independent variables including their variable names.
  • Enter the split percentage for training data : Enter a value between 0 and 100 for data to be used for training model. For example, 70 refers to 70% of the data used to train the model with the remaining 30% kept for testing.
Split Data into Training and Validation in Excel

Step 2 : Model Training

The next step is to fit the logistic regression model on the training data. In Excel, under the LogisticRegression tab, click on Build Model. It returns predicted probabilities along with coefficient table and model evaluation metrics such as AUC and AIC.

  • Select Independent Variables : Select the cells containing independent variables including their variable names.
  • Select Dependent Variable : Select the cells containing dependent variable including its variable name.
Logistic Regression Model Training in Excel
Output

The add-in returns the following columns at observation level.

  • Probability : Predicted probabilities for dependent event occuring i.e. prob(y=1).
  • Log Likelihood : Log of predicted probabilities for actual dependent variable
Logistic Regression Coefficient Table

The add-in returns the following metrics for indepedent variables in a tabular format.

Logistic Regression Coefficient Table in Excel
  • Coefficients: It estimates of the relationship between each independent variable and the log odds of the dependent event occurring.
  • Standard Errors: It tells us the precision of the estimates.
  • z-statistic: It is the ratio of a coefficient to its standard error.
  • P-value: It shows the significance of the independent variable.
  • Lower95%: The lower bound of the 95% confidence interval for the coefficient.
  • Upper95%: The upper bound of the 95% confidence interval for the coefficient.
  • Odds Ratio: It is the exponential value of coefficient.
  • Std. coeff.: It is used to find out the relative importance of each independent variable.
Model Evaluation Metrics

The add-in returns the following metrics to check the performance of model.

  • AUC: Area Under the ROC Curve. It measures how well model can distinguish between events(1) and non-events(0).
  • AIC: Akaike Information Criterion. It measures the relative quality of a logistic regression model.

Step 3 : Predict on Testing Data

It refers to the process of using the trained logistic regression model to make predictions on a testing or validation dataset that was not used during the model building step. In Excel, under the LogisticRegression tab, click on Model Predictions. It returns predicted probabilities of testing dataset.

  • Select Independent Variables : Select the cells containing independent variables. DO NOT include their variable names.
  • Select Coefficients : Select the cells containing coefficients of variables including intercept.
Predict on Testing Data in Excel
Model Performance

You can check the model's performance based on the predictions.

Area Under Curve (AUC)

You can use the AUC() formula to compute area under curve on testing or new data.

=AUC(probability_column,dependent_variable_column)
=AUC(G2:G80,A2:A80)
Confusion Matrix

The confusion matrix is shown in the table below.

Predicted Negative(0)Predicted Positive(1)
Actual Negative(0)True Negative (TN)False Positive (FP)
Actual Positive(1)False Negative (FN)True Positive (TP)

In Excel, you can use the ConfusionMatrix() formula to calculate the following metrics.

  • Accuracy
  • True Positive Rate
  • False Positive Rate
  • True Negative Rate
  • False Negative Rate
=ConfusionMatrix(probability_column,dependent_variable_column,threshold)

threshold: Threshold for converting probabilities to 0/1 predictions. Many softwares use threshold of 0.5 but it should be decided based on the youden-index.

=ConfusionMatrix(G2:G80,A2:A80,0.5)
To understand the mathematical concepts involved in the steps of logistic regression, check out this tutorial - Steps to Build Logistic Regression Model in Excel
Related Posts
Spread the Word!
Share
About Author:
Deepanshu Bhalla

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has over 10 years of experience in data science. During his tenure, he worked with global clients in various domains like Banking, Insurance, Private Equity, Telecom and HR.

Post Comment 2 Responses to "Free Excel Add-In for Logistic Regression"
Next → ← Prev