This tutorial explains how to apply weights in Decision Tree and Support Vector Machine models in R to handle rare-event (class imbalance) problems.
Class imbalance is a common issue that occurs when one class has a large number of observations (the majority class) while the other class has relatively few (the minority class). For example, in a customer attrition model, churn cases (the positive class) are the minority, while non-attritor cases (the negative class) are the majority. In simple words, there are significantly fewer attrition cases than non-attrition cases.
When classes are imbalanced, applying weights can help improve model performance on the minority class. It is one of several techniques used to handle class imbalance: different weights are assigned to the classes during the training process of the machine learning model. The purpose is to give more importance to the minority class, so that the model pays more attention to its samples during training.
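To make the idea of per-class weights concrete, here is a minimal base R sketch. The toy Class vector (950 non-attritors, 50 churners) and the inverse-frequency rule are illustrative assumptions, not part of the original tutorial; the fixed weight of 10 mirrors the value used later in this article.

```r
# Hypothetical toy target: 950 non-attritors and 50 churn cases
Class <- factor(c(rep("non-attritor", 950), rep("churn", 50)))

# Inspect how imbalanced the classes are
class_counts <- table(Class)
print(prop.table(class_counts))

# Option 1: a fixed weight of 10 for the minority class, 1 otherwise
w_fixed <- ifelse(Class == "churn", 10, 1)

# Option 2 (assumption): inverse-frequency weights, i.e. the majority
# count divided by each class count, so both classes carry equal total weight
class_w <- max(class_counts) / class_counts
w_inverse <- as.numeric(class_w[as.character(Class)])

# Total weight carried by each class under both schemes
print(tapply(w_fixed, Class, sum))
print(tapply(w_inverse, Class, sum))
```

With the fixed weight, the 50 churn cases carry a total weight of 500 against 950 for the majority class. With inverse-frequency weights, both classes carry the same total weight.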
How to Handle Class Imbalances in Decision Tree Models
Here we use the "party" package in R to build a conditional inference tree (ctree) model. The following code builds the model using weighted observations to handle class imbalance in the dataset.
In this case, the weight is set to 10 for observations where the "Class" is 'churn' (assuming 'churn' is the minority class). For other observations (where "Class" is not 'churn'), the weight is set to 1.
library(party)
# Weight churn observations 10x; in party, mincriterion is set through ctree_control()
ct1 <- ctree(Class ~ ., data = mydata,
             weights = ifelse(mydata$Class == 'churn', 10, 1),
             controls = ctree_control(mincriterion = 0.999))
This gives more importance to the correct classification of churn cases than of non-attritors. The same weights argument can be used with cforest.
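As a rough illustration of using the same weights with cforest, the sketch below fits a small weighted forest. The simulated data frame (columns x1 and x2, with churn generated from a logistic model) stands in for mydata and is purely hypothetical. The choices ntree = 50 and mtry = 1 are arbitrary values picked to keep the example fast.

```r
library(party)

set.seed(42)

# Hypothetical toy data standing in for `mydata`
n <- 500
mydata <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
p_churn <- plogis(-3 + 1.5 * mydata$x1)   # churn is the rare class
mydata$Class <- factor(ifelse(runif(n) < p_churn, "churn", "non-attritor"))

# Same weighting rule as in the ctree call: 10 for churn, 1 otherwise
w <- ifelse(mydata$Class == "churn", 10, 1)

# Weighted conditional inference forest (ntree and mtry are illustrative)
cf1 <- cforest(Class ~ ., data = mydata, weights = w,
               controls = cforest_unbiased(ntree = 50, mtry = 1))

# Out-of-bag predictions compared with the actual classes
cf_pred <- predict(cf1, OOB = TRUE)
print(table(predicted = cf_pred, actual = mydata$Class))
```

Comparing this table with one from a forest fitted without weights is a simple way to check whether the weighting shifts predictions toward the churn class on your own data.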
How to Handle Class Imbalances in Support Vector Machine Models
Here we use the "e1071" package in R to build a support vector machine (svm) model. The class.weights option of svm() handles the class imbalance problem by assigning a weight to each class. The names in class.weights must match the levels of Class.
library(e1071)
# The name non-attritors contains a hyphen, so it must be quoted inside c()
svm.model <- svm(Class ~ ., data = mydata, type = 'C-classification', kernel = 'linear',
                 class.weights = c(churn = 10, "non-attritors" = 1), scale = FALSE)
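The sketch below shows one way to check the effect of class.weights by fitting the same linear SVM with and without weights. The simulated data frame is a hypothetical stand-in for mydata, and the comparison is made on the training data only, so it illustrates the mechanics rather than giving a proper evaluation.

```r
library(e1071)

set.seed(42)

# Hypothetical toy data standing in for `mydata`
n <- 500
mydata <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
p_churn <- plogis(-3 + 1.5 * mydata$x1)   # churn is the rare class
mydata$Class <- factor(ifelse(runif(n) < p_churn, "churn", "non-attritors"))

# Names of the weight vector must match the factor levels of Class
cw <- c(churn = 10, "non-attritors" = 1)
stopifnot(setequal(names(cw), levels(mydata$Class)))

# Unweighted and weighted linear SVMs on the same data
svm_plain <- svm(Class ~ ., data = mydata,
                 type = "C-classification", kernel = "linear")
svm_weighted <- svm(Class ~ ., data = mydata,
                    type = "C-classification", kernel = "linear",
                    class.weights = cw)

# Training-set confusion tables for a quick side-by-side look
pred_plain <- predict(svm_plain, mydata)
pred_weighted <- predict(svm_weighted, mydata)
print(table(predicted = pred_plain, actual = mydata$Class))
print(table(predicted = pred_weighted, actual = mydata$Class))
```

In practice the comparison should be made on held-out data, and the weight of 10 treated as a tuning parameter rather than a fixed choice.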