PREDICTING CUSTOMER CHURN IN DISTANCE LEARNING: A COMPARATIVE STUDY OF XGBOOST, BAGGING CLASSIFIER, AND SMOTE FOR ENHANCED MODEL PERFORMANCE AND GENERALIZATION

Authors

  • Paril Ghori

Keywords:

Accuracy, Area Under the Curve (AUC), Bagging Classifier, Churn Prediction, Distance Learning, Machine Learning, Predictive Modeling, SMOTE, XGBoost.

Abstract

Predicting customer churn has become a central concern in customer relationship management, especially for businesses in the distance learning sector. Churn prediction models aim to identify users with a high likelihood of attrition, enabling companies to target their retention efforts more effectively and reduce churn-associated costs. Although the ultimate goal of these models is to cut costs and improve retention, their performance is typically assessed with statistical metrics, and the models themselves are built with machine learning techniques. This study develops and validates churn prediction models using data from over 150,000 customers of an online education company. The objective was to compare the performance of several machine learning algorithms applied to churn prediction. Thirteen variables were selected based on the literature, and the models were developed in four main steps: (I) training on balanced and unbalanced datasets, (II) generalization/testing on an independent dataset, (III) statistical comparison of the algorithms, and (IV) evaluation of the models with the highest accuracy. For unbalanced classes, the best-performing model was XGBoost, with an average accuracy of 87.11% and an average AUC (Area Under the Curve) of 0.86. For balanced classes, the Bagging Classifier performed best, achieving an average accuracy of 77.34% and an average AUC of 0.83 during both the testing and generalization phases.
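
The abstract reports both accuracy and AUC, for balanced and unbalanced classes. As a minimal sketch of why the two metrics are read together when churners are a minority, the following standard-library Python computes both on a small toy example. The labels, scores, the 0.5 threshold, and the helper names `accuracy` and `roc_auc` are all hypothetical illustrations; they are not the study's data, models, or code.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random churner outranks a random non-churner.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 2 churners (label 1) among 10 customers.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

# A trivial baseline that predicts "no churn" for everyone.
always_stay = [0] * len(y_true)
constant_scores = [0.0] * len(y_true)

# Hypothetical churn scores from some model, thresholded at an assumed 0.5.
model_scores = [0.1, 0.2, 0.15, 0.3, 0.05, 0.6, 0.25, 0.1, 0.7, 0.55]
model_pred = [1 if s >= 0.5 else 0 for s in model_scores]

baseline_acc = accuracy(y_true, always_stay)
baseline_auc = roc_auc(y_true, constant_scores)
model_acc = accuracy(y_true, model_pred)
model_auc = roc_auc(y_true, model_scores)

print("baseline:", baseline_acc, baseline_auc)
print("model:   ", model_acc, model_auc)
```

In this toy setting the "no churn" baseline is correct for 8 of 10 customers while its AUC stays at 0.5, which is one reason accuracy figures obtained on unbalanced and balanced classes are not directly comparable.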

Published

2020-05-30

Issue

Vol. 7 No. 5 (2020)

Section

Articles

How to Cite

PREDICTING CUSTOMER CHURN IN DISTANCE LEARNING: A COMPARATIVE STUDY OF XGBOOST, BAGGING CLASSIFIER, AND SMOTE FOR ENHANCED MODEL PERFORMANCE AND GENERALIZATION. (2020). International Journal of Engineering Sciences & Management Research, 7(5), 1-12. https://ijesmr.com/index.php/ijesmr/article/view/488