An Explainable XGBoost Framework for Detecting Fraudulent Financial Transactions
Oluwatosin Lawal
Department of Mathematics Statistical Analytics, Computing and Modeling, Texas A&M University, Kingsville, USA.
Alumona Paschal
Booth School of Business, University of Chicago, USA.
Awele Okolie*
School of Computing and Data Science, Wentworth Institute of Technology, Boston, USA.
Callistus Obunadike
Department of Computer Science and Quantitative Methods, Austin Peay State University, Tennessee, USA.
Mark Onons Ikhifa
Department of Mathematics and Science Education, Middle Tennessee State University, USA.
Samson Ikechukwu Edozie
Department of Computer Science and Quantitative Methods, Austin Peay State University, Tennessee, USA.
*Author to whom correspondence should be addressed.
Abstract
Detecting financial fraud remains one of the most pressing concerns in digital finance, owing to the highly imbalanced and continually evolving nature of transaction data. This study introduces an explainable machine learning framework that uses the eXtreme Gradient Boosting (XGBoost) algorithm to detect fraudulent activity in a large financial dataset. The dataset comprises more than 280,000 transactions described by anonymized numerical attributes (V1–V28), the transaction amount, and a binary class label indicating whether the activity is legitimate or fraudulent. Exploratory data analysis revealed a severe class imbalance, with fraudulent transactions representing less than 1% of all transactions. The XGBoost model achieved a ROC-AUC score of 0.975 and a reported overall accuracy of 100%, and its precision and recall indicate that it was able to distinguish fraudulent from non-fraudulent activity. Feature importance analysis showed that the most significant predictors of fraud were the latent variables V14, V10, and V4. The framework demonstrates the effectiveness of gradient-boosted decision trees for financial crime detection and provides interpretable insights into the influence of individual variables. The work contributes to the field of XAI-based applications in financial crime prevention.
Keywords: Fraud detection, XGBoost, machine learning, financial transactions, feature importance, Explainable Artificial Intelligence (XAI), imbalanced data, ROC-AUC
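The abstract reports precision, recall, and ROC-AUC alongside accuracy for data in which fraud is under 1% of transactions. The following minimal sketch shows why accuracy alone says little at that level of imbalance. All counts are hypothetical and chosen only for illustration (280,000 transactions, 500 of them fraudulent, and an imagined detector that catches 400 frauds with 50 false alarms); none are taken from the study's dataset or results, and the helper names `confusion_counts` and `summarize` are likewise illustrative.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 = fraud."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def summarize(y_true, y_pred):
    """Compute accuracy, precision, and recall for the fraud class."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical class counts (assumption, not the study's data).
N_TOTAL = 280_000
N_FRAUD = 500
N_LEGIT = N_TOTAL - N_FRAUD

# Ground truth: fraud cases first, then legitimate ones.
y_true = [1] * N_FRAUD + [0] * N_LEGIT

# Baseline that labels every transaction as legitimate.
always_legit = [0] * N_TOTAL
baseline = summarize(y_true, always_legit)

# Hypothetical detector: 400 frauds caught, 100 missed, 50 false alarms.
detector = [1] * 400 + [0] * 100 + [1] * 50 + [0] * (N_LEGIT - 50)
detected = summarize(y_true, detector)

for name, scores in (("always-legitimate", baseline), ("detector", detected)):
    print(
        f"{name}: accuracy={scores['accuracy']:.4f} "
        f"precision={scores['precision']:.4f} recall={scores['recall']:.4f}"
    )
```

Under these assumed counts, both classifiers have an accuracy that rounds to 100%, yet the baseline has a recall of zero because it catches no fraud. Precision and recall, or a threshold-independent measure such as ROC-AUC, are what separate a useful detector from a trivial one.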