Machine learning for credit scoring and loan default prediction using behavioral and transactional financial data

Abi, Roland (2025) Machine learning for credit scoring and loan default prediction using behavioral and transactional financial data. World Journal of Advanced Research and Reviews, 26 (3). pp. 884-904. ISSN 2581-9615

WJARR-2025-2266.pdf - Published Version
Available under License Creative Commons Attribution Non-commercial Share Alike.

Abstract

The evolution of financial services in the digital era has enabled access to alternative data streams beyond traditional credit bureau records, opening new possibilities for credit scoring and loan default prediction. In both formal banking systems and emerging fintech platforms, the integration of behavioral and transactional financial data offers richer, more dynamic insights into borrower risk profiles. This shift has paved the way for machine learning (ML) models to enhance the accuracy, fairness, and scalability of credit assessment processes. This paper investigates the application of machine learning algorithms in credit scoring and loan default prediction, using behavioral signals such as spending patterns, payment timing, and mobile usage, together with transactional data from bank accounts, e-wallets, and point-of-sale interactions. Supervised learning techniques such as logistic regression, random forests, gradient boosting, and neural networks are benchmarked against traditional credit scoring models to assess predictive performance and generalization. Additionally, the paper examines the role of unsupervised clustering for segmenting borrower profiles and semi-supervised learning for scenarios with limited labeled data. Feature engineering methods, including temporal trend extraction, merchant categorization, and transaction frequency analysis, are discussed in detail. The paper also addresses challenges related to data privacy, class imbalance, and model interpretability, highlighting techniques such as SHAP values and local interpretable model-agnostic explanations (LIME) to improve transparency in ML-driven decisions. By incorporating diverse data sources and advanced analytics, ML-based credit scoring systems offer enhanced precision in predicting defaults, expanding financial inclusion while reducing systemic risk. Case studies from microfinance, mobile lending, and digital banking underscore the real-world applicability of these models in low-data and high-risk environments.

Item Type: Article
Official URL: https://doi.org/10.30574/wjarr.2025.26.3.2266
Uncontrolled Keywords: Credit Scoring; Loan Default Prediction; Machine Learning; Behavioral Data; Transactional Data; Financial Inclusion
Depositing User: Editor WJARR
Date Deposited: 20 Aug 2025 12:06
URI: https://eprint.scholarsrepository.com/id/eprint/4012