Predicting creditworthiness in retail banking with limited scoring data

Hussein A. Abdou, Marc D Dongmo Tsafack, Collins G. Ntim, Rose D. Baker

Research output: Contribution to journal (Article)

12 Citations (Scopus)

Abstract

Research on modelling credit scoring systems, including their relevance to prediction and decision making in the financial sector, has concentrated on developed countries, whilst developing countries have been largely neglected. The focus of our investigation is the Cameroonian banking sector, with implications for fellow members of the Banque des Etats de L'Afrique Centrale (BEAC) family, which apply the same system. We apply logistic regression (LR), Classification and Regression Tree (CART) and Cascade Correlation Neural Network (CCNN) techniques in building our knowledge-based scoring models. To compare the models' performance, we use ROC curves and Gini coefficients as evaluation criteria and the Kolmogorov-Smirnov curve as a robustness test. The results demonstrate that predictive power can be improved, from 15.69% default cases under the current system to 7.68% under the best scoring model, namely CCNN. The predictive capabilities of all models are rated as at least very good using the Gini coefficient, and that of CCNN is rated excellent using the ROC curve. Our robustness test confirms these results. In terms of prediction rate, CCNN is superior to the other techniques investigated in this paper. In addition, a sensitivity analysis of the variables identifies previous occupation, borrower's account functioning, guarantees, other loans and monthly expenses as key variables in the forecasting and decision-making processes that are at the heart of overall credit policy.
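The evaluation criteria named in the abstract can be illustrated with a minimal sketch. It assumes the common definitions Gini = 2 x AUC - 1 and KS = the largest gap between the cumulative score distributions of defaulters and non-defaulters; the abstract does not state which formulas or software the authors used. The function names and the toy scores are hypothetical and are not data from the paper.

```python
from typing import List, Sequence, Tuple


def _split(scores: Sequence[float], labels: Sequence[int]) -> Tuple[List[float], List[float]]:
    """Separate scores into defaulters (label 1) and non-defaulters (label 0)."""
    bad = [s for s, y in zip(scores, labels) if y == 1]
    good = [s for s, y in zip(scores, labels) if y == 0]
    if not bad or not good:
        raise ValueError("need at least one case of each class")
    return bad, good


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve: the probability that a randomly chosen
    defaulter receives a higher risk score than a randomly chosen
    non-defaulter (ties count as one half)."""
    bad, good = _split(scores, labels)
    wins = 0.0
    for b in bad:
        for g in good:
            if b > g:
                wins += 1.0
            elif b == g:
                wins += 0.5
    return wins / (len(bad) * len(good))


def gini(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Gini coefficient derived from the AUC (assumed definition)."""
    return 2.0 * auc(scores, labels) - 1.0


def ks_statistic(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Kolmogorov-Smirnov statistic: maximum absolute difference between
    the empirical score CDFs of the two classes."""
    bad, good = _split(scores, labels)
    best = 0.0
    for t in sorted(set(scores)):
        cdf_bad = sum(s <= t for s in bad) / len(bad)
        cdf_good = sum(s <= t for s in good) / len(good)
        best = max(best, abs(cdf_bad - cdf_good))
    return best


if __name__ == "__main__":
    # Hypothetical risk scores (higher = riskier) and outcomes (1 = default).
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
    toy_labels = [1, 1, 0, 1, 0, 0, 1, 0]
    print("AUC :", auc(toy_scores, toy_labels))
    print("Gini:", gini(toy_scores, toy_labels))
    print("KS  :", ks_statistic(toy_scores, toy_labels))
```

The pairwise AUC computation is quadratic in the number of cases, which is acceptable for a small illustration; a rank-based formula or a library routine would be the usual choice for a real scoring dataset.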

Language: English
Pages: 89-103
Number of pages: 15
Journal: Knowledge-Based Systems
Volume: 103
Early online date: 12 Apr 2016
DOIs: 10.1016/j.knosys.2016.03.023
Publication status: Published - 1 Jul 2016

Fingerprint

Neural networks
Decision making
Developing countries
Sensitivity analysis
Logistics
Cascade
Retail banking
Creditworthiness
Scoring
Receiver operating characteristic curve
Robustness test
Gini coefficient
Prediction
Logistic regression
Banking sector
Functioning
Decision-making process
Developed countries
Loans
Modeling

Cite this

Abdou, Hussein A. ; Tsafack, Marc D Dongmo ; Ntim, Collins G. ; Baker, Rose D. / Predicting creditworthiness in retail banking with limited scoring data. In: Knowledge-Based Systems. 2016 ; Vol. 103. pp. 89-103.
@article{6c85183394d34bdc89f7824254ef8dee,
title = "Predicting creditworthiness in retail banking with limited scoring data",
keywords = "CART, Cascade correlation neural networks, Credit scoring, Limited data, Predicting creditworthiness",
author = "Abdou, {Hussein A.} and Tsafack, {Marc D Dongmo} and Ntim, {Collins G.} and Baker, {Rose D.}",
year = "2016",
month = "7",
day = "1",
doi = "10.1016/j.knosys.2016.03.023",
language = "English",
volume = "103",
pages = "89--103",
journal = "Knowledge-Based Systems",
issn = "0950-7051",
publisher = "Elsevier",

}

TY - JOUR

T1 - Predicting creditworthiness in retail banking with limited scoring data

AU - Abdou, Hussein A.

AU - Tsafack, Marc D Dongmo

AU - Ntim, Collins G.

AU - Baker, Rose D.

PY - 2016/7/1

Y1 - 2016/7/1

N2 - The preoccupation with modelling credit scoring systems including their relevance to predicting and decision making in the financial sector has been with developed countries, whilst developing countries have been largely neglected. The focus of our investigation is on the Cameroonian banking sector with implications for fellow members of the Banque des Etats de L'Afrique Centrale (BEAC) family which apply the same system. We apply logistic regression (LR), Classification and Regression Tree (CART) and Cascade Correlation Neural Network (CCNN) in building our knowledge-based scoring models. To compare various models' performances, we use ROC curves and Gini coefficients as evaluation criteria and the Kolmogorov-Smirnov curve as a robustness test. The results demonstrate that an improvement in terms of predicting power from 15.69% default cases under the current system, to 7.68% based on the best scoring model, namely CCNN can be achieved. The predictive capabilities of all models are rated as at least very good using the Gini coefficient; and rated excellent using the ROC curve for CCNN. Our robustness test confirmed these results. It should be emphasised that in terms of prediction rate, CCNN is superior to the other techniques investigated in this paper. Also, a sensitivity analysis of the variables identifies previous occupation, borrower's account functioning, guarantees, other loans and monthly expenses as key variables in the forecasting and decision making processes which are at the heart of overall credit policy.

KW - CART

KW - Cascade correlation neural networks

KW - Credit scoring

KW - Limited data

KW - Predicting creditworthiness

UR - http://www.scopus.com/inward/record.url?scp=84964683503&partnerID=8YFLogxK

U2 - 10.1016/j.knosys.2016.03.023

DO - 10.1016/j.knosys.2016.03.023

M3 - Article

VL - 103

SP - 89

EP - 103

JO - Knowledge-Based Systems

T2 - Knowledge-Based Systems

JF - Knowledge-Based Systems

SN - 0950-7051

ER -