Journal Article, DOI: 10.1016/J.ESWA.2009.05.059
A data driven ensemble classifier for credit scoring analysis
Nan-Chen Hsieh, Lun-Ping Hung +1 more
173
TL;DR: This study focuses on predicting whether a credit applicant can be categorized as good, bad or borderline from information initially supplied, and introduces the concept of class-wise classification as a preprocessing step in order to obtain an efficient ensemble classifier.
Abstract: This study focuses on predicting whether a credit applicant can be categorized as good, bad or borderline from information initially supplied. This is essentially a classification task for credit scoring. Given its importance, many researchers have recently worked on an ensemble of classifiers. However, to the best of our knowledge, unrepresentative samples drastically reduce the accuracy of the deployment classifier. Few have attempted to preprocess the input samples into more homogeneous cluster groups and then fit the ensemble classifier accordingly. For this reason, we introduce the concept of class-wise classification as a preprocessing step in order to obtain an efficient ensemble classifier. This strategy would work better than a direct ensemble of classifiers without the preprocessing step. The proposed ensemble classifier is constructed by incorporating several data mining techniques, mainly involving optimal associate binning to discretize continuous values; neural network, support vector machine, and Bayesian network are used to augment the ensemble classifier. In particular, the Markov blanket concept of Bayesian network allows for a natural form of feature selection, which provides a basis for mining association rules. The learned knowledge is represented in multiple forms, including causal diagram and constrained association rules. The data driven nature of the proposed system distinguishes it from existing hybrid/ensemble credit scoring systems.
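The two-stage idea in the abstract (first make the training samples more homogeneous class by class, then fit an ensemble on what remains) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' method: the toy two-feature data, the function names, the rule that a sample is "unrepresentative" when it lies far from its own class centroid, and the use of nearest-centroid and k-nearest-neighbour voters as stand-ins for the neural network, support vector machine, and Bayesian network named in the abstract.

```python
from collections import Counter
from math import dist
from statistics import mean, median


def centroid(points):
    # Component-wise mean of a list of equal-length tuples.
    return tuple(mean(col) for col in zip(*points))


def group_by_label(samples):
    groups = {}
    for x, y in samples:
        groups.setdefault(y, []).append(x)
    return groups


def filter_class_wise(samples, factor=2.0):
    # Hypothetical preprocessing rule: within each class, keep only samples
    # whose distance to the class centroid is at most `factor` times the
    # median distance for that class.
    kept = []
    for y, xs in group_by_label(samples).items():
        c = centroid(xs)
        cutoff = factor * median(dist(x, c) for x in xs)
        kept.extend((x, y) for x in xs if dist(x, c) <= cutoff)
    return kept


def nearest_centroid(train):
    cents = {y: centroid(xs) for y, xs in group_by_label(train).items()}
    return lambda x: min(cents, key=lambda y: dist(x, cents[y]))


def k_nearest(train, k):
    def predict(x):
        nearest = sorted(train, key=lambda s: dist(x, s[0]))[:k]
        return Counter(y for _, y in nearest).most_common(1)[0][0]
    return predict


def majority_vote(classifiers):
    # Ensemble prediction: the label chosen by most base classifiers.
    return lambda x: Counter(clf(x) for clf in classifiers).most_common(1)[0][0]


# Toy data (invented): the last "good" sample sits inside the "bad" region.
samples = [
    ((1.0, 1.0), "good"), ((1.2, 0.9), "good"), ((0.9, 1.1), "good"),
    ((1.1, 1.2), "good"), ((5.0, 5.2), "good"),
    ((5.0, 5.0), "bad"), ((5.2, 4.9), "bad"), ((4.9, 5.1), "bad"),
    ((5.1, 5.2), "bad"),
]

kept = filter_class_wise(samples)
ensemble = majority_vote([
    nearest_centroid(kept),
    k_nearest(kept, 1),
    k_nearest(kept, 3),
])
```

On this toy data the filter drops the single "good" sample that lies among the "bad" ones, so the voters are trained on cleaner class regions. A real system would replace both the filtering rule and the base classifiers with the techniques the paper describes.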
Citations
Benchmarking state-of-the-art classification algorithms for credit scoring
TL;DR: It is found that both the LS-SVM and neural network classifiers yield very good performance, but simple classifiers such as logistic regression and linear discriminant analysis also perform very well for credit scoring.
1.3K
Benchmarking state-of-the-art classification algorithms for credit scoring: An update of research
TL;DR: The study of Baesens et al. (2003) is updated and several novel classification algorithms are compared to the state of the art in credit scoring, providing an independent assessment of recent scoring methods and offering a new baseline to which future approaches can be compared.
948
Data mining for the Internet of Things: literature review and challenges
TL;DR: A systematic review of data mining is given from the knowledge view, the technique view, and the application view, covering classification, clustering, association analysis, time series analysis, and outlier analysis.
567
A comparative survey of artificial intelligence applications in finance: artificial neural networks, expert system and hybrid intelligent systems
TL;DR: A comparative research review of three well-known artificial intelligence techniques in financial markets shows that the accuracy of these methods is superior to that of traditional statistical methods in dealing with financial problems, especially regarding nonlinear patterns.
542
Imbalanced enterprise credit evaluation with DTE-SBD
TL;DR: A new decision tree (DT) ensemble model for imbalanced enterprise credit evaluation is proposed, based on the synthetic minority over-sampling technique (SMOTE) and the Bagging ensemble learning algorithm with differentiated sampling rates (DSR); it is named DTE-SBD (Decision Tree Ensemble based on SMOTE, Bagging and DSR).
360
References
•Book
The Nature of Statistical Learning Theory
Vladimir Vapnik
- 01 Jan 1995
TL;DR: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?
46K
Estimating the dimension of a model
Gideon Schwarz
- 01 Jan 2005
TL;DR: In this paper, the problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion.
40.6K
Financial ratios, discriminant analysis and the prediction of corporate bankruptcy
TL;DR: In this paper, a set of financial and economic ratios are investigated in a bankruptcy prediction context wherein a multiple discriminant statistical methodology is employed, and the data used in the study are limited to manufacturing corporations, where an initial sample of sixty-six firms is utilized to establish a function which best discriminates between companies in two mutually exclusive groups: bankrupt and nonbankrupt firms.
13.2K
On combining classifiers
TL;DR: A common theoretical framework for combining classifiers which use distinct pattern representations is developed and it is shown that many existing schemes can be considered as special cases of compound classification where all the pattern representations are used jointly to make a decision.
5.8K