Credit Risk Assessment in P2P Lending Using LightGBM and Particle Swarm Optimization
DOI:
https://doi.org/10.26594/register.v9i1.3060
Keywords:
LightGBM, PSO, P2P Lending, Machine Learning, Credit Risk Assessment
Abstract
Credit risk evaluation is a vital task on P2P lending platforms, and an effective assessment method can significantly influence investors' decisions. LightGBM is one machine learning algorithm that can be used to evaluate credit risk; however, its results on P2P lending data leave room for improvement. This research aims to improve the accuracy of LightGBM by combining it with the Particle Swarm Optimization (PSO) algorithm. The novelty of the work lies in applying the combination of LightGBM and PSO to a large dataset, the Lending Club dataset, which is available on Kaggle.com. The best result reached 98.094% accuracy, 90.514% recall, and 97.754% negative predictive value (NPV). The combination of LightGBM and PSO gives better results than LightGBM alone.
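As a rough illustration of the two ideas in the abstract, the Python sketch below shows how accuracy, recall, and NPV follow from confusion-matrix counts, and how a basic global-best PSO loop searches a bounded parameter space. It is not the authors' implementation. The function names, the swarm settings (inertia 0.7, acceleration coefficients 1.5), the two tuned parameters, and their bounds are all assumptions chosen for illustration, and `toy_validation_error` is a hypothetical smooth stand-in for what would, in a real experiment, be the validation error of a LightGBM model trained with the candidate parameters.

```python
import random


def classification_metrics(tp, fp, tn, fn):
    """Accuracy, recall and NPV from confusion-matrix counts.

    Assumes each denominator is non-zero (both classes are present).
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "recall": tp / (tp + fn),  # share of actual positives that were found
        "npv": tn / (tn + fn),     # share of predicted negatives that are correct
    }


def pso_minimize(objective, bounds, n_particles=30, n_iters=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal global-best PSO; the default settings are illustrative only."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]             # personal best positions
    best_val = [objective(p) for p in pos]     # personal best scores
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm-wide best so far

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward swarm best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Move, then clamp the coordinate back into its bounds.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


# Hypothetical search space: (learning_rate, num_leaves). Assumed, not from the paper.
BOUNDS = [(0.01, 0.3), (8.0, 128.0)]


def toy_validation_error(params):
    """Stand-in objective with its minimum at learning_rate=0.1, num_leaves=31."""
    learning_rate, num_leaves = params
    return ((learning_rate - 0.1) / 0.29) ** 2 + ((num_leaves - 31.0) / 120.0) ** 2


if __name__ == "__main__":
    print(classification_metrics(tp=8, fp=5, tn=85, fn=2))
    print(pso_minimize(toy_validation_error, BOUNDS))
```

In a real tuning run, integer-valued parameters such as the number of leaves would need to be rounded before being passed to the model, and the objective would typically be a cross-validated error rather than a closed-form function.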
License
Copyright (c) 2023 Much Aziz Muslim, Yosza Dasril, M. Faris Al Hakim, Jumanto Jumanto, Budi Prasetiyo
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
The rights and licenses for this article are set out in the policy of Register: Jurnal Ilmiah Teknologi Sistem Informasi. By submitting a manuscript, the author(s) agree to this policy; no specific document sign-off is required.