Credit Risk Assessment in P2P Lending Using LightGBM and Particle Swarm Optimization

Authors

  • Yosza Dasril Universiti Tun Hussein Onn Malaysia
  • Much Aziz Muslim Universiti Tun Hussein Onn Malaysia https://orcid.org/0000-0001-7405-9898
  • M. Faris Al Hakim Universitas Negeri Semarang
  • Jumanto Jumanto Universitas Negeri Semarang
  • Budi Prasetiyo Universitas Negeri Semarang

DOI:

https://doi.org/10.26594/register.v9i1.3060

Keywords:

LightGBM, PSO, P2P Lending, Machine Learning, Credit Risk Assessment

Abstract

Credit risk evaluation is a vital task on peer-to-peer (P2P) lending platforms, where an effective credit risk assessment method can significantly influence investors' decisions. LightGBM is one machine learning algorithm that can be used to evaluate credit risk, but its results on P2P lending data still need to be improved. The aim of this research is to improve the accuracy of the LightGBM algorithm by combining it with the Particle Swarm Optimization (PSO) algorithm. The novelty of this research is the combination of LightGBM with PSO on large data from the Lending Club dataset, which can be accessed on Kaggle.com. The highest accuracy obtained was 98.094%, with a recall of 90.514% and a negative predictive value (NPV) of 97.754%. Combining LightGBM with PSO yields better results.
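For readers less familiar with the three reported measures, the following minimal Python sketch shows how accuracy, recall, and NPV are conventionally derived from confusion-matrix counts. The function name and the example counts are hypothetical and chosen only for illustration; they are not the paper's data, and the abstract does not state which loan outcome is treated as the positive class.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, recall, and NPV from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives,
    and false negatives. Which class is "positive" is an assumption
    the caller must fix; the abstract does not specify it.
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one sample is required")

    # Accuracy: share of all predictions that are correct.
    accuracy = (tp + tn) / total
    # Recall: share of actual positives that were predicted positive.
    recall = tp / (tp + fn) if (tp + fn) else math.nan
    # NPV: share of negative predictions that are actually negative.
    npv = tn / (tn + fn) if (tn + fn) else math.nan

    return {"accuracy": accuracy, "recall": recall, "npv": npv}


# Hypothetical counts for illustration only (not results from the paper).
example = classification_metrics(tp=80, fp=10, tn=890, fn=20)
```

Reporting recall and NPV alongside accuracy is a common choice when the classes are imbalanced, since accuracy alone can look high even when most of the minority class is misclassified.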

Published

2023-02-22

How to Cite

[1]
Y. Dasril, M. A. Muslim, M. F. A. Hakim, J. Jumanto, and B. Prasetiyo, “Credit Risk Assessment in P2P Lending Using LightGBM and Particle Swarm Optimization”, regist. j. ilm. teknol. sist. inf., vol. 9, no. 1, pp. 18–28, Feb. 2023.

Section

Article