Kombinasi Synthetic Minority Oversampling Technique (SMOTE) dan Neural Network Backpropagation untuk menangani data tidak seimbang pada prediksi pemakaian alat kontrasepsi implan
DOI: https://doi.org/10.26594/register.v5i2.1705
Keywords: backpropagation, imbalance class, implan, implants, predict, prediksi, SMOTE
Abstract
Combination of Synthetic Minority Oversampling Technique (SMOTE) and Backpropagation Neural Network to handle imbalanced class in predicting the use of contraceptive implants
Contraceptive implant failure is a pregnancy that occurs while a woman is using the implant correctly, and it is one source of unintended pregnancy. In 2018, 1,852 of 41,947 implant users nationwide (about 4%) experienced such a failure. Because failures are far rarer than successes, the data are imbalanced, which makes failure difficult to predict. Class imbalance occurs when one class contains many more samples than another: the majority class holds the larger share of the data and the minority class the smaller. Classification algorithms perform worse on imbalanced classes. This study used the Synthetic Minority Oversampling Technique (SMOTE) to balance the implant-failure data. SMOTE handles class imbalance more accurately and effectively than other oversampling methods because it reduces overfitting. The balanced data were then used with a Backpropagation Neural Network to predict whether a woman using a contraceptive implant would become pregnant.

The study used 300 records: 285 in the majority class (not pregnant) and 15 in the minority class (pregnant). They were split into 270 training records and 30 test records. The 270 training records comprised 257 majority and 13 minority records. The minority training records were oversampled to match the size of the majority class, giving 514 training records in total: 257 majority, 13 original minority, and 244 synthetic minority records. The prediction system reached an accuracy of 96.1% at both the 500th and the 1,000th epoch. The combination of SMOTE and a Backpropagation Neural Network was thus shown to produce good predictions on imbalanced classes.
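The balancing step described in the abstract can be illustrated with a minimal sketch of SMOTE-style interpolation. This is not the authors' implementation: the function name `smote`, the neighbour count `k=5`, the four placeholder features, and the randomly generated data are all assumptions made for illustration. Only the class counts (257 majority, 13 minority, 244 synthetic, 514 total) come from the abstract.

```python
import numpy as np

def smote(minority, n_synthetic, k=5, seed=0):
    """Create synthetic minority samples by interpolating between a
    minority sample and one of its k nearest minority neighbours."""
    rng = np.random.default_rng(seed)
    n = len(minority)
    # Pairwise Euclidean distances within the minority class.
    dists = np.linalg.norm(minority[:, None, :] - minority[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)                 # a point is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]   # shape (n, k)
    base = rng.integers(0, n, size=n_synthetic)     # random base sample per new point
    chosen = neighbours[base, rng.integers(0, k, size=n_synthetic)]
    gap = rng.random((n_synthetic, 1))              # interpolation factor in [0, 1)
    return minority[base] + gap * (minority[chosen] - minority[base])

# Placeholder feature matrices (hypothetical: 4 features, random values).
data_rng = np.random.default_rng(42)
X_major = data_rng.normal(0.0, 1.0, size=(257, 4))  # not pregnant
X_minor = data_rng.normal(2.0, 1.0, size=(13, 4))   # pregnant

# Generate enough synthetic minority samples to match the majority class.
X_syn = smote(X_minor, n_synthetic=len(X_major) - len(X_minor))
X_train = np.vstack([X_major, X_minor, X_syn])
y_train = np.concatenate([
    np.zeros(len(X_major), dtype=int),
    np.ones(len(X_minor) + len(X_syn), dtype=int),
])

print(X_syn.shape, X_train.shape)  # (244, 4) (514, 4)
```

Each synthetic point lies on the line segment between two real minority samples, so it is a new record rather than an exact copy. This is the property usually credited with reducing overfitting compared with plain duplication.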
License
The rights and licenses for articles in Register: Jurnal Ilmiah Teknologi Sistem Informasi are set out below. By submitting a manuscript, the author(s) agree to this policy. No specific document sign-off is required.
1. License
Non-commercial use of the article is governed by the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
2. Author(s)' Warranties
The author warrants that the article is original, was written by the stated author(s), has not been published before, contains no unlawful statements, does not infringe the rights of others, is subject to copyright that is vested exclusively in the author and free of any third-party rights, and that any necessary written permissions to quote from other sources have been obtained by the author(s).
3. User/Public Rights
Register aims to disseminate published articles as freely as possible. Under the Creative Commons license, Register permits users to copy, distribute, display, and perform the work for non-commercial purposes only. Users must also attribute the authors and Register when distributing works in the journal and in other publication media. Unless otherwise stated, the authors are public entities as soon as their articles are published.
4. Rights of Authors
Authors retain all their rights to the published works, such as (but not limited to) the following rights:
Copyright and other proprietary rights relating to the article, such as patent rights,
The right to use the substance of the article in the author's own future works, including lectures and books,
The right to reproduce the article for the author's own purposes,
The right to self-archive the article (please read our deposit policy),
The right to enter into separate, additional contractual arrangements for the non-exclusive distribution of the article's published version (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal (Register: Jurnal Ilmiah Teknologi Sistem Informasi).
5. Co-Authorship
If the article was jointly prepared by more than one author, the author submitting the manuscript warrants that he/she has been authorized by all co-authors to agree to this copyright and license notice (agreement) on their behalf, and agrees to inform the co-authors of the terms of this policy. Register will not be held liable for anything that may arise from internal disputes among the authors. Register will only communicate with the corresponding author.
6. Royalties
Because Register is an open-access journal that disseminates articles free of charge under the Creative Commons license terms mentioned above, the author(s) acknowledge that Register entitles them to no royalties or other fees.
7. Miscellaneous
Register will publish the article (or have it published) in the journal if the article's editorial process is successfully completed. Register's editors may modify the article to a style of punctuation, spelling, capitalization, referencing, and usage that they deem appropriate. The author acknowledges that the article may be published so that it will be publicly accessible, and that such access will be free of charge for readers as mentioned in point 3.