Identifying Degree-of-Concern on COVID-19 topics with text classification of Twitters
DOI:
https://doi.org/10.26594/register.v7i1.2234
Keywords:
COVID-19, degree-of-concern, Deep Learning, Twitter text classification, word embedding
Abstract
The COVID-19 pandemic has changed people's behavior, both socially and individually. This study identifies the Degree-of-Concern on COVID-19 topics from citizen conversations on Twitter. It aims to help related parties make policies for appropriate emergency response strategies that address these pandemic-driven changes in behavior. The data set consists of 12,000 data items collected from verified Twitter accounts in Surabaya. Because Twitter content is highly varied, it must be classified before specific COVID-19 topics can be addressed. The first stage of classification separates the Twitter data into COVID-19 and non-COVID-19. The second stage classifies the COVID-19 data into seven classes: warnings and suggestions, notification of information, donations, emotional support, seeking help, criticism, and hoaxes. Classification is carried out using combinations of word embedding (Word2Vec and fastText) and deep learning methods (CNN, RNN, and LSTM). Experiments were run in three scenarios, each with a different amount of training data. The highest accuracy, 97.3% for the first stage and 99.4% for the second stage, was obtained with the combination of fastText and LSTM. The results show that classification of COVID-19 topics can be used to identify the Degree-of-Concern properly. The identified Degree-of-Concern can serve as a basis for related parties in making policies to formulate appropriate emergency response strategies for changes in public behavior due to a pandemic.
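The two-stage flow described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the study's implementation: the names `degree_of_concern`, `toy_is_covid`, and `toy_topic_of` are hypothetical, the keyword rules stand in for the trained word-embedding and deep learning classifiers, and reading Degree-of-Concern as each class's share of COVID-19 tweets is an assumption, since the abstract does not define the measure.

```python
from collections import Counter
from typing import Callable, Dict, Iterable

# The seven second-stage classes named in the abstract.
TOPIC_CLASSES = (
    "warnings and suggestions",
    "notification of information",
    "donations",
    "emotional support",
    "seeking help",
    "criticism",
    "hoaxes",
)


def degree_of_concern(
    tweets: Iterable[str],
    is_covid: Callable[[str], bool],
    topic_of: Callable[[str], str],
) -> Dict[str, float]:
    """Run both stages and return each class's share of COVID-19 tweets.

    Assumption: the share per class is used here as a stand-in for
    Degree-of-Concern; the abstract does not give a formula.
    """
    counts = Counter()
    for text in tweets:
        if not is_covid(text):      # stage 1: drop non-COVID-19 tweets
            continue
        label = topic_of(text)      # stage 2: assign one of the seven classes
        if label not in TOPIC_CLASSES:
            raise ValueError(f"unknown topic label: {label!r}")
        counts[label] += 1
    total = sum(counts.values())
    return {c: (counts[c] / total if total else 0.0) for c in TOPIC_CLASSES}


# Hypothetical stand-ins for the trained models (keyword rules only,
# not the fastText + LSTM classifiers evaluated in the study).
def toy_is_covid(text: str) -> bool:
    return "covid" in text.lower()


def toy_topic_of(text: str) -> str:
    lowered = text.lower()
    if "donat" in lowered:
        return "donations"
    if "help" in lowered:
        return "seeking help"
    return "notification of information"


sample = [
    "COVID-19 update: new cases reported today",
    "Please donate masks for COVID wards",
    "Traffic jam downtown this morning",
    "Need help finding a COVID test site",
]
shares = degree_of_concern(sample, toy_is_covid, toy_topic_of)
```

Passing the two classifiers in as callables keeps the routing logic independent of the model choice, so any of the embedding and network combinations compared in the study could be plugged in behind the same interface.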
License
The rights and licenses that apply in Register: Jurnal Ilmiah Teknologi Sistem Informasi are set out below. By submitting an article or manuscript, the author(s) agree to this policy. No specific document sign-off is required.
1. License
Non-commercial use of the article is governed by the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
2. Author(s)' Warranties
The author warrants that the article is original, was written by the stated author(s), has not been published before, contains no unlawful statements, does not infringe the rights of others, is subject to copyright that is vested exclusively in the author and free of any third-party rights, and that any necessary written permissions to quote from other sources have been obtained by the author(s).
3. User/Public Rights
Register's spirit is to disseminate published articles as freely as possible. Under the Creative Commons license, Register permits users to copy, distribute, display, and perform the work for non-commercial purposes only. Users must also attribute the authors and Register when distributing works in the journal and other publication media. Unless otherwise stated, the authors are public entities as soon as their articles are published.
4. Rights of Authors
Authors retain all their rights to the published works, including (but not limited to) the following:
Copyright and other proprietary rights relating to the article, such as patent rights,
The right to use the substance of the article in their own future works, including lectures and books,
The right to reproduce the article for their own purposes,
The right to self-archive the article (please read our deposit policy),
The right to enter into separate, additional contractual arrangements for the non-exclusive distribution of the article's published version (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal (Register: Jurnal Ilmiah Teknologi Sistem Informasi).
5. Co-Authorship
If the article was jointly prepared by more than one author, the author submitting the manuscript warrants that he/she has been authorized by all co-authors to agree to this copyright and license notice (agreement) on their behalf, and agrees to inform his/her co-authors of the terms of this policy. Register will not be held liable for anything that may arise from internal disputes among the author(s). Register will only communicate with the corresponding author.
6. Royalties
Because Register is an open-access journal that disseminates articles for free under the Creative Commons license terms mentioned above, the author(s) are aware that Register entitles them to no royalties or other fees.
7. Miscellaneous
Register will publish the article (or have it published) in the journal if the article’s editorial process is successfully completed. Register's editors may modify the article to a style of punctuation, spelling, capitalization, referencing, and usage that they deem appropriate. The author acknowledges that the article may be published so that it is publicly accessible, and that such access will be free of charge for readers, as mentioned in point 3.