Improving Aspect-Based Sentiment Analysis for Hotel Reviews with Latent Dirichlet Allocation and Machine Learning Algorithms
DOI:
https://doi.org/10.26594/register.v9i2.3441
Keywords:
Aspect Based Sentiment, Latent Dirichlet Allocation, Machine Learning Algorithms, Customer Service Industries, Automated Review Analysis
Abstract
The rapid expansion of online platforms has resulted in a deluge of user-generated content, emphasizing the need for sentiment analysis to gauge public opinion. Aspect-based sentiment analysis is now essential for uncovering intricate opinions within product reviews, social media posts, and online texts. Despite the potential of these techniques, the complexity of human emotions and diverse language nuances pose significant challenges. Our study examines the importance and trends of sentiment and aspect-based sentiment analysis in automated review analysis, with a primary focus on Indonesian-language hotel reviews. Our research underscores the need for nuanced tools to unravel multifaceted sentiments. We propose an automation framework that utilizes Latent Dirichlet Allocation (LDA) for feature extraction. We evaluate LDA's performance, enhance it through filtration, and enrich it by integrating it with Word2Vec and Doc2Vec. Our methodology encompasses various machine learning algorithms, including Logistic Regression (LR), Stochastic Gradient Descent (SGD), Support Vector Machine (SVM), Random Forest (RF), and Light Gradient Boosting Machine (LGBM). Empirical results reveal that the optimal combination involves LDA bigram and Word2Vec, alongside the LGBM classifier, yielding an average F1 score of 86.6 across ten aspects. This contribution advances automated aspect-based sentiment analysis, offering concrete implications for e-commerce, marketing, and customer service. Our insights inform precise marketing strategies and enhance customer experiences, underscoring the research's relevance in the digital landscape.
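The reported figure is an F1 score averaged over aspects. As a minimal sketch of how such an average can be computed, the Python below scores each aspect separately and then takes the unweighted mean on a 0-100 scale. It assumes binary sentiment labels per aspect and an unweighted mean; the abstract does not state either choice. The aspect names and labels are hypothetical toy data, not results from the study.

```python
from typing import Dict, List, Tuple


def f1_score(y_true: List[int], y_pred: List[int], positive: int = 1) -> float:
    """F1 for one class: the harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def average_f1(per_aspect: Dict[str, Tuple[List[int], List[int]]]) -> float:
    """Unweighted mean of per-aspect F1 scores, reported on a 0-100 scale.

    Assumption: every aspect counts equally, regardless of how many
    reviews mention it.
    """
    scores = [f1_score(y_true, y_pred) for y_true, y_pred in per_aspect.values()]
    return 100.0 * sum(scores) / len(scores)


# Hypothetical toy labels for two aspects (1 = positive, 0 = negative).
# Each entry is (true labels, predicted labels).
toy = {
    "cleanliness": ([1, 1, 0, 0], [1, 0, 0, 0]),
    "service": ([1, 0, 1, 1], [1, 0, 1, 1]),
}
result = average_f1(toy)
```

Averaging per aspect keeps a frequently mentioned aspect from dominating the summary number, at the cost of giving rarely mentioned aspects the same weight as common ones.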
License
Copyright (c) 2023 Nuraisa Novia Hidayati
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
The rights and licenses for Register: Jurnal Ilmiah Teknologi Sistem Informasi are set out below. By submitting the article or its manuscript, the author(s) agree to this policy. No specific document sign-off is required.
1. License
Non-commercial use of the article is governed by the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, as currently displayed.
2. Author(s)' Warranties
The author warrants that the article is original, is written by the stated author(s), has not been published before, contains no unlawful statements, does not infringe the rights of others, is subject to copyright that is vested exclusively in the author and free of any third-party rights, and that any necessary written permissions to quote from other sources have been obtained by the author(s).
3. User/Public Rights
Register's spirit is to disseminate published articles as freely as possible. Under the Creative Commons license, Register permits users to copy, distribute, display, and perform the work for non-commercial purposes only. Users must also attribute the authors and Register when distributing works in the journal and other publication media. Unless otherwise stated, the authors are public entities once their articles are published.
4. Rights of Authors
Authors retain all their rights to the published works, such as (but not limited to) the following rights:
Copyright and other proprietary rights relating to the article, such as patent rights,
The right to use the substance of the article in their own future works, including lectures and books,
The right to reproduce the article for their own purposes,
The right to self-archive the article (please read our deposit policy),
The right to enter into separate, additional contractual arrangements for the non-exclusive distribution of the article's published version (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal (Register: Jurnal Ilmiah Teknologi Sistem Informasi).
5. Co-Authorship
If the article was jointly prepared by more than one author, the author submitting the manuscript warrants that he/she has been authorized by all co-authors to agree to this copyright and license notice (agreement) on their behalf, and agrees to inform his/her co-authors of the terms of this policy. Register will not be held liable for anything that may arise from an internal dispute among the authors. Register will only communicate with the corresponding author.
6. Royalties
Because Register is an open-access journal that disseminates articles for free under the Creative Commons license terms mentioned above, the author(s) acknowledge that Register entitles them to no royalties or other fees.
7. Miscellaneous
Register will publish the article (or have it published) in the journal if the article’s editorial process is successfully completed. Register's editors may modify the article to a style of punctuation, spelling, capitalization, referencing, and usage that they deem appropriate. The author acknowledges that the article may be published so that it will be publicly accessible, and such access will be free of charge for readers as mentioned in point 3.