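Twitter Bot Spammer Detection Based on Time Interval Entropy and Global Vectors for Word Representations of Tweet Hashtags (Deteksi Bot Spammer Twitter Berbasis Time Interval Entropy dan Global Vectors for Word Representations Tweet’s Hashtag)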

Authors

  • Arif Mudi Priyatno Institut Teknologi Sepuluh Nopember
  • Muhammad Mirza Muttaqi Institut Teknologi Sepuluh Nopember
  • Fahmi Syuhada Institut Teknologi Sepuluh Nopember
  • Agus Zainal Arifin Institut Teknologi Sepuluh Nopember

DOI:

https://doi.org/10.26594/register.v5i1.1382

Keywords:

bot spammer, CNN, GloVe, hashtag, Twitter

Abstract

Bot spammers are accounts through which users misuse Twitter to spread spam messages of their choosing. The goal of the spam is to push a chosen topic into the trending topics. This study proposes bot spammer detection on Twitter based on Time Interval Entropy and global vectors for word representations (GloVe). Time Interval Entropy is used to classify bot accounts from the time series of their tweet creation times. GloVe is used to capture the co-occurrence of words in tweets that carry hashtags, and the resulting representations are classified with a Convolutional Neural Network (CNN). The data were collected through the Twitter API from 18 bot accounts and 14 legitimate accounts, with 1,000 tweets per account. The best recall, precision, and F-measure obtained were each 100%. These results indicate that GloVe and Time Interval Entropy detect bot spammers very well on this dataset, and that hashtags help improve bot spammer detection.
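The abstract does not state how Time Interval Entropy is computed. As a rough illustration only, the sketch below assumes it is the Shannon entropy of the gaps between consecutive tweets, with gaps grouped into fixed-width bins. The function name `time_interval_entropy`, the `bin_seconds` parameter, and the sample timestamps are hypothetical and are not taken from the paper. Under this assumption, an account that posts on a fixed schedule gets an entropy near zero, while an account with varied gaps gets a higher value.

```python
import math
from collections import Counter
from datetime import datetime, timedelta


def time_interval_entropy(timestamps, bin_seconds=60):
    """Shannon entropy (in bits) of the binned gaps between consecutive tweets.

    Assumption: gaps are grouped into bins of `bin_seconds`; the paper may
    use a different discretisation.
    """
    ordered = sorted(timestamps)
    gaps = [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]
    if not gaps:
        return 0.0
    # Count how many gaps fall into each bin.
    bins = Counter(int(gap // bin_seconds) for gap in gaps)
    total = len(gaps)
    # H = sum over bins of p * log2(1 / p)
    return sum((n / total) * math.log2(total / n) for n in bins.values())


start = datetime(2019, 1, 1)

# Hypothetical bot-like account: one tweet exactly every 10 minutes.
regular = [start + timedelta(minutes=10 * i) for i in range(9)]

# Hypothetical human-like account: every gap between tweets is different.
irregular = [start + timedelta(minutes=m) for m in (0, 1, 3, 7, 15, 31, 63, 127, 255)]

entropy_regular = time_interval_entropy(regular)
entropy_irregular = time_interval_entropy(irregular)
print(entropy_regular, entropy_irregular)
```

In a detector of this kind, a threshold on the entropy value (or the value itself as a feature) would separate schedule-driven accounts from irregular ones; the threshold would have to be chosen from labelled data.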

Author Biographies

Arif Mudi Priyatno, Institut Teknologi Sepuluh Nopember

Informatics Engineering

Muhammad Mirza Muttaqi, Institut Teknologi Sepuluh Nopember

Informatics Engineering

Fahmi Syuhada, Institut Teknologi Sepuluh Nopember

Informatics Engineering

Agus Zainal Arifin, Institut Teknologi Sepuluh Nopember

Informatics Engineering

Published

2019-01-01

How to Cite

[1]
A. M. Priyatno, M. M. Muttaqi, F. Syuhada, and A. Z. Arifin, “Deteksi Bot Spammer Twitter Berbasis Time Interval Entropy dan Global Vectors for Word Representations Tweet’s Hashtag”, Register: Jurnal Ilmiah Teknologi Sistem Informasi, vol. 5, no. 1, pp. 37–46, Jan. 2019.

Section

Article