Application of Machine Learning to Detect Fake Accounts on Instagram

Hendy Gunawan, Yulia Yulia, Gregorius Satia Budhi


Instagram is the fourth most used social media platform by number of active users. Many people try to increase their follower counts in order to gain fame or to appear trustworthy. To that end, fake accounts are created to inflate follower numbers, and they also serve as vehicles for crimes such as fraud and cyberbullying. This flexibility and breadth of use have made Instagram a platform on which fake accounts proliferate. In this research, a web-based application was designed that detects whether an Instagram account is fake or real. Detection uses machine learning with four methods: Support Vector Machine, Naïve Bayes, Random Forest, and Adaptive Boosting (AdaBoost). The performance of the four methods is compared to find the one best suited to distinguishing fake from real accounts. K-fold cross-validation is used to reduce the risk of overfitting. In the tests carried out, AdaBoost achieved the highest accuracy at 92.5%, followed by Random Forest at 91.7%, Support Vector Machine at 90.7%, and Naïve Bayes at 83.6%.
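As an illustration of the k-fold cross-validation procedure mentioned in the abstract, the following minimal sketch evaluates a small Gaussian Naïve Bayes classifier using only the Python standard library. It is not the authors' implementation: the three features (followers, following, posts), the synthetic data generator `make_toy_accounts`, the choice of k = 5, and all function names are assumptions made for the example, since the abstract does not state the dataset, the features, or the number of folds.

```python
import math
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_gaussian_nb(X, y):
    """Estimate the prior, feature means, and feature variances per class."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        columns = list(zip(*rows))
        means = [sum(col) / len(rows) for col in columns]
        # A small constant keeps the variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / len(rows) + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (len(rows) / len(y), means, variances)
    return model


def predict_gaussian_nb(model, x):
    """Return the class with the highest log-posterior for one sample."""
    def log_posterior(label):
        prior, means, variances = model[label]
        score = math.log(prior)
        for v, m, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)


def cross_validate(X, y, k=5, seed=0):
    """Train on k-1 folds, test on the held-out fold, and repeat k times."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(X)) if i not in held]
        model = fit_gaussian_nb([X[i] for i in train_idx],
                                [y[i] for i in train_idx])
        correct = sum(predict_gaussian_nb(model, X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return scores


def make_toy_accounts(n_per_class=100, seed=42):
    """Hypothetical synthetic rows (followers, following, posts); 1 = fake, 0 = real."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_per_class):
        X.append([rng.gauss(800, 200), rng.gauss(400, 100), rng.gauss(120, 40)])
        y.append(0)
        X.append([rng.gauss(50, 30), rng.gauss(1500, 400), rng.gauss(3, 2)])
        y.append(1)
    return X, y


X, y = make_toy_accounts()
fold_scores = cross_validate(X, y, k=5)
mean_accuracy = sum(fold_scores) / len(fold_scores)
print("fold accuracies:", fold_scores)
print("mean accuracy:", round(mean_accuracy, 3))
```

Because every sample is held out exactly once, the mean of the fold accuracies gives a less optimistic estimate than accuracy measured on the training data. The same loop could wrap any of the four classifiers named in the abstract, which is how a like-for-like comparison between them would typically be set up.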


Keywords: Machine Learning; Support Vector Machine; Naïve Bayes; Random Forest; Adaptive Boosting; Instagram Account Detection





