Bilingual and Emoji Sentiment Analysis Application for Instagram Social Media Comments Using the Support Vector Machine Method
Abstract
Indonesia has the fourth-largest number of Instagram users in the world. This motivates businesses to promote their products and services by having content creators review them and upload the reviews to Instagram. Businesses then need to evaluate these uploads to assess whether a promotion received a positive or negative response from netizens. One way to do so is to check the comments section. Instagram comments are written not only in Indonesian but also in English, and they often include emojis. Checking them manually, however, takes a lot of time. An application is therefore needed that can detect sentiment in bilingual Instagram comments that contain emojis. The system was built with the Support Vector Machine method, which is used to classify the language, Indonesian sentiment, and English sentiment of each comment, and it was evaluated by accuracy. The data are samples of comments on uploads in the form of posts, reels, and IGTV. Combinations of preprocessing steps (cleansing, normalization, stopword removal, and stemming) and parameter tuning with GridSearchCV were also tested to find the best model. The system consists of three models: a language classification model with the labels Indonesia, Inggris, and Campuran (Indonesian, English, and mixed), an Indonesian sentiment classification model, and an English sentiment classification model, with the sentiment models using positive, neutral, and negative labels. The best accuracies obtained for language classification, Indonesian sentiment, and English sentiment are 88.77%, 73.10%, and 71.56%, respectively. In addition, emojis should be included in the analysis, because the model that analyzes emojis achieves 3.875% higher accuracy than the model that ignores them.
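The preprocessing steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the slang dictionary, the stopword set, and the emoji character ranges below are small hypothetical stand-ins, and stemming (for which the paper's reference list points to the Sastrawi library) is left out so the example needs only the Python standard library. The `keep_emoji` flag mimics the comparison between a model that analyzes emojis and one that ignores them.

```python
import re

# Hypothetical, deliberately tiny resources; the lists used in the paper are not given.
SLANG = {"gk": "tidak", "bgt": "banget", "u": "you"}
STOPWORDS = {"yang", "dan", "di", "the", "is", "a"}

# Assumption: a rough emoji range covering common pictographs and symbols only.
EMOJI = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")
URL_OR_TAG = re.compile(r"https?://\S+|[@#]\w+")
WORD = re.compile(r"[a-z]+")


def preprocess(comment, keep_emoji=True):
    """Return a token list for one Instagram comment."""
    # Cleansing: lowercase, then drop URLs, mentions, and hashtags.
    text = URL_OR_TAG.sub(" ", comment.lower())
    # Emojis are collected separately so they can be kept or ignored.
    emojis = EMOJI.findall(text) if keep_emoji else []
    # Normalization: map slang to a standard form where the dictionary knows it.
    words = [SLANG.get(w, w) for w in WORD.findall(text)]
    # Stopword removal.
    words = [w for w in words if w not in STOPWORDS]
    # Emoji tokens are appended after the words, so their position is not preserved.
    return words + emojis


print(preprocess("Bagus bgt \U0001F60D @toko https://x.co"))
print(preprocess("The product is good \U0001F44D", keep_emoji=False))
```

The resulting tokens would then be vectorized and passed to the classifier. Running the same pipeline with `keep_emoji=True` and `keep_emoji=False` is one simple way to set up the kind of with-emoji versus without-emoji comparison the abstract reports.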