Classification in Building an Online News Portal Using the BERT Method

Authors

  • Jehezkiel Hardwin Tandijaya, Informatics Study Program
  • Liliana Liliana, Informatics Study Program
  • Indar Sugiarto, Electrical Engineering Study Program

Keywords:

Aquaponic edutourism, Denpasar, Bali, aquaponics, hydroponics, aquaculture, solar energy, Tri Angga, vernacular architecture, less soil, urban farming.

Abstract

The internet helps people by making information from many online news platforms accessible. However, so much news is now available across different online news platforms that it needs to be categorized. Some sources report on events with low credibility, because their publishers use false and misleading information to push their agendas. To check the credibility of a report, readers therefore need to consult other sources rather than rely on a single one. This is inefficient, however, because the reader has to look up each additional news source at a different URL.

In this research, scraping is used to retrieve the news available on a news platform. After scraping, each news item is classified to determine its category. The classification method used is Bidirectional Encoder Representations from Transformers (BERT).
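The scrape-then-classify flow described here can be illustrated with a minimal sketch. It is not the authors' implementation: the page layout (an `<h1>` headline followed by `<p>` body paragraphs), the sample HTML, and the helper names `ArticleParser`, `extract_article`, and `classify_with_bert` are all assumptions made for illustration. The BERT step assumes the `torch` and `transformers` packages are installed and that the checkpoint loads with `BertTokenizer` and `BertForSequenceClassification`; without fine-tuning on labelled news, its classification head is randomly initialised and its predictions carry no meaning.

```python
from html.parser import HTMLParser


class ArticleParser(HTMLParser):
    """Collect the text of <h1> (headline) and <p> (body) elements.

    Assumes a hypothetical page layout; real news sites need their own rules.
    """

    def __init__(self):
        super().__init__()
        self.title = ""
        self.paragraphs = []
        self._tag = None    # tag currently being captured, if any
        self._buffer = []   # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "p") and self._tag is None:
            self._tag = tag
            self._buffer = []

    def handle_data(self, data):
        if self._tag is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._tag:
            # Collapse runs of whitespace left over from the markup.
            text = " ".join("".join(self._buffer).split())
            if tag == "h1":
                self.title = text
            elif text:
                self.paragraphs.append(text)
            self._tag = None


def extract_article(html):
    """Return the headline and the joined body text of one article page."""
    parser = ArticleParser()
    parser.feed(html)
    parser.close()
    return {"title": parser.title, "body": " ".join(parser.paragraphs)}


def classify_with_bert(texts, labels, model_name="indobenchmark/indobert-base-p1"):
    """Predict one label per text (hypothetical helper, not run in this sketch).

    The classification head is randomly initialised until the model is
    fine-tuned on labelled news, so untuned predictions are not meaningful.
    """
    import torch
    from transformers import BertForSequenceClassification, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained(model_name)
    model = BertForSequenceClassification.from_pretrained(
        model_name, num_labels=len(labels)
    )
    model.eval()
    inputs = tokenizer(
        texts, padding=True, truncation=True, max_length=512, return_tensors="pt"
    )
    with torch.no_grad():
        logits = model(**inputs).logits
    return [labels[i] for i in logits.argmax(dim=-1).tolist()]


# Made-up sample page standing in for a scraped article.
sample_html = """
<html><body>
  <h1>Timnas Indonesia menang 2-0</h1>
  <p>Pertandingan berlangsung   di Jakarta.</p>
  <p>Gol dicetak pada babak kedua.</p>
</body></html>
"""
article = extract_article(sample_html)
```

In a working system, `article["body"]` would be passed to a fine-tuned classifier, for example `classify_with_bert([article["body"]], labels)` with a label list of the portal's categories. The category set is not given in this abstract, so none is assumed here.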

Testing shows that the news can be retrieved and classified. With the pre-trained model indobenchmark/indobert-base-p1, classification achieves good results, with accuracy reaching 87.548%.
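For reference, accuracy is conventionally the share of test items whose predicted category matches the true category. The sketch below shows that calculation on made-up labels. The function name `accuracy` and the eight sample labels are illustrative assumptions and do not come from the study's test set.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one labelled example is required")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Illustrative labels only: 7 of the 8 predictions match.
true_labels = ["sport", "politics", "tech", "sport", "health", "tech", "politics", "sport"]
pred_labels = ["sport", "politics", "tech", "sport", "health", "tech", "sport", "sport"]
score = accuracy(true_labels, pred_labels)
```

Multiplying `score` by 100 gives a percentage on the same scale as the figure reported in this abstract.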


Published

2021-10-13

Section

Articles