Application of a Recurrent Neural Network for Automatic Extractive Summarization of Indonesian-Language News

Kristian Halim, Henry Novianus Palit, Alvin Nathaniel Tjondrowiguno

Abstract


Technological advancement in the modern world allows a huge amount of information to flow every day, and news has become one of the main sources of that information. Because of this advancement, the volume of available news has grown, so a program was developed to summarize news automatically and reduce reading time, with a neural network as its basis.

The method used for training the model is a Recurrent Neural Network, specifically a Gated Recurrent Unit (GRU) run at two levels: first the word level, then the sentence level. To build the model, several experiments were carried out: changing the initial weights of the word embedding, changing the pooling method, removing the dropout layer, and applying some preprocessing to the dataset.
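The two-level GRU described above can be sketched as follows: a word-level GRU encodes each sentence's word embeddings, average pooling collapses the word states into one vector per sentence, and a sentence-level GRU scores each sentence for extraction. This is a minimal NumPy sketch, not the paper's implementation; all dimensions, parameter shapes, and the random (untrained) weights are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def make_gru(input_dim, hidden_dim, rng):
    # One (W, U, b) parameter set per gate: update (z), reset (r), candidate (h).
    return {g: (rng.normal(0, 0.1, (hidden_dim, input_dim)),
                rng.normal(0, 0.1, (hidden_dim, hidden_dim)),
                np.zeros(hidden_dim)) for g in "zrh"}

def gru_run(params, xs, hidden_dim):
    """Run a GRU over a sequence of input vectors; return all hidden states."""
    h = np.zeros(hidden_dim)
    states = []
    for x in xs:
        Wz, Uz, bz = params["z"]
        Wr, Ur, br = params["r"]
        Wh, Uh, bh = params["h"]
        z = sigmoid(Wz @ x + Uz @ h + bz)          # update gate
        r = sigmoid(Wr @ x + Ur @ h + br)          # reset gate
        h_tilde = np.tanh(Wh @ x + Uh @ (r * h) + bh)  # candidate state
        h = (1 - z) * h + z * h_tilde
        states.append(h)
    return np.array(states)

# Hypothetical sizes: vocabulary 50, embedding dim 8, hidden dim 6.
rng = np.random.default_rng(0)
embedding = rng.normal(0, 0.1, (50, 8))
word_gru = make_gru(8, 6, rng)
sent_gru = make_gru(6, 6, rng)
w_out = rng.normal(0, 0.1, 6)

# A toy document: 3 sentences given as lists of token ids.
doc = [[1, 4, 7, 2], [5, 9, 3], [8, 6, 2, 10, 11]]

# Word level: encode each sentence, then average-pool its word states.
sent_vecs = [gru_run(word_gru, embedding[ids], 6).mean(axis=0) for ids in doc]

# Sentence level: run the GRU over sentence vectors, score each sentence.
sent_states = gru_run(sent_gru, sent_vecs, 6)
probs = sigmoid(sent_states @ w_out)  # one extraction probability per sentence
```

An extractive summary would then keep the sentences with the highest probabilities; in training, these probabilities are compared against binary extraction labels.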

The results show that for the initial model, the F1-scores for ROUGE-1, ROUGE-2, and ROUGE-L reach up to 80% when an extractive summary is used as the reference and up to 50% when an abstractive summary is used. The experiments show that the best model initializes the word embedding weights from the training dataset, uses average pooling, and removes the dropout layer. This best configuration gives F1-scores of 84.10 for ROUGE-1, 83.10 for ROUGE-2, and 83.31 for ROUGE-L against the extractive reference, and 57.01 for ROUGE-1, 51.17 for ROUGE-2, and 55.10 for ROUGE-L against the abstractive reference.
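The ROUGE-N F1-score used in the evaluation measures n-gram overlap between a candidate summary and a reference summary, combining n-gram precision and recall. A minimal sketch of ROUGE-1/ROUGE-2 F1 (the example sentences are invented for illustration, not from the paper's dataset):

```python
from collections import Counter

def rouge_n_f1(candidate, reference, n=1):
    """ROUGE-N F1: harmonic mean of n-gram precision and recall."""
    def ngrams(text, n):
        tokens = text.split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    overlap = sum((cand & ref).values())  # clipped n-gram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

score = rouge_n_f1("pemerintah menaikkan harga bahan bakar",
                   "pemerintah resmi menaikkan harga bahan bakar minyak")
```

Here all 5 candidate unigrams appear in the 7-word reference, so precision is 1.0, recall is 5/7, and F1 is 10/12 (about 0.83); ROUGE-2 and ROUGE-L follow the same precision/recall scheme over bigrams and longest common subsequences, respectively.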


Keywords


Natural Language Processing; Recurrent Neural Network





