Penerapan Algoritma TextRank dan Dice Similarity Untuk Verifikasi Berita Hoax (Application of the TextRank and Dice Similarity Algorithms for Hoax News Verification)

Authors

  • Christian Khontoro, Informatics Study Program
  • Justinus Andjarwirawan, Informatics Study Program
  • Yulia Yulia, Informatics Study Program

Abstract

A hoax is fake news, or news that has no source. Hoaxes are pieces of misleading information presented as truth [5]. This problem is the motivation for building a verification system for hoax news. The TextRank and Dice Similarity algorithms are used to help verify whether an input news item is a hoax or a fact. In this study, TextRank finds the most important keywords in a news item, which are then used as search engine queries. Dice Similarity then measures how similar the input news is to the news returned by the search engine. The verification system was tested with several similarity weights to find the optimal one. The test data consisted of 50 hoax news items and 50 factual news items. The optimal similarity weight was 40%, with an accuracy of 84%. Of the 50 hoax items, 47 were declared hoaxes, 2 were declared facts, and 1 was declared unknown. Of the 50 factual items, 37 were declared facts, 13 were declared hoaxes, and none was declared unknown.
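The two algorithms named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tokenizer, the co-occurrence window, the damping factor, the iteration count, and the function names `tokenize`, `dice_similarity`, and `textrank_keywords` are all assumptions made for illustration, and the sketch omits the search engine step and the hoax/fact/unknown decision rule, which the abstract does not specify. The last lines only recompute the reported accuracy from the counts given in the abstract.

```python
import re
from collections import defaultdict


def tokenize(text):
    # Assumption: lowercase alphanumeric word tokens, no stemming or stopword removal.
    return re.findall(r"[a-z0-9]+", text.lower())


def dice_similarity(text_a, text_b):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) over the two sets of word tokens."""
    set_a, set_b = set(tokenize(text_a)), set(tokenize(text_b))
    if not set_a and not set_b:
        return 0.0  # convention chosen here for two empty texts
    return 2 * len(set_a & set_b) / (len(set_a) + len(set_b))


def textrank_keywords(text, top_k=5, window=2, damping=0.85, iterations=50):
    """Rank words with a PageRank-style update on an undirected co-occurrence graph."""
    tokens = tokenize(text)
    neighbors = defaultdict(set)
    # Link each word to the distinct words that follow it within `window` positions.
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)
    scores = {word: 1.0 for word in neighbors}
    # Fixed number of iterations instead of a convergence test (assumption).
    for _ in range(iterations):
        scores = {
            word: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[word])
            for word in neighbors
        }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Accuracy recomputed from the counts reported in the abstract:
# 47 of 50 hoax items and 37 of 50 factual items were classified correctly.
correct = 47 + 37
total = 50 + 50
accuracy = correct / total  # 84 / 100
```

For example, `dice_similarity("a b c", "b c d")` compares two three-word sets that share two words, so it evaluates 2·2 / (3 + 3). In a pipeline like the one described, the output of `textrank_keywords` would form the search query, and `dice_similarity` would score each retrieved article against the input news.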

References

[1] Fikri A.D 2019. Perbandingan metode Dice Similarity dengan Cosine Similarity menggunakan Query Expansion pada pencarian Ayatul Ahkam dalam terjemah Alquran berbahasa Indonesia. Teknik Informatika UIN Malang. 5(4), 3-12. URL = http://etheses.uinmalang.ac.id/13814/1/13650031.pdf

[2] Farisa F.C 2019. Ini Empat Ciri Hoaks Menurut Kominfo. Retrieved May 30, 2020, from https://nasional.kompas.com/read/2019/08/20/14512191/iniempat-ciri-hoaks-menurut-kominfo

[3] Informatikalogi 2019. Vector Space Model (VSM) dan Pengukuran Jarak pada Information Retrieval (IR). Retrieved May 20, 2020, from https://informatikalogi.com/vector-space-modelpengukuran-jarak/

[4] Joshi 2018. An Introduction to Text Summarization using the TextRank Algorithm. Retrieved May 26, 2020, from https://www.analyticsvidhya.com/blog/2018/11/introduction-text-summarization-textrank-python/

[5] Kiram. 2019, June 13. Yuk, Kenal Lebih Jauh tentang Hoax biar Nggak Kemakan Hoax! Retrieved May 23, 2020, from https://www.quipper.com/id/blog/tips-trick/kenal-lebihjauh-tentang-hoax/

[6] Nugroho S.K 2019. Confusion Matrix untuk Evaluasi Model pada Supervised Learning. Retrieved December 12, 2020, from https://medium.com/@ksnugroho/confusion-matrixuntuk-evaluasi-model-pada-unsupervised-machine-learningbc4b1ae9ae3f

[7] Sucipto & Indriati R. 2018. Deteksi Hoax Pada Media Sosial Berbasis Text Mining Classification System. Jurnal Informatika PGRI Kediri, 1(3), 1-10. URL = http://simki.unpkediri.ac.id/mahasiswa/file_artikel/2018/14.1.03.03.0052.pdf

[8] Syabab 2019. Apa itu Web Scrapping? Retrieved May 26, 2020, from https://pesonainformatika.com/other-notes/apaitu-web-scraping/

[9] Wisnubrata 2019. Dampak Buruk Berita Hoax pada Kesehatan Mental, Ini Penjelasannya. Retrieved May 20, 2020, from https://lifestyle.kompas.com/read/2019/10/08/120209420/dampak-buruk-berita-hoax-pada-kesehatan-mental-inipenjelasannya?page=all

Published

2021-04-10

Section

Articles