Application of the TextRank and Dice Similarity Algorithms for Hoax News Verification

Christian Khontoro(1*), Justinus Andjarwirawan(2), Yulia Yulia(3)


(1) Informatics Study Program
(2) Informatics Study Program
(3) Informatics Study Program
(*) Corresponding Author

Abstract


A hoax, or hoaks in Indonesian, is fake news or news that has no source. Hoaxes are misleading information presented as truth [5]. This problem is the motivation for building a verification system for hoax news. The TextRank and Dice Similarity algorithms are used to help verify whether an input news item is a hoax or a fact. In this study, TextRank extracts the most important keywords from a news item, and these keywords are then used as the query submitted to a search engine. Dice Similarity then measures how similar the input news is to the news returned by the search engine. The system was tested with several similarity weights to find the optimal one. The test data consisted of 50 hoax news items and 50 fact news items. The optimal similarity weight was 40%, which gave an accuracy of 84%. Of the 50 hoax items, 47 were declared hoax, 2 were declared fact, and 1 was declared unknown. Of the 50 fact items, 37 were declared fact, 13 were declared hoax, and none was declared unknown.
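The similarity measurement and the reported accuracy can be illustrated with a minimal Python sketch. It assumes Dice Similarity is computed over sets of lowercase word tokens, which the abstract does not specify. The function names and the tokenizer are hypothetical, and only the 40% weight and the test counts come from the results above.

```python
import re

# Similarity weight reported as optimal in the abstract (40%).
SIMILARITY_WEIGHT = 0.40


def tokenize(text):
    """Assumed tokenizer: the set of lowercase word tokens in the text."""
    return set(re.findall(r"\w+", text.lower()))


def dice_similarity(text_a, text_b):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) over the two token sets."""
    a, b = tokenize(text_a), tokenize(text_b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty texts
    return 2 * len(a & b) / (len(a) + len(b))


def meets_weight(input_news, found_news, weight=SIMILARITY_WEIGHT):
    """True when a search result is at least `weight` similar to the input."""
    return dice_similarity(input_news, found_news) >= weight


# Accuracy from the reported counts: correctly labelled items / all items.
correct = 47 + 37   # hoax declared hoax + fact declared fact
total = 50 + 50     # hoax items + fact items
accuracy = correct / total  # 0.84
```

How the system combines the per-result similarity scores into a final hoax, fact, or unknown verdict is not described in the abstract, so the sketch stops at the per-pair comparison.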

Keywords


Hoax Verification; Dice Similarity; Keyword Extraction; TextRank; Web Scraping


References


Fikri A.D. 2019. Perbandingan metode Dice Similarity dengan Cosine Similarity menggunakan Query Expansion pada pencarian Ayatul Ahkam dalam terjemah Alquran berbahasa Indonesia. Teknik Informatika UIN Malang. (4), 3-12. URL = http://etheses.uinmalang.ac.id/13814/1/13650031.pdf

Farisa F.C. 2019. Ini Empat Ciri Hoaks Menurut Kominfo. Retrieved May 30, 2020, from https://nasional.kompas.com/read/2019/08/20/14512191/iniempat-ciri-hoaks-menurut-kominfo

Informatikalogi 2019. Vector Space Model (VSM) dan Pengukuran Jarak pada Information Retrieval (IR). Retrieved May 20, 2020, from https://informatikalogi.com/vector-space-modelpengukuran-jarak/

Joshi 2018. An Introduction to Text Summarization using the TextRank Algorithm. Retrieved May 26, 2020, from https://www.analyticsvidhya.com/blog/2018/11/introduction-text-summarization-textrank-python/

Kiram. 2019, June 13. Yuk, Kenal Lebih Jauh tentang Hoax biar Nggak Kemakan Hoax! Retrieved May 23, 2020, from https://www.quipper.com/id/blog/tips-trick/kenal-lebihjauh-tentang-hoax/

Nugroho S.K. 2019. Confusion Matrix untuk Evaluasi Model pada Supervised Learning. Retrieved December 12, 2020, from https://medium.com/@ksnugroho/confusion-matrixuntuk-evaluasi-model-pada-unsupervised-machine-learningbc4b1ae9ae3f

Sucipto & Indriati R. 2018. Deteksi Hoax Pada Media Sosial Berbasis Text Mining Classification System. Jurnal Informatika PGRI Kediri, 1(3), 1-10. URL = http://simki.unpkediri.ac.id/mahasiswa/file_artikel/2018/14.03.03.0052.pdf

Syabab 2019. Apa itu Web Scrapping? Retrieved May 26, from https://pesonainformatika.com/other-notes/apaitu-web-scraping/

Wisnubrata 2019. Dampak Buruk Berita Hoax pada Kesehatan Mental, Ini Penjelasannya. Retrieved May 20, from https://lifestyle.kompas.com/read/2019/10/08/120209420/dampak-buruk-berita-hoax-pada-kesehatan-mental-inipenjelasannya?page=all

