Application of the TextRank and Dice Similarity Algorithms for Hoax News Verification
Abstract
A hoax is fake news or news that has no source: a body of misleading information that is presented as truth [5]. This problem is the motivation for building a verification system for hoax news. The TextRank and Dice Similarity algorithms are used to help verify whether an input news item is a hoax or a fact. In this study, the TextRank algorithm finds the most important keywords in a news item, which are then used as the query for a search engine. The Dice Similarity algorithm then measures how similar the input news is to the news returned in the search results. The resulting hoax verification system was tested with several similarity weights to find the optimal one. The data used were 50 hoax news items and 50 fact news items. In this test, the optimal similarity weight was 40%, with an accuracy of 84%. Of the 50 hoax news items, 47 were declared hoax, 2 were declared fact, and 1 was declared unknown. Of the 50 fact news items, 37 were declared fact, 13 were declared hoax, and none were declared unknown.
References
[1] Fikri A.D 2019. Perbandingan metode Dice Similarity
dengan Cosine Similarity menggunakan Query Expansion
pada pencarian Ayatul Ahkam dalam terjemah Alquran
berbahasa Indonesia. Teknik Informatika UIN Malang.
5(4), 3-12. URL = http://etheses.uinmalang.ac.id/13814/1/13650031.pdf
[2] Farisa F.C 2019. Ini Empat Ciri Hoaks Menurut Kominfo.
Retrieved May 30, 2020, from
https://nasional.kompas.com/read/2019/08/20/14512191/iniempat-ciri-hoaks-menurut-kominfo
[3] Informatikalogi 2019. Vector Space Model (VSM) dan
Pengukuran Jarak pada Information Retrieval (IR).
Retrieved May 20, 2020 from
https://informatikalogi.com/vector-space-modelpengukuran-jarak/
[4] Joshi 2018. An Introduction to Text Summarization using
the TextRank Algorithm. Retrieved May 26, 2020, from
https://www.analyticsvidhya.com/blog/2018/11/introduction-text-summarization-textrank-python/
[5] Kiram 2019, June 13. Yuk, Kenal Lebih Jauh tentang Hoax
biar Nggak Kemakan Hoax! Retrieved May 23, 2020, from
https://www.quipper.com/id/blog/tips-trick/kenal-lebihjauh-tentang-hoax/
[6] Nugroho S.K 2019. Confusion Matrix untuk Evaluasi Model
pada Supervised Learning. Retrieved December 12, 2020,
[7] Sucipto & Indriati R. 2018. Deteksi Hoax Pada Media
Sosial Berbasis Text Mining Classification System. Jurnal
Informatika PGRI Kediri, 1(3), 1-10. URL =
http://simki.unpkediri.ac.id/mahasiswa/file_artikel/2018/14.1.03.03.0052.pdf
[8] Syabab 2019. Apa itu Web Scrapping?. Retrieved May 26,
2020, from https://pesonainformatika.com/other-notes/apaitu-web-scraping/
[9] Wisnubrata 2019. Dampak Buruk Berita Hoax pada
Kesehatan Mental, Ini Penjelasannya. Retrieved May 20,
2020, from
https://lifestyle.kompas.com/read/2019/10/08/120209420/dampak-buruk-berita-hoax-pada-kesehatan-mental-inipenjelasannya?page=all
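
As an illustration of the similarity step described in the abstract, the following is a minimal Python sketch of the Dice coefficient, 2|A ∩ B| / (|A| + |B|), computed over sets of word tokens, together with the accuracy arithmetic reported there. The tokenization (lowercased, whitespace-split word sets), the function names `dice_similarity` and `accuracy`, and the two sample sentences are assumptions made for illustration only; the preprocessing and decision rule of the actual system may differ.

```python
def dice_similarity(text_a: str, text_b: str) -> float:
    """Dice coefficient over word sets: 2|A & B| / (|A| + |B|).

    Assumption: texts are tokenized by lowercasing and splitting on
    whitespace; the original system may preprocess differently.
    """
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    if not tokens_a and not tokens_b:
        return 0.0  # two empty texts: define similarity as 0 to avoid 0/0
    return 2 * len(tokens_a & tokens_b) / (len(tokens_a) + len(tokens_b))


def accuracy(correct: int, total: int) -> float:
    """Fraction of items whose declared label matches the true label."""
    return correct / total


# Hypothetical input news and search result (not from the paper's data).
input_news = "government announces new vaccine program"
search_result = "government announces vaccine program for children"

score = dice_similarity(input_news, search_result)
print(f"Dice similarity: {score:.3f}")

# Compare against the 40% similarity weight reported as optimal.
print("Meets 40% similarity weight:", score >= 0.40)

# Figures from the abstract: 47 of 50 hoax items and 37 of 50 fact
# items were declared correctly.
print("Accuracy:", accuracy(47 + 37, 50 + 50))
```

Here the two sample texts share 4 of their 5 and 6 distinct words, so the coefficient is 2 × 4 / (5 + 6) ≈ 0.727, and the counts from the abstract give (47 + 37) / 100 = 0.84.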