Thesis Subject Determination Application Using the Support Vector Machine Method

Artono Ivan Chandra(1*), Yulia Yulia(2), Rudy Adipranata(3)


(1) Informatics Study Program
(2) Informatics Study Program
(3) Informatics Study Program
(*) Corresponding Author

Abstract


A thesis is the final assessment a university gives students after several semesters of study. After completing the thesis, students submit their research results to the campus, where they are added to the thesis collection. At Petra Christian University, every thesis in the collection is assigned a subject that serves as its category. Because subjects are still assigned manually, a system is needed to help determine the subject of each thesis.

A system equipped with text mining features will help the library determine the subject of a thesis. The text is first preprocessed through punctuation removal, stopword removal, and stemming. It is then converted into numeric features using TF-IDF. These features are used to train a Support Vector Machine, producing a model that predicts the subject of input text. The training data consists of the titles and abstracts of existing theses.
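The preprocessing and TF-IDF steps can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the stopword list, the `naive_stem` suffix stripper, and the plain `tf * ln(N / df)` weighting are all assumptions chosen for illustration, and a real system would use a full stopword list and a proper stemmer for the language of the theses.

```python
import math
import string
from collections import Counter

# Hypothetical, tiny stopword list for illustration only.
STOPWORDS = {"the", "of", "a", "an", "and", "for", "using", "dan", "yang", "di"}


def naive_stem(token):
    """Toy suffix stripper standing in for a real stemmer (assumption)."""
    for suffix in ("ing", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Punctuation removal, stopword removal, then stemming."""
    no_punct = text.lower().translate(str.maketrans("", "", string.punctuation))
    return [naive_stem(t) for t in no_punct.split() if t not in STOPWORDS]


def tfidf(documents):
    """Return one {term: weight} dict per document, using tf * ln(N / df)."""
    tokenized = [preprocess(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append(
            {t: c * math.log(n_docs / doc_freq[t]) for t, c in counts.items()}
        )
    return vectors
```

Calling `tfidf(["svm model", "svm kernel"])` gives the shared term `svm` a weight of zero in both documents, while `model` and `kernel` each receive `ln(2)`. Vectors of this kind are what the SVM is trained on.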

The research arrived at the following parameters for the SVM classification model. For thesis titles: TF-IDF with max_df 1, n-gram (1,2), smooth_idf and sublinear_tf set to true, and a linear SVM kernel with C 100. For thesis abstracts: TF-IDF with max_df 0.25, n-gram (1,1), smooth_idf and sublinear_tf set to false, and an rbf SVM kernel with C 100 and gamma 0.01. Both titles and abstracts require preprocessing, resampling, and l2 normalization.
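The reported settings can be written down as keyword arguments. The parameter names match scikit-learn's `TfidfVectorizer` and `SVC`, so this sketch assumes that library, although the abstract does not name it. The dictionary names and `build_classifier` are hypothetical. "max_df 1" is read here as the proportion 1.0, which is also an assumption. The resampling step is not shown.

```python
# Settings reported in the abstract, written as scikit-learn keyword arguments.
# Assumption: "max_df 1" is read as the proportion 1.0 (no upper cut-off),
# not the integer 1 (which would mean "at most one document").
TITLE_CONFIG = {
    "tfidf": dict(max_df=1.0, ngram_range=(1, 2), smooth_idf=True,
                  sublinear_tf=True, norm="l2"),
    "svm": dict(kernel="linear", C=100),
}
ABSTRACT_CONFIG = {
    "tfidf": dict(max_df=0.25, ngram_range=(1, 1), smooth_idf=False,
                  sublinear_tf=False, norm="l2"),
    "svm": dict(kernel="rbf", C=100, gamma=0.01),
}


def build_classifier(config):
    """Build an untrained TF-IDF + SVM pipeline from one of the configs above."""
    # Imported here so the settings can be inspected without scikit-learn installed.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import SVC

    return make_pipeline(TfidfVectorizer(**config["tfidf"]), SVC(**config["svm"]))
```

With scikit-learn installed, `build_classifier(TITLE_CONFIG).fit(titles, subjects)` would train the title model, where `titles` and `subjects` are placeholder names for the preprocessed texts and their labels.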


Keywords


Thesis; Text; SVM


