KLASIFIKASI BERITA HOAX BERBAHASA INDONESIA MENGGUNAKAN ALGORITMA SUPPORT VECTOR MACHINE (Classification of Indonesian-Language Hoax News Using the Support Vector Machine Algorithm)

MESTI, DIMAS ABI and Karmila, Sely and Djamain, Yasni (2025) KLASIFIKASI BERITA HOAX BERBAHASA INDONESIA MENGGUNAKAN ALGORITMA SUPPORT VECTOR MACHINE. Diploma thesis, ITPLN.

Abstract

Fenomena berita bohong atau hoaks menjadi masalah serius di era digital, yang berpotensi merusak stabilitas sosial dan politik. Penelitian ini bertujuan untuk mengembangkan model klasifikasi berita hoaks berbahasa Indonesia yang efisien menggunakan algoritma Support Vector Machine (SVM). Data yang digunakan adalah dataset berita berbahasa Indonesia dari Kaggle yang terdiri dari 3.000 berita hoaks dan 3.000 berita valid. Proses pra-pemrosesan data melibatkan tahapan case folding, cleaning, tokenizing, stopword removal, dan stemming. Representasi fitur teks diubah menjadi bentuk numerik menggunakan metode Term Frequency-Inverse Document Frequency (TF-IDF). Hasil pengujian menunjukkan bahwa model SVM memiliki performa yang sangat baik dalam mendeteksi berita hoaks, dengan akurasi mencapai 97,16%, presisi 98%, recall 96%, dan F1-Score 97%. Analisis lebih lanjut terhadap beberapa kernel SVM menunjukkan bahwa kernel linear adalah yang paling optimal dengan akurasi 97,16%. Hasil penelitian ini membuktikan bahwa model klasifikasi berbasis SVM dapat menjadi alat bantu yang efektif untuk pre-screening dan penanganan berita hoaks secara otomatis.

Fake news, or hoaxes, has become a serious problem in the digital age, with the potential to damage social and political stability. This study aims to develop an efficient classification model for Indonesian-language hoax news using the Support Vector Machine (SVM) algorithm. The dataset, sourced from Kaggle, consists of 1,000 hoax news articles and 1,000 valid news articles. Pre-processing comprises case folding, cleaning, tokenizing, stopword removal, and stemming. The text is then converted into a numerical representation using the Term Frequency-Inverse Document Frequency (TF-IDF) method. In testing, the SVM model detected hoax news with an accuracy of 97.16%, a precision of 98%, a recall of 96%, and an F1-score of 97%. A comparison of several SVM kernels shows that the linear kernel performed best, with an accuracy of 97.16%. These findings indicate that an SVM-based classification model can be an effective tool for the automated pre-screening and handling of hoax news.
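
The pipeline summarised in the abstract (pre-processing, TF-IDF weighting, a linear SVM, and the four reported metrics) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the record names no library, stopword list, or stemmer, so the `STOPWORDS` set, the pass-through `stem` placeholder, the smoothed IDF variant, and the bias-free subgradient trainer with its `lam`, `lr`, and `epochs` values are all hypothetical choices. Only the linear kernel is shown; the kernel comparison mentioned in the abstract is not reproduced.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny stopword list; a real pipeline would use a full Indonesian list.
STOPWORDS = {"yang", "dan", "di", "ke", "dari", "ini", "itu", "untuk"}


def stem(token):
    # Placeholder that returns the token unchanged; the stemmer is not named in the record.
    return token


def preprocess(text):
    text = text.lower()                                  # case folding
    text = re.sub(r"https?://\S+", " ", text)            # cleaning: drop URLs
    text = re.sub(r"[^a-z\s]", " ", text)                # cleaning: drop digits, punctuation
    tokens = text.split()                                # tokenizing
    tokens = [t for t in tokens if t not in STOPWORDS]   # stopword removal
    return [stem(t) for t in tokens]                     # stemming (placeholder)


def fit_idf(docs):
    # Smoothed IDF, one common variant (assumed here): log((1 + N) / (1 + df)) + 1.
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf(doc, idf):
    # Term count times IDF, L2-normalised; terms unseen during fit_idf are ignored.
    counts = Counter(t for t in doc if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}


def dot(w, x):
    return sum(w.get(t, 0.0) * v for t, v in x.items())


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    # Full-batch subgradient descent on the L2-regularised hinge loss.
    # Labels are +1 (hoax) and -1 (valid); the bias term is omitted for brevity.
    w, n = {}, len(X)
    for _ in range(epochs):
        violators = [(x, yi) for x, yi in zip(X, y) if yi * dot(w, x) < 1]
        w = {t: (1 - lr * lam) * v for t, v in w.items()}   # regularisation shrink
        for x, yi in violators:                              # hinge-loss subgradient
            for t, v in x.items():
                w[t] = w.get(t, 0.0) + lr * yi * v / n
    return w


def predict(w, x):
    return 1 if dot(w, x) > 0 else -1


def scores(y_true, y_pred):
    # Accuracy, precision, recall, and F1 with hoax (+1) as the positive class.
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == -1 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == -1 for t, p in pairs)
    acc = sum(t == p for t, p in pairs) / len(pairs)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return acc, prec, rec, f1
```

The F1 formula in `scores` also gives a quick consistency check on the reported figures: a precision of 0.98 and a recall of 0.96 yield 2(0.98)(0.96)/(0.98 + 0.96) ≈ 0.97, which matches the F1-score stated in the abstract.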

Item Type: Thesis (Diploma)
Uncontrolled Keywords: Text Classification, Hoax, Indonesian Language, Support Vector Machine, TF-IDF
Subjects: Skripsi
Bidang Keilmuan > Teknik Informatika
Divisions: Fakultas Telematika Energi > S1 Teknik Informatika
Depositing User: Sudarman
Date Deposited: 13 Oct 2025 02:07
Last Modified: 13 Oct 2025 02:07
URI: https://repository.itpln.ac.id/id/eprint/2076
