PEMODELAN TOPIK PADA PEKERJAAN BIDANG TI MENGGUNAKAN N-GRAM DAN NON-NEGATIVE MATRIX FACTORIZATION

Kurniawan, Regi and Praptini, Puji Catur Siswi and Aziza, Rosida Nur (2025) PEMODELAN TOPIK PADA PEKERJAAN BIDANG TI MENGGUNAKAN N-GRAM DAN NON-NEGATIVE MATRIX FACTORIZATION. Diploma thesis, ITPLN.

Abstract

The rapid development of information technology has created diverse career opportunities in the IT field, with increasingly specific and complex skill requirements. This research aims to identify and analyze the main themes in IT job descriptions using a topic modeling approach based on N-Gram and Non-Negative Matrix Factorization (NMF). The data were obtained from two main sources, Techinasia and Jobstreet, and then preprocessed through translation, text cleaning, stopword removal, and tokenization. The N-Gram method was applied at the unigram, bigram, and trigram levels to extract meaningful word patterns, and was combined with TF-IDF to weight each term. Topic modeling was performed using the NMF algorithm with 5, 10, and 15 topics, and model quality was evaluated using coherence scores. The results show that the NMF model with 10 topics performed best, with an average coherence score of 0.6350, compared with 0.5727 for 5 topics and 0.6164 for 15 topics. The topic analysis successfully identified the main themes in IT jobs.
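The pipeline summarized in the abstract (n-gram extraction, TF-IDF weighting, then NMF) can be illustrated with a minimal sketch. The abstract does not say which tools the thesis used, so this is only an assumption-laden illustration written with the Python standard library: the four toy documents, the function names, the smoothed IDF formula, and the multiplicative-update solver are all choices made here, not details from the thesis. Coherence scoring is omitted.

```python
import math
import random
from collections import Counter


def ngrams(tokens, n_max=3):
    """Return all unigrams up to n_max-grams as space-joined strings."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tfidf_matrix(docs, n_max=3):
    """Build a dense documents x terms TF-IDF matrix as lists of floats."""
    counts = [Counter(ngrams(doc.lower().split(), n_max)) for doc in docs]
    vocab = sorted(set().union(*counts))
    # Document frequency: number of documents containing each term.
    df = Counter(term for c in counts for term in c)
    n_docs = len(docs)
    # Smoothed IDF (an assumption), so every term keeps a positive weight.
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}
    return [[c[t] * idf[t] for t in vocab] for c in counts], vocab


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """Factor V ~ W H using multiplicative updates for the Frobenius loss."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h


def top_terms(h, vocab, n_top=5):
    """List the n_top highest-weighted terms for each topic (row of H)."""
    return [
        [vocab[j] for j in sorted(range(len(vocab)), key=lambda j: -row[j])[:n_top]]
        for row in h
    ]


# Hypothetical toy corpus; the thesis used job ads from Techinasia and Jobstreet.
docs = [
    "python machine learning data analysis",
    "machine learning model training python",
    "network security firewall administration",
    "network security incident response",
]
v, vocab = tfidf_matrix(docs)
w, h = nmf(v, k=2)  # the thesis compared 5, 10, and 15 topics
for topic in top_terms(h, vocab):
    print(topic)
```

Each row of `h` is a topic expressed as weights over n-gram terms, and each row of `w` gives the topic mix of one document. On a real corpus a library implementation would be the more practical choice, for example scikit-learn's `TfidfVectorizer` with `ngram_range=(1, 3)` followed by `NMF`.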

Item Type: Thesis (Diploma)
Uncontrolled Keywords: coherence score, N-Gram, Non-Negative Matrix Factorization, topic modeling, text mining
Subjects: Skripsi
Bidang Keilmuan > Teknik Informatika
Divisions: Fakultas Telematika Energi > S1 Teknik Informatika
Depositing User: Sudarman
Date Deposited: 14 Oct 2025 07:15
Last Modified: 14 Oct 2025 07:15
URI: https://repository.itpln.ac.id/id/eprint/2280
