Journal article

Fuzzy-Gibbs Latent Dirichlet Allocation Model for Feature Extraction on Indonesian Documents

Putu Manik Prihatini, I Ketut Gede Darma Putra, Ida Ayu Dwi Giriantari, Made Sudarma

Volume: 10, Number: 9, Published: August 2017

Computer Engineering Science (CES)

Abstract

Latent Dirichlet Allocation (LDA) is a topic-based feature extraction method that uses reasoning to find semantic relationships in a corpus. Although LDA is very powerful in handling very large data sets, its complexity is very high, and the cost of reaching convergence grows with the number of documents. LDA generates a probability for every topic in a document, which carries uncertainty, so the relationship between this uncertainty and the number of iterations needs to be analyzed. In this paper, LDA is modified by adding fuzzy logic to the Gibbs sampling inference algorithm. The purpose is to analyze the effect of fuzzy logic in handling the uncertainty in the occurrence of all topics in a document, which affects the number of iterations needed for reasoning. The Fuzzy-Gibbs Latent Dirichlet Allocation algorithm is implemented on text data from Indonesian documents. Testing was performed on three data sets of different sizes to determine the effect of the number of documents on the number of iterations. The algorithm's performance was also measured using perplexity, precision, recall, and F-measure.
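For orientation, the baseline that the abstract builds on can be sketched as a standard collapsed Gibbs sampler for Latent Dirichlet Allocation, followed by a perplexity computation. This is a minimal illustration only: it does not include the fuzzy-logic step, which the abstract does not specify, and the function names (`gibbs_lda`, `perplexity`), the hyperparameter values, and the toy corpus are all assumptions chosen for the example, not taken from the paper.

```python
import math
import random


def gibbs_lda(docs, num_topics, vocab_size, alpha=0.1, beta=0.01,
              iterations=50, seed=0):
    """Plain collapsed Gibbs sampling for LDA (no fuzzy-logic step).

    docs is a list of documents, each a list of integer word ids.
    alpha, beta and iterations are illustrative defaults (assumptions).
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                # tokens per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # tokens per (topic, word)
    topic_total = [0] * num_topics                              # tokens per topic
    assignments = []

    # Give every token a random initial topic and record the counts.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Unnormalised full conditional for each candidate topic.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Point estimates of document-topic and topic-word distributions.
    theta = [
        [(doc_topic[d][t] + alpha) / (len(doc) + num_topics * alpha)
         for t in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(topic_word[t][w] + beta) / (topic_total[t] + vocab_size * beta)
         for w in range(vocab_size)]
        for t in range(num_topics)
    ]
    return theta, phi


def perplexity(docs, theta, phi):
    """exp of the negative mean per-token log-likelihood under theta and phi."""
    log_lik = 0.0
    n_tokens = 0
    for d, doc in enumerate(docs):
        for w in doc:
            p = sum(theta[d][t] * phi[t][w] for t in range(len(phi)))
            log_lik += math.log(p)
            n_tokens += 1
    return math.exp(-log_lik / n_tokens)


# Toy corpus of word ids (hypothetical data, for illustration only).
toy_docs = [[0, 1, 0, 1], [2, 3, 2, 3], [0, 1, 2, 3]]
toy_theta, toy_phi = gibbs_lda(toy_docs, num_topics=2, vocab_size=4)
print(perplexity(toy_docs, toy_theta, toy_phi))
```

In this sketch the number of sweeps is fixed in advance. A study of convergence like the one the abstract describes would instead track a quantity such as perplexity across sweeps and stop once it stabilizes.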