Automatic classifications of Malay proverbs using Naïve Bayesian Algorithm

Research output: Contribution to journal › Article

4 Citations (Scopus)

Abstract

This research presents an experimental study on the automatic classification of Malay proverbs using the Naïve Bayesian algorithm. The classification tasks were implemented using two Bayesian models: the multinomial model and the multivariate Bernoulli model. Both models were calibrated using one thousand training and testing examples classified into five categories: family, life, destiny, social and knowledge. Two types of testing were conducted, one on the dataset with stop words and one on the dataset without stop words, each using three representations of a Malay proverb: the proverb alone, the proverb with its meaning, and the proverb with its meaning and example sentences. The intuition was that, since proverbs are commonly short statements, including their meanings and example usage in sentences could improve classification accuracy. The results showed maximum accuracies of 72.2% and 68.2% for the multinomial and multivariate Bernoulli models, respectively, on the dataset without stop words using proverbs with their meanings and example sentences. The experiment indicates the capability of the Naïve Bayesian algorithm to classify proverbs, particularly when the meaning and example usage of the proverbs are included.
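
The two models named in the abstract differ in how they treat a document: the multinomial model scores each token occurrence, while the multivariate Bernoulli model scores the presence or absence of every vocabulary term. The following is a minimal sketch of both, assuming add-one (Laplace) smoothing and whitespace-tokenized input. The function names and the English toy data are hypothetical placeholders for illustration, not the authors' implementation or dataset.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Collect the counts both models need from (tokens, label) pairs."""
    vocab = set()
    class_docs = Counter()              # number of documents per class
    term_counts = defaultdict(Counter)  # multinomial: token occurrences per class
    doc_counts = defaultdict(Counter)   # Bernoulli: documents containing a term, per class
    for tokens, label in docs:
        class_docs[label] += 1
        vocab.update(tokens)
        term_counts[label].update(tokens)
        doc_counts[label].update(set(tokens))
    return vocab, class_docs, term_counts, doc_counts


def predict_multinomial(tokens, model):
    """Score each class by log prior plus log likelihood of every known token."""
    vocab, class_docs, term_counts, _ = model
    n_docs = sum(class_docs.values())
    scores = {}
    for c in class_docs:
        total = sum(term_counts[c].values())
        score = math.log(class_docs[c] / n_docs)
        for t in tokens:
            if t in vocab:  # tokens never seen in training are ignored
                score += math.log((term_counts[c][t] + 1) / (total + len(vocab)))
        scores[c] = score
    return max(scores, key=scores.get)


def predict_bernoulli(tokens, model):
    """Score each class over the whole vocabulary: present terms and absent terms."""
    vocab, class_docs, _, doc_counts = model
    n_docs = sum(class_docs.values())
    present = set(tokens)
    scores = {}
    for c in class_docs:
        score = math.log(class_docs[c] / n_docs)
        for t in vocab:
            p = (doc_counts[c][t] + 1) / (class_docs[c] + 2)  # smoothed P(term | class)
            score += math.log(p if t in present else 1 - p)
        scores[c] = score
    return max(scores, key=scores.get)


# Hypothetical toy data; two of the five categories from the abstract are used as labels.
TOY_DOCS = [
    ("mother father child home".split(), "family"),
    ("child home love mother".split(), "family"),
    ("book study learn teacher".split(), "knowledge"),
    ("learn book wisdom study".split(), "knowledge"),
]

model = train(TOY_DOCS)
print(predict_multinomial("mother child".split(), model))
print(predict_bernoulli("book learn".split(), model))
```

Because the Bernoulli variant also penalizes absent terms, appending a proverb's meaning and example sentences changes its score differently from the multinomial variant, which only sees the tokens that occur.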

Original language: English
Pages (from-to): 1016-1022
Number of pages: 7
Journal: Information Technology Journal
Volume: 7
Issue number: 7
DOI: 10.3923/itj.2008.1016.1022
Publication status: Published - 2008

Keywords

  • Document classification
  • Information retrieval
  • Naïve Bayesian algorithm

ASJC Scopus subject areas

  • Computer Science (miscellaneous)

Cite this

Automatic classifications of Malay proverbs using Naïve Bayesian Algorithm. / Mohd Noah, Shahrul Azman; Ismail, Fuad.

In: Information Technology Journal, Vol. 7, No. 7, 2008, p. 1016-1022.

@article{6b9fd252fa33483f8f6290f5de98144d,
title = "Automatic classifications of {M}alay proverbs using Na{\"i}ve Bayesian Algorithm",
abstract = "This research presents an experimental study on the automatic classification of Malay proverbs using the Na{\"i}ve Bayesian algorithm. The classification tasks were implemented using two Bayesian models: the multinomial model and the multivariate Bernoulli model. Both models were calibrated using one thousand training and testing examples classified into five categories: family, life, destiny, social and knowledge. Two types of testing were conducted, one on the dataset with stop words and one on the dataset without stop words, each using three representations of a Malay proverb: the proverb alone, the proverb with its meaning, and the proverb with its meaning and example sentences. The intuition was that, since proverbs are commonly short statements, including their meanings and example usage in sentences could improve classification accuracy. The results showed maximum accuracies of 72.2{\%} and 68.2{\%} for the multinomial and multivariate Bernoulli models, respectively, on the dataset without stop words using proverbs with their meanings and example sentences. The experiment indicates the capability of the Na{\"i}ve Bayesian algorithm to classify proverbs, particularly when the meaning and example usage of the proverbs are included.",
keywords = "Document classification, Information retrieval, Na{\"i}ve Bayesian algorithm",
author = "{Mohd Noah}, {Shahrul Azman} and Fuad Ismail",
year = "2008",
doi = "10.3923/itj.2008.1016.1022",
language = "English",
volume = "7",
pages = "1016--1022",
journal = "Information Technology Journal",
issn = "1812-5638",
publisher = "Asian Network for Scientific Information",
number = "7",

}

TY - JOUR

T1 - Automatic classifications of Malay proverbs using Naïve Bayesian Algorithm

AU - Mohd Noah, Shahrul Azman

AU - Ismail, Fuad

PY - 2008

Y1 - 2008

N2 - This research presents an experimental study on the automatic classification of Malay proverbs using the Naïve Bayesian algorithm. The classification tasks were implemented using two Bayesian models: the multinomial model and the multivariate Bernoulli model. Both models were calibrated using one thousand training and testing examples classified into five categories: family, life, destiny, social and knowledge. Two types of testing were conducted, one on the dataset with stop words and one on the dataset without stop words, each using three representations of a Malay proverb: the proverb alone, the proverb with its meaning, and the proverb with its meaning and example sentences. The intuition was that, since proverbs are commonly short statements, including their meanings and example usage in sentences could improve classification accuracy. The results showed maximum accuracies of 72.2% and 68.2% for the multinomial and multivariate Bernoulli models, respectively, on the dataset without stop words using proverbs with their meanings and example sentences. The experiment indicates the capability of the Naïve Bayesian algorithm to classify proverbs, particularly when the meaning and example usage of the proverbs are included.

AB - This research presents an experimental study on the automatic classification of Malay proverbs using the Naïve Bayesian algorithm. The classification tasks were implemented using two Bayesian models: the multinomial model and the multivariate Bernoulli model. Both models were calibrated using one thousand training and testing examples classified into five categories: family, life, destiny, social and knowledge. Two types of testing were conducted, one on the dataset with stop words and one on the dataset without stop words, each using three representations of a Malay proverb: the proverb alone, the proverb with its meaning, and the proverb with its meaning and example sentences. The intuition was that, since proverbs are commonly short statements, including their meanings and example usage in sentences could improve classification accuracy. The results showed maximum accuracies of 72.2% and 68.2% for the multinomial and multivariate Bernoulli models, respectively, on the dataset without stop words using proverbs with their meanings and example sentences. The experiment indicates the capability of the Naïve Bayesian algorithm to classify proverbs, particularly when the meaning and example usage of the proverbs are included.

KW - Document classification

KW - Information retrieval

KW - Naïve Bayesian algorithm

UR - http://www.scopus.com/inward/record.url?scp=56749125113&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=56749125113&partnerID=8YFLogxK

U2 - 10.3923/itj.2008.1016.1022

DO - 10.3923/itj.2008.1016.1022

M3 - Article

VL - 7

SP - 1016

EP - 1022

JO - Information Technology Journal

JF - Information Technology Journal

SN - 1812-5638

IS - 7

ER -