Abstract
This research presents an experimental study on the automatic classification of Malay proverbs using the Naïve Bayesian algorithm. The classification tasks were implemented using two Bayesian models: the multinomial model and the multivariate Bernoulli model. Both models were calibrated using a training and testing dataset of one thousand items, classified into five categories: family, life, destiny, social and knowledge. Two types of testing were conducted, one on the dataset with stop words and one on the dataset without stop words, each using three cases of Malay proverbs, i.e., the proverb alone, the proverb with its meaning, and the proverb with its meaning and example sentences. The intuition was that, since proverbs are commonly short statements, including their meanings and their usage in example sentences could improve classification accuracy. The results showed that maximum accuracies of 72.2% and 68.2% were achieved by the multinomial model and the multivariate Bernoulli model, respectively, for the dataset without stop words using the proverb with its meaning and example sentences. The experiment indicates the capability of the Naïve Bayesian algorithm in classifying proverbs, particularly when the meaning and example usage of the proverbs are included.
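The two models named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy English texts, the stop-word list, the add-one (Laplace) smoothing, and the function names `tokenize`, `train` and `classify` are all assumptions made for illustration. The multinomial model scores each token occurrence in the input, while the multivariate Bernoulli model scores the presence or absence of every vocabulary term.

```python
import math
from collections import Counter, defaultdict

# Hypothetical stop-word list and toy data (not from the paper's dataset).
STOP_WORDS = frozenset({"is", "the", "a", "for", "to", "her", "than", "and"})
DOCS = [
    "blood is thicker than water family bond",
    "a mother loves her child",
    "knowledge is light for the mind",
    "read books to gain knowledge",
]
LABELS = ["family", "family", "knowledge", "knowledge"]


def tokenize(text, stop_words=frozenset()):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in stop_words]


def train(docs, labels, stop_words=frozenset()):
    """Collect the counts needed by both Naive Bayes variants."""
    vocab = set()
    class_docs = Counter(labels)         # number of documents per class
    term_counts = defaultdict(Counter)   # multinomial: token occurrences per class
    doc_counts = defaultdict(Counter)    # Bernoulli: documents containing a term
    for text, label in zip(docs, labels):
        tokens = tokenize(text, stop_words)
        vocab.update(tokens)
        term_counts[label].update(tokens)
        doc_counts[label].update(set(tokens))
    return {"vocab": vocab, "class_docs": class_docs,
            "term_counts": term_counts, "doc_counts": doc_counts,
            "stop_words": stop_words}


def classify(model, text, kind="multinomial"):
    """Return the class with the highest log posterior (add-one smoothing)."""
    tokens = tokenize(text, model["stop_words"])
    vocab = model["vocab"]
    n_docs = sum(model["class_docs"].values())
    best_label, best_score = None, -math.inf
    for label, n_c in model["class_docs"].items():
        score = math.log(n_c / n_docs)  # class prior
        if kind == "multinomial":
            total = sum(model["term_counts"][label].values())
            for t in tokens:
                if t in vocab:  # unseen words are ignored
                    count = model["term_counts"][label][t]
                    score += math.log((count + 1) / (total + len(vocab)))
        else:  # multivariate Bernoulli: every vocabulary term contributes
            present = set(tokens)
            for t in vocab:
                p = (model["doc_counts"][label][t] + 1) / (n_c + 2)
                score += math.log(p if t in present else 1.0 - p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(DOCS, LABELS, STOP_WORDS)
print(classify(model, "mother and child", kind="multinomial"))
print(classify(model, "read books", kind="bernoulli"))
```

Passing an empty `stop_words` set to `train` would correspond to the "with stop words" condition, and concatenating a proverb with its meaning and example sentences before training would correspond to the richer text cases described above.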
Original language | English |
---|---|
Pages (from-to) | 1016-1022 |
Number of pages | 7 |
Journal | Information Technology Journal |
Volume | 7 |
Issue number | 7 |
DOIs | 10.3923/itj.2008.1016.1022 |
Publication status | Published - 2008 |
Keywords
- Document classification
- Information retrieval
- Naïve Bayesian algorithm
ASJC Scopus subject areas
- Computer Science (miscellaneous)
Cite this
Automatic classifications of Malay proverbs using Naïve Bayesian Algorithm. / Mohd Noah, Shahrul Azman; Ismail, Fuad.
In: Information Technology Journal, Vol. 7, No. 7, 2008, p. 1016-1022.
Research output: Contribution to journal › Article
TY - JOUR
T1 - Automatic classifications of Malay proverbs using Naïve Bayesian Algorithm
AU - Mohd Noah, Shahrul Azman
AU - Ismail, Fuad
PY - 2008
Y1 - 2008
KW - Document classification
KW - Information retrieval
KW - Naïve Bayesian algorithm
UR - http://www.scopus.com/inward/record.url?scp=56749125113&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=56749125113&partnerID=8YFLogxK
U2 - 10.3923/itj.2008.1016.1022
DO - 10.3923/itj.2008.1016.1022
M3 - Article
AN - SCOPUS:56749125113
VL - 7
SP - 1016
EP - 1022
JO - Information Technology Journal
JF - Information Technology Journal
SN - 1812-5638
IS - 7
ER -