A hybrid method using lexicon-based approach and Naive Bayes classifier for Arabic opinion question answering

Khalid Khalifa, Nazlia Omar

Research output: Contribution to journal › Article

12 Citations (Scopus)

Abstract

Opinion Question Answering (Opinion QA) is the task of enabling users to explore others' opinions toward a particular service or product in order to make decisions. Arabic Opinion QA is more challenging than in other languages because of Arabic's complex morphology and its many dialects. In addition, few research efforts and resources focus on Opinion QA in Arabic. This study aims to address the difficulties of Arabic Opinion QA by proposing a hybrid method that combines a lexicon-based approach with classification using a Naïve Bayes classifier. The proposed method includes pre-processing phases such as transformation, normalization and tokenization, and exploits auxiliary information (a thesaurus). The lexicon-based approach is executed by replacing some words with their synonyms using the domain dictionary. The classification task is performed by the Naïve Bayes classifier, which classifies opinions by positive or negative sentiment polarity. The proposed method was evaluated using the common information retrieval metrics, i.e., Precision, Recall and F-measure. For comparison, three classifiers were applied: Naïve Bayes (NB), Support Vector Machine (SVM) and K-Nearest Neighbor (KNN). The experimental results demonstrate that NB outperforms SVM and KNN, achieving 91% accuracy.
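
The pipeline outlined in the abstract (pre-processing, synonym replacement from a domain dictionary, Naïve Bayes polarity classification, and evaluation by Precision, Recall and F-measure) can be illustrated with a minimal sketch. This is not the authors' implementation: the `SYNONYMS` dictionary, the helper names, and the English toy data standing in for Arabic text are all assumptions made for illustration, and Arabic-specific normalization is not shown.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical domain dictionary: maps synonyms to one canonical lexicon term.
SYNONYMS = {"great": "good", "excellent": "good", "awful": "bad", "terrible": "bad"}


def preprocess(text):
    """Lowercase, tokenize on word characters, then replace synonyms."""
    tokens = re.findall(r"\w+", text.lower())
    return [SYNONYMS.get(tok, tok) for tok in tokens]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(preprocess(doc))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for tok in preprocess(doc):
                if tok in self.vocab:  # skip words never seen in training
                    score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def precision_recall_f1(gold, predicted, positive="pos"):
    """Precision, recall and F-measure for the positive class."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy training data (assumed, not from the paper).
train_docs = [
    "the service was great",
    "excellent product",
    "awful service",
    "the product was terrible",
]
train_labels = ["pos", "pos", "neg", "neg"]
model = NaiveBayes().fit(train_docs, train_labels)

test_docs = ["good product", "bad service"]
test_gold = ["pos", "neg"]
test_pred = [model.predict(d) for d in test_docs]
print(test_pred, precision_recall_f1(test_gold, test_pred))
```

Mapping synonyms to a single canonical term before counting means the classifier pools evidence from words such as "great" and "excellent", which is one plausible reading of how the lexicon step could support the classifier on small datasets.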

Original language: English
Pages (from-to): 1961-1968
Number of pages: 8
Journal: Journal of Computer Science
Volume: 10
Issue number: 10
DOI: 10.3844/jcssp.2014.1961.1968
Publication status: Published - 2014

Fingerprint

Classifiers
Support vector machines
Thesauri
Glossaries
Information retrieval
Processing

Keywords

  • Lexicon-based
  • Naïve Bayes
  • Opinion question answering
  • Sentiment analysis

ASJC Scopus subject areas

  • Software
  • Computer Networks and Communications
  • Artificial Intelligence

Cite this

A hybrid method using lexicon-based approach and Naive Bayes classifier for Arabic opinion question answering. / Khalifa, Khalid; Omar, Nazlia.

In: Journal of Computer Science, Vol. 10, No. 10, 2014, p. 1961-1968.

@article{a4b88b0766bb4a1ebcbc2a0aa2d2aced,
title = "A hybrid method using lexicon-based approach and Naive Bayes classifier for Arabic opinion question answering",
abstract = "Opinion Question Answering (Opinion QA) is the task of enabling users to explore others' opinions toward a particular service or product in order to make decisions. Arabic Opinion QA is more challenging than in other languages because of Arabic's complex morphology and its many dialects. In addition, few research efforts and resources focus on Opinion QA in Arabic. This study aims to address the difficulties of Arabic Opinion QA by proposing a hybrid method that combines a lexicon-based approach with classification using a Na{\"i}ve Bayes classifier. The proposed method includes pre-processing phases such as transformation, normalization and tokenization, and exploits auxiliary information (a thesaurus). The lexicon-based approach is executed by replacing some words with their synonyms using the domain dictionary. The classification task is performed by the Na{\"i}ve Bayes classifier, which classifies opinions by positive or negative sentiment polarity. The proposed method was evaluated using the common information retrieval metrics, i.e., Precision, Recall and F-measure. For comparison, three classifiers were applied: Na{\"i}ve Bayes (NB), Support Vector Machine (SVM) and K-Nearest Neighbor (KNN). The experimental results demonstrate that NB outperforms SVM and KNN, achieving 91{\%} accuracy.",
keywords = "Lexicon-based, Na{\"i}ve Bayes, Opinion question answering, Sentiment analysis",
author = "Khalid Khalifa and Nazlia Omar",
year = "2014",
doi = "10.3844/jcssp.2014.1961.1968",
language = "English",
volume = "10",
pages = "1961--1968",
journal = "Journal of Computer Science",
issn = "1549-3636",
publisher = "Science Publications",
number = "10",

}

TY - JOUR

T1 - A hybrid method using lexicon-based approach and Naive Bayes classifier for Arabic opinion question answering

AU - Khalifa, Khalid

AU - Omar, Nazlia

PY - 2014

Y1 - 2014

N2 - Opinion Question Answering (Opinion QA) is the task of enabling users to explore others' opinions toward a particular service or product in order to make decisions. Arabic Opinion QA is more challenging than in other languages because of Arabic's complex morphology and its many dialects. In addition, few research efforts and resources focus on Opinion QA in Arabic. This study aims to address the difficulties of Arabic Opinion QA by proposing a hybrid method that combines a lexicon-based approach with classification using a Naïve Bayes classifier. The proposed method includes pre-processing phases such as transformation, normalization and tokenization, and exploits auxiliary information (a thesaurus). The lexicon-based approach is executed by replacing some words with their synonyms using the domain dictionary. The classification task is performed by the Naïve Bayes classifier, which classifies opinions by positive or negative sentiment polarity. The proposed method was evaluated using the common information retrieval metrics, i.e., Precision, Recall and F-measure. For comparison, three classifiers were applied: Naïve Bayes (NB), Support Vector Machine (SVM) and K-Nearest Neighbor (KNN). The experimental results demonstrate that NB outperforms SVM and KNN, achieving 91% accuracy.

KW - Lexicon-based

KW - Naïve Bayes

KW - Opinion question answering

KW - Sentiment analysis

UR - http://www.scopus.com/inward/record.url?scp=84905233001&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84905233001&partnerID=8YFLogxK

U2 - 10.3844/jcssp.2014.1961.1968

DO - 10.3844/jcssp.2014.1961.1968

M3 - Article

VL - 10

SP - 1961

EP - 1968

JO - Journal of Computer Science

JF - Journal of Computer Science

SN - 1549-3636

IS - 10

ER -