Feature transfer through new statistical association measure for cross-domain sentiment analysis

Tareq Al-Moslmi, Nazlia Omar, Mohammed Albared, Adel Al-Shabi

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

With the growth of user-generated web content, individuals can freely express their opinions in many domains. However, annotating training data for a large number of domains is very costly and prevents us from exploiting the information shared across domains. Cross-domain sentiment analysis is a challenging NLP task due to feature divergence and polarity divergence. To tackle this issue, this study presents a new model for cross-domain sentiment classification. The model transfers features between the source and target domains in both directions, using a Union of Conditional Probability (UCP) association measure. A Naive Bayes (NB) classifier and three feature selection methods (information gain, odds ratio, and chi-square) are used to evaluate the proposed model. The experimental results are promising and encourage further research in this direction.
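The abstract names three feature selection methods but gives no formulas. As a rough illustration only, and not the authors' implementation, the sketch below computes the standard textbook forms of these scores from a 2x2 term/class count table. The function names, the cell layout (a, b, c, d), and the 0.5 smoothing constant in the odds ratio are assumptions made for this example.

```python
import math


def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 term/class table.

    Assumed layout (hypothetical, for illustration):
      a = positive documents containing the term
      b = negative documents containing the term
      c = positive documents without the term
      d = negative documents without the term
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a row or column is empty, so the term carries no signal
    return n * (a * d - b * c) ** 2 / denom


def odds_ratio(a, b, c, d, smooth=0.5):
    """Odds ratio with additive smoothing (assumed here to avoid division by zero)."""
    return ((a + smooth) * (d + smooth)) / ((b + smooth) * (c + smooth))


def entropy(counts):
    """Shannon entropy in bits of a list of non-negative counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((x / total) * math.log2(x / total) for x in counts if x > 0)


def information_gain(a, b, c, d):
    """Class entropy minus class entropy conditioned on term presence."""
    n = a + b + c + d
    if n == 0:
        return 0.0
    h_class = entropy([a + c, b + d])
    h_given_term = ((a + b) / n) * entropy([a, b]) + ((c + d) / n) * entropy([c, d])
    return h_class - h_given_term


if __name__ == "__main__":
    # A term that appears in every positive document and no negative document.
    table = (10, 0, 0, 10)
    print(chi_square(*table), odds_ratio(*table), information_gain(*table))
```

In a pipeline of the kind the abstract describes, scores like these would typically be used to rank candidate features before training the Naive Bayes classifier; how the paper combines them with the UCP measure is not stated in this record.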

Original language: English
Pages (from-to): 164-170
Number of pages: 7
Journal: Journal of Engineering and Applied Sciences
Volume: 12
Issue number: 1
DOI: 10.3923/jeasci.2017.164.170
Publication status: Published - 2017

Fingerprint

Feature extraction
Classifiers
Costs

Keywords

  • Co-occurrence calculation methods
  • Cross-domain sentiment analysis
  • Malaysia
  • Sentiment analysis
  • Sentiment thesaurus

ASJC Scopus subject areas

  • Engineering(all)

Cite this

Feature transfer through new statistical association measure for cross-domain sentiment analysis. / Al-Moslmi, Tareq; Omar, Nazlia; Albared, Mohammed; Al-Shabi, Adel.

In: Journal of Engineering and Applied Sciences, Vol. 12, No. 1, 2017, p. 164-170.


@article{147c5dfe839947e3adee84b5ed413137,
title = "Feature transfer through new statistical association measure for cross-domain sentiment analysis",
abstract = "With the growth of user-generated web content, individuals can freely express their opinions in many domains. However, annotating training data for a large number of domains is very costly and prevents us from exploiting the information shared across domains. Cross-domain sentiment analysis is a challenging NLP task due to feature divergence and polarity divergence. To tackle this issue, this study presents a new model for cross-domain sentiment classification. The model transfers features between the source and target domains in both directions, using a Union of Conditional Probability (UCP) association measure. A Naive Bayes (NB) classifier and three feature selection methods (information gain, odds ratio, and chi-square) are used to evaluate the proposed model. The experimental results are promising and encourage further research in this direction.",
keywords = "Co-occurrence calculation methods, Cross-domain sentiment analysis, Malaysia, Sentiment analysis, Sentiment thesaurus",
author = "Tareq Al-Moslmi and Nazlia Omar and Mohammed Albared and Adel Al-Shabi",
year = "2017",
doi = "10.3923/jeasci.2017.164.170",
language = "English",
volume = "12",
pages = "164--170",
journal = "Journal of Engineering and Applied Sciences",
issn = "1816-949X",
publisher = "Medwell Journals",
number = "1",

}

TY - JOUR

T1 - Feature transfer through new statistical association measure for cross-domain sentiment analysis

AU - Al-Moslmi, Tareq

AU - Omar, Nazlia

AU - Albared, Mohammed

AU - Al-Shabi, Adel

PY - 2017

Y1 - 2017

N2 - With the growth of user-generated web content, individuals can freely express their opinions in many domains. However, annotating training data for a large number of domains is very costly and prevents us from exploiting the information shared across domains. Cross-domain sentiment analysis is a challenging NLP task due to feature divergence and polarity divergence. To tackle this issue, this study presents a new model for cross-domain sentiment classification. The model transfers features between the source and target domains in both directions, using a Union of Conditional Probability (UCP) association measure. A Naive Bayes (NB) classifier and three feature selection methods (information gain, odds ratio, and chi-square) are used to evaluate the proposed model. The experimental results are promising and encourage further research in this direction.

AB - With the growth of user-generated web content, individuals can freely express their opinions in many domains. However, annotating training data for a large number of domains is very costly and prevents us from exploiting the information shared across domains. Cross-domain sentiment analysis is a challenging NLP task due to feature divergence and polarity divergence. To tackle this issue, this study presents a new model for cross-domain sentiment classification. The model transfers features between the source and target domains in both directions, using a Union of Conditional Probability (UCP) association measure. A Naive Bayes (NB) classifier and three feature selection methods (information gain, odds ratio, and chi-square) are used to evaluate the proposed model. The experimental results are promising and encourage further research in this direction.

KW - Co-occurrence calculation methods

KW - Cross-domain sentiment analysis

KW - Malaysia

KW - Sentiment analysis

KW - Sentiment thesaurus

UR - http://www.scopus.com/inward/record.url?scp=85014806815&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85014806815&partnerID=8YFLogxK

U2 - 10.3923/jeasci.2017.164.170

DO - 10.3923/jeasci.2017.164.170

M3 - Article

AN - SCOPUS:85014806815

VL - 12

SP - 164

EP - 170

JO - Journal of Engineering and Applied Sciences

JF - Journal of Engineering and Applied Sciences

SN - 1816-949X

IS - 1

ER -