Statistical validation of ACO-KNN algorithm for sentiment analysis

Siti Rohaidah Ahmad, Azuraliza Abu Bakar, Mohd Ridzwan Yaakub, Nurhafizah Moziyana Mohd Yusop

Research output: Contribution to journal › Article

Abstract

This paper proposes a hybrid of ant colony optimization (ACO) and k-nearest neighbour (KNN) algorithms as a feature selection technique for choosing relevant features from customer review datasets. Information gain (IG), genetic algorithm (GA), and rough set attribute reduction (RSAR) were used as baseline algorithms in a performance comparison with the proposed algorithm. The paper also discusses the significance test used to evaluate the performance differences between the ACO-KNN, IG-GA, and IG-RSAR algorithms. A dependency relation algorithm was used to identify the actual features commented on by customers, by linking product features to sentiment words through the dependency relations in customers' sentences. The performance of the ACO-KNN algorithm was evaluated using precision, recall, and F-score, and the results were validated using parametric statistical significance tests. The evaluation showed a statistically significant improvement of the ACO-KNN algorithm over the baseline algorithms. In addition, the experimental results indicate that ACO-KNN can be used as a feature selection technique in sentiment analysis to obtain a high-quality, optimal feature subset that represents the actual content of customer review data.
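The evaluation described in the abstract can be illustrated with a minimal sketch. The abstract does not name the specific parametric test, so the paired t-test below is an assumption, and the function names (`precision_recall_f`, `paired_t_statistic`) and all example values are hypothetical, not taken from the paper.

```python
from math import sqrt
from statistics import mean, stdev


def precision_recall_f(selected, relevant):
    """Precision, recall and F-score of a selected feature set
    measured against a gold-standard set of relevant features."""
    selected, relevant = set(selected), set(relevant)
    true_positives = len(selected & relevant)
    precision = true_positives / len(selected) if selected else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


def paired_t_statistic(scores_a, scores_b):
    """Paired t statistic and degrees of freedom for two matched score lists.

    Assumption: a paired t-test stands in for the unspecified
    parametric test mentioned in the abstract.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


if __name__ == "__main__":
    # Hypothetical feature sets for one review dataset.
    p, r, f = precision_recall_f(
        selected={"battery", "screen", "price"},
        relevant={"battery", "screen", "zoom", "lens"},
    )
    print(f"precision={p:.3f} recall={r:.3f} F={f:.3f}")

    # Hypothetical per-dataset F-scores for two algorithms.
    t, df = paired_t_statistic([0.8, 0.9, 0.7], [0.7, 0.7, 0.6])
    print(f"t={t:.3f} df={df}")
```

The resulting t statistic would then be compared with the critical value of the t distribution for the given degrees of freedom at the chosen significance level.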

Original language: English
Pages (from-to): 165-170
Number of pages: 6
Journal: Journal of Telecommunication, Electronic and Computer Engineering
Volume: 9
Issue number: 2-11
Publication status: Published - 2017

Fingerprint

Ant colony optimization
Feature extraction
Genetic algorithms
Statistical tests

Keywords

  • Ant Colony Optimization
  • Feature Selection
  • Sentiment Analysis
  • Statistical Analysis

ASJC Scopus subject areas

  • Hardware and Architecture
  • Computer Networks and Communications
  • Electrical and Electronic Engineering

Cite this

Statistical validation of ACO-KNN algorithm for sentiment analysis. / Ahmad, Siti Rohaidah; Abu Bakar, Azuraliza; Yaakub, Mohd Ridzwan; Yusop, Nurhafizah Moziyana Mohd.

In: Journal of Telecommunication, Electronic and Computer Engineering, Vol. 9, No. 2-11, 2017, p. 165-170.

@article{39085940121d4500b42e1cec8d06d692,
title = "Statistical validation of {ACO-KNN} algorithm for sentiment analysis",
abstract = "This paper proposes a hybrid of ant colony optimization (ACO) and k-nearest neighbour (KNN) algorithms as a feature selection technique for choosing relevant features from customer review datasets. Information gain (IG), genetic algorithm (GA), and rough set attribute reduction (RSAR) were used as baseline algorithms in a performance comparison with the proposed algorithm. The paper also discusses the significance test used to evaluate the performance differences between the ACO-KNN, IG-GA, and IG-RSAR algorithms. A dependency relation algorithm was used to identify the actual features commented on by customers, by linking product features to sentiment words through the dependency relations in customers' sentences. The performance of the ACO-KNN algorithm was evaluated using precision, recall, and F-score, and the results were validated using parametric statistical significance tests. The evaluation showed a statistically significant improvement of the ACO-KNN algorithm over the baseline algorithms. In addition, the experimental results indicate that ACO-KNN can be used as a feature selection technique in sentiment analysis to obtain a high-quality, optimal feature subset that represents the actual content of customer review data.",
keywords = "Ant Colony Optimization, Feature Selection, Sentiment Analysis, Statistical Analysis",
author = "Ahmad, {Siti Rohaidah} and {Abu Bakar}, Azuraliza and Yaakub, {Mohd Ridzwan} and Yusop, {Nurhafizah Moziyana Mohd}",
year = "2017",
language = "English",
volume = "9",
pages = "165--170",
journal = "Journal of Telecommunication, Electronic and Computer Engineering",
issn = "2180-1843",
publisher = "Universiti Teknikal Malaysia Melaka",
number = "2-11",

}

TY - JOUR

T1 - Statistical validation of ACO-KNN algorithm for sentiment analysis

AU - Ahmad, Siti Rohaidah

AU - Abu Bakar, Azuraliza

AU - Yaakub, Mohd Ridzwan

AU - Yusop, Nurhafizah Moziyana Mohd

PY - 2017

Y1 - 2017

AB - This paper proposes a hybrid of ant colony optimization (ACO) and k-nearest neighbour (KNN) algorithms as a feature selection technique for choosing relevant features from customer review datasets. Information gain (IG), genetic algorithm (GA), and rough set attribute reduction (RSAR) were used as baseline algorithms in a performance comparison with the proposed algorithm. The paper also discusses the significance test used to evaluate the performance differences between the ACO-KNN, IG-GA, and IG-RSAR algorithms. A dependency relation algorithm was used to identify the actual features commented on by customers, by linking product features to sentiment words through the dependency relations in customers' sentences. The performance of the ACO-KNN algorithm was evaluated using precision, recall, and F-score, and the results were validated using parametric statistical significance tests. The evaluation showed a statistically significant improvement of the ACO-KNN algorithm over the baseline algorithms. In addition, the experimental results indicate that ACO-KNN can be used as a feature selection technique in sentiment analysis to obtain a high-quality, optimal feature subset that represents the actual content of customer review data.

KW - Ant Colony Optimization

KW - Feature Selection

KW - Sentiment Analysis

KW - Statistical Analysis

UR - http://www.scopus.com/inward/record.url?scp=85032796626&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85032796626&partnerID=8YFLogxK

M3 - Article

VL - 9

SP - 165

EP - 170

JO - Journal of Telecommunication, Electronic and Computer Engineering

JF - Journal of Telecommunication, Electronic and Computer Engineering

SN - 2180-1843

IS - 2-11

ER -