Statistical analysis for validating ACO-KNN algorithm as feature selection in sentiment analysis

Siti Rohaidah Ahmad, Nurhafizah Moziyana Mohd Yusop, Azuraliza Abu Bakar, Mohd Ridzwan Yaakub

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

This paper proposes a hybrid of the ant colony optimization (ACO) and k-nearest neighbor (KNN) algorithms as a feature selection method for choosing relevant features from customer review datasets. Information gain (IG), genetic algorithm (GA), and rough set attribute reduction (RSAR) were used as baseline algorithms in a performance comparison with the proposed algorithm. The paper also discusses the significance test used to evaluate the performance differences between the ACO-KNN, IG-GA, and IG-RSAR algorithms. The performance of the ACO-KNN algorithm was evaluated using precision, recall, and F-score, and the results were validated using parametric statistical significance tests. The evaluation showed that the ACO-KNN algorithm performed significantly better than the baseline algorithms. In addition, the experimental results indicate that ACO-KNN can be used as a feature selection technique in sentiment analysis to obtain a high-quality, optimal feature subset that represents the actual data in customer review datasets.
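The evaluation metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not say which parametric test was used, so the paired t-test below is an assumption, and the function names `precision_recall_f1` and `paired_t_statistic` are hypothetical helpers, not taken from the paper.

```python
import math
from statistics import mean, stdev


def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F-score) from raw counts.

    tp, fp, fn are true positives, false positives, and false negatives.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for a paired t-test.

    Assumption: the paired t-test stands in for the unspecified
    "parametric statistical significance tests" in the abstract.
    scores_a and scores_b hold one score per dataset or fold,
    in the same order, for the two methods being compared.
    """
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equally sized samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    spread = stdev(diffs)  # sample standard deviation of the differences
    if spread == 0:
        raise ValueError("all paired differences are identical")
    t = mean(diffs) / (spread / math.sqrt(len(diffs)))
    return t, len(diffs) - 1
```

In use, the F-scores of two feature selection methods on the same datasets would be passed as `scores_a` and `scores_b`, and the resulting t value compared against the critical value of the t distribution for the returned degrees of freedom.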

Original language: English
Title of host publication: 2nd International Conference on Applied Science and Technology 2017, ICAST 2017
Publisher: American Institute of Physics Inc.
Volume: 1891
ISBN (Electronic): 9780735415737
DOIs: https://doi.org/10.1063/1.5005351
Publication status: Published - 3 Oct 2017
Event: 2nd International Conference on Applied Science and Technology 2017, ICAST 2017 - Langkawi, Kedah, Malaysia
Duration: 3 Apr 2017 to 5 Apr 2017

Other

Other: 2nd International Conference on Applied Science and Technology 2017, ICAST 2017
Country: Malaysia
City: Langkawi, Kedah
Period: 3/4/17 to 5/4/17

Fingerprint

statistical analysis
optimization
genetic algorithms
evaluation
set theory

ASJC Scopus subject areas

  • Physics and Astronomy(all)

Cite this

Ahmad, S. R., Yusop, N. M. M., Abu Bakar, A., & Yaakub, M. R. (2017). Statistical analysis for validating ACO-KNN algorithm as feature selection in sentiment analysis. In 2nd International Conference on Applied Science and Technology 2017, ICAST 2017 (Vol. 1891). [020018] American Institute of Physics Inc. https://doi.org/10.1063/1.5005351

@inproceedings{b51767dc8c774c45a1e1e9fb21e77d0c,
title = "Statistical analysis for validating ACO-KNN algorithm as feature selection in sentiment analysis",
abstract = "This paper proposes a hybrid of the ant colony optimization (ACO) and k-nearest neighbor (KNN) algorithms as a feature selection method for choosing relevant features from customer review datasets. Information gain (IG), genetic algorithm (GA), and rough set attribute reduction (RSAR) were used as baseline algorithms in a performance comparison with the proposed algorithm. The paper also discusses the significance test used to evaluate the performance differences between the ACO-KNN, IG-GA, and IG-RSAR algorithms. The performance of the ACO-KNN algorithm was evaluated using precision, recall, and F-score, and the results were validated using parametric statistical significance tests. The evaluation showed that the ACO-KNN algorithm performed significantly better than the baseline algorithms. In addition, the experimental results indicate that ACO-KNN can be used as a feature selection technique in sentiment analysis to obtain a high-quality, optimal feature subset that represents the actual data in customer review datasets.",
author = "Ahmad, {Siti Rohaidah} and Yusop, {Nurhafizah Moziyana Mohd} and {Abu Bakar}, Azuraliza and Yaakub, {Mohd Ridzwan}",
year = "2017",
month = "10",
day = "3",
doi = "10.1063/1.5005351",
language = "English",
volume = "1891",
booktitle = "2nd International Conference on Applied Science and Technology 2017, ICAST 2017",
publisher = "American Institute of Physics Inc.",

}

TY - GEN

T1 - Statistical analysis for validating ACO-KNN algorithm as feature selection in sentiment analysis

AU - Ahmad, Siti Rohaidah

AU - Yusop, Nurhafizah Moziyana Mohd

AU - Abu Bakar, Azuraliza

AU - Yaakub, Mohd Ridzwan

PY - 2017/10/3

Y1 - 2017/10/3

AB - This paper proposes a hybrid of the ant colony optimization (ACO) and k-nearest neighbor (KNN) algorithms as a feature selection method for choosing relevant features from customer review datasets. Information gain (IG), genetic algorithm (GA), and rough set attribute reduction (RSAR) were used as baseline algorithms in a performance comparison with the proposed algorithm. The paper also discusses the significance test used to evaluate the performance differences between the ACO-KNN, IG-GA, and IG-RSAR algorithms. The performance of the ACO-KNN algorithm was evaluated using precision, recall, and F-score, and the results were validated using parametric statistical significance tests. The evaluation showed that the ACO-KNN algorithm performed significantly better than the baseline algorithms. In addition, the experimental results indicate that ACO-KNN can be used as a feature selection technique in sentiment analysis to obtain a high-quality, optimal feature subset that represents the actual data in customer review datasets.

UR - http://www.scopus.com/inward/record.url?scp=85031298902&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85031298902&partnerID=8YFLogxK

U2 - 10.1063/1.5005351

DO - 10.1063/1.5005351

M3 - Conference contribution

AN - SCOPUS:85031298902

VL - 1891

BT - 2nd International Conference on Applied Science and Technology 2017, ICAST 2017

PB - American Institute of Physics Inc.

ER -