Integrating Noun-Based Feature Ranking and Selection Methods with Arabic Text Associative Classification Approach

Abdullah S. Ghareb, Abdul Razak Hamdan, Azuraliza Abu Bakar

Research output: Contribution to journal › Article

3 Citations (Scopus)

Abstract

Feature ranking and selection (FR&S) is an important preprocessing phase for text classification; in most cases it produces a small, valuable feature subspace from the whole feature space and reduces classification errors. Because the associative classification (AC) approach is an efficient method whose training and testing depend on how features are ranked and selected, examining feature ranking methods is important. This paper presents a method that integrates Arabic noun extraction with four FR&S methods: term frequency–inverse document frequency (TF-IDF), document frequency, odds ratio, and class discriminating measure (CDM). Association rule technology uses the result of the integrated feature selection to construct an Arabic text associative classifier. In this study, AC uses the majority voting and ordered decision list prediction methods to assign each test document to its category. A set of experiments is conducted on a collection of Arabic text documents, and the experimental results show that our AC method works better with extracted nouns combined with a feature selection method than with a feature selection method alone. The AC based on the CDM and TF-IDF methods outperforms the other methods in terms of accuracy. As the results indicate, the proposed method produces satisfactory classification accuracy and has a good feature selection effect on the Arabic text associative classifier.
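The abstract names the four ranking measures but does not give their formulas, so the following is only a minimal sketch using commonly cited textbook forms, not the authors' implementation. The function names `score_features` and `top_k`, the additive smoothing constant, and the use of collection-level term frequency times IDF as the TF-IDF ranking score are all assumptions made for illustration. The input is assumed to be documents already reduced to tokens (for example, nouns extracted in an earlier step).

```python
import math
from collections import Counter


def score_features(docs, labels, target, smooth=0.5):
    """Score each term for class `target` with four ranking measures.

    docs   -- list of token lists (e.g. nouns extracted beforehand)
    labels -- class label of each document, parallel to docs
    smooth -- additive smoothing so ratios stay finite (an assumption)
    """
    n_docs = len(docs)
    n_pos = sum(1 for y in labels if y == target)
    n_neg = n_docs - n_pos

    df = Counter()      # number of documents containing the term
    df_pos = Counter()  # same count, restricted to class `target`
    tf = Counter()      # total occurrences across the collection
    for tokens, y in zip(docs, labels):
        tf.update(tokens)
        for term in set(tokens):
            df[term] += 1
            if y == target:
                df_pos[term] += 1

    scores = {}
    for term in df:
        # Smoothed estimates of P(term | target) and P(term | other classes).
        p_pos = (df_pos[term] + smooth) / (n_pos + 2 * smooth)
        p_neg = (df[term] - df_pos[term] + smooth) / (n_neg + 2 * smooth)
        scores[term] = {
            "df": df[term],
            # Assumed aggregate: collection term frequency times IDF.
            "tfidf": tf[term] * math.log(n_docs / df[term]),
            "odds_ratio": math.log(p_pos * (1 - p_neg) / ((1 - p_pos) * p_neg)),
            "cdm": abs(math.log(p_pos / p_neg)),
        }
    return scores


def top_k(scores, measure, k):
    """Return the k highest-scoring terms for one measure."""
    ranked = sorted(scores, key=lambda t: scores[t][measure], reverse=True)
    return ranked[:k]
```

In a pipeline like the one described, the terms returned by `top_k` for a chosen measure would be the features passed on to association rule generation. Note that the odds ratio is signed (terms typical of other classes score negative), whereas CDM takes the absolute value and so also ranks highly the terms whose absence signals the class.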

Original language: English
Pages (from-to): 7807-7822
Number of pages: 16
Journal: Arabian Journal for Science and Engineering
Volume: 39
Issue number: 11
DOI: 10.1007/s13369-014-1304-3
Publication status: Published - 25 Oct 2014

Fingerprint

Feature extraction
Classifiers
Association rules
Testing
Experiments

Keywords

  • Arabic text
  • Associative classification
  • Category association rule
  • Feature ranking
  • Feature selection
  • Noun extraction

ASJC Scopus subject areas

  • General

Cite this

Integrating Noun-Based Feature Ranking and Selection Methods with Arabic Text Associative Classification Approach. / Ghareb, Abdullah S.; Hamdan, Abdul Razak; Abu Bakar, Azuraliza.

In: Arabian Journal for Science and Engineering, Vol. 39, No. 11, 25.10.2014, p. 7807-7822.

@article{1ba4b2022c9a44d5b1457003384e6f6b,
title = "Integrating Noun-Based Feature Ranking and Selection Methods with Arabic Text Associative Classification Approach",
abstract = "Feature ranking and selection (FR\&S) is an important preprocessing phase for text classification; in most cases it produces a small, valuable feature subspace from the whole feature space and reduces classification errors. Because the associative classification (AC) approach is an efficient method whose training and testing depend on how features are ranked and selected, examining feature ranking methods is important. This paper presents a method that integrates Arabic noun extraction with four FR\&S methods: term frequency–inverse document frequency (TF-IDF), document frequency, odds ratio, and class discriminating measure (CDM). Association rule technology uses the result of the integrated feature selection to construct an Arabic text associative classifier. In this study, AC uses the majority voting and ordered decision list prediction methods to assign each test document to its category. A set of experiments is conducted on a collection of Arabic text documents, and the experimental results show that our AC method works better with extracted nouns combined with a feature selection method than with a feature selection method alone. The AC based on the CDM and TF-IDF methods outperforms the other methods in terms of accuracy. As the results indicate, the proposed method produces satisfactory classification accuracy and has a good feature selection effect on the Arabic text associative classifier.",
keywords = "Arabic text, Associative classification, Category association rule, Feature ranking, Feature selection, Noun extraction",
author = "Ghareb, {Abdullah S.} and Hamdan, {Abdul Razak} and {Abu Bakar}, Azuraliza",
year = "2014",
month = "10",
day = "25",
doi = "10.1007/s13369-014-1304-3",
language = "English",
volume = "39",
pages = "7807--7822",
journal = "Arabian Journal for Science and Engineering",
issn = "1319-8025",
publisher = "King Fahd University of Petroleum and Minerals",
number = "11",

}

TY - JOUR

T1 - Integrating Noun-Based Feature Ranking and Selection Methods with Arabic Text Associative Classification Approach

AU - Ghareb, Abdullah S.

AU - Hamdan, Abdul Razak

AU - Abu Bakar, Azuraliza

PY - 2014/10/25

Y1 - 2014/10/25

N2 - Feature ranking and selection (FR&S) is an important preprocessing phase for text classification; in most cases it produces a small, valuable feature subspace from the whole feature space and reduces classification errors. Because the associative classification (AC) approach is an efficient method whose training and testing depend on how features are ranked and selected, examining feature ranking methods is important. This paper presents a method that integrates Arabic noun extraction with four FR&S methods: term frequency–inverse document frequency (TF-IDF), document frequency, odds ratio, and class discriminating measure (CDM). Association rule technology uses the result of the integrated feature selection to construct an Arabic text associative classifier. In this study, AC uses the majority voting and ordered decision list prediction methods to assign each test document to its category. A set of experiments is conducted on a collection of Arabic text documents, and the experimental results show that our AC method works better with extracted nouns combined with a feature selection method than with a feature selection method alone. The AC based on the CDM and TF-IDF methods outperforms the other methods in terms of accuracy. As the results indicate, the proposed method produces satisfactory classification accuracy and has a good feature selection effect on the Arabic text associative classifier.

KW - Arabic text

KW - Associative classification

KW - Category association rule

KW - Feature ranking

KW - Feature selection

KW - Noun extraction

UR - http://www.scopus.com/inward/record.url?scp=84909989533&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84909989533&partnerID=8YFLogxK

U2 - 10.1007/s13369-014-1304-3

DO - 10.1007/s13369-014-1304-3

M3 - Article

AN - SCOPUS:84909989533

VL - 39

SP - 7807

EP - 7822

JO - Arabian Journal for Science and Engineering

JF - Arabian Journal for Science and Engineering

SN - 1319-8025

IS - 11

ER -