Hybridizing ReliefF, mRMR filters and GA wrapper approaches for gene selection

Salam Salameh Shreem, Salwani Abdullah, Mohd Zakree Ahmad Nazri, Malek Alzaqebah

Research output: Contribution to journal (Article)

20 Citations (Scopus)

Abstract

Gene expression data comprise a huge number of genes but only a few samples that can be used to address supervised classification problems. This paper aims to identify a small set of genes that efficiently distinguishes various types of biological samples; to that end, we propose a three-stage gene selection algorithm for genomic data. The proposed approach, denoted R-m-GA, combines ReliefF, mRMR (Minimum Redundancy Maximum Relevance) and a GA (Genetic Algorithm). In the first stage, the candidate gene set is identified by applying ReliefF. The second stage minimizes redundancy using the mRMR method, which facilitates the selection of an effective gene subset from the candidate set. In the third stage, a GA with a classifier (used as the fitness function by the GA) is applied to choose the most discriminating genes. The proposed method is validated on tumor datasets such as CNS, DLBCL and Prostate cancer, using the IB1 classifier. A comparative analysis of R-m-GA against GA and ReliefF-GA shows that the proposed method is capable of finding the smallest gene subset that offers the highest classification accuracy.
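The three stages described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: basic Relief (one nearest hit and one nearest miss) stands in for ReliefF, absolute Pearson correlation stands in for the mutual information normally used by mRMR, and IB1 is assumed to behave like a 1-nearest-neighbour classifier scored by leave-one-out accuracy. All function names, parameters and GA settings (`r_m_ga`, `n_candidates`, `n_mrmr`, population size, mutation rate) are hypothetical.

```python
import math
import random

def dist(a, b, genes):
    """Euclidean distance between two samples over the given gene indices."""
    return math.sqrt(sum((a[g] - b[g]) ** 2 for g in genes))

def relief_scores(X, y):
    """Stage 1 stand-in: basic Relief with one nearest hit and miss per sample."""
    genes = range(len(X[0]))
    w = [0.0] * len(X[0])
    for i, xi in enumerate(X):
        hits = [j for j in range(len(X)) if j != i and y[j] == y[i]]
        misses = [j for j in range(len(X)) if y[j] != y[i]]
        if not hits or not misses:
            continue
        h = min(hits, key=lambda j: dist(xi, X[j], genes))
        m = min(misses, key=lambda j: dist(xi, X[j], genes))
        for g in genes:
            # Reward genes that differ across classes and agree within a class.
            w[g] += abs(xi[g] - X[m][g]) - abs(xi[g] - X[h][g])
    return w

def abs_corr(u, v):
    """Absolute Pearson correlation (an assumed proxy for mutual information)."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return abs(cov / (su * sv)) if su > 0 and sv > 0 else 0.0

def mrmr(X, y, candidates, k):
    """Stage 2: greedily add the gene with the best relevance minus mean redundancy."""
    col = {g: [row[g] for row in X] for g in candidates}
    rel = {g: abs_corr(col[g], y) for g in candidates}
    selected = [max(candidates, key=lambda g: rel[g])]
    while len(selected) < min(k, len(candidates)):
        def score(g):
            red = sum(abs_corr(col[g], col[s]) for s in selected) / len(selected)
            return rel[g] - red
        selected.append(max((g for g in candidates if g not in selected), key=score))
    return selected

def loo_1nn_accuracy(X, y, genes):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on `genes`."""
    if not genes:
        return 0.0
    correct = 0
    for i, xi in enumerate(X):
        nn = min((j for j in range(len(X)) if j != i),
                 key=lambda j: dist(xi, X[j], genes))
        correct += y[nn] == y[i]
    return correct / len(X)

def ga_select(X, y, pool, generations=20, pop_size=12, seed=0):
    """Stage 3: bit-mask GA; fitness prefers higher accuracy, then fewer genes."""
    rng = random.Random(seed)

    def fitness(mask):
        genes = [g for g, bit in zip(pool, mask) if bit]
        return (loo_1nn_accuracy(X, y, genes), -len(genes))

    pop = [[rng.randint(0, 1) for _ in pool] for _ in range(pop_size)]
    pop[0] = [1] * len(pool)  # always include the full pool as one individual
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # elitism: the best half survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(pool)) if len(pool) > 1 else 0
            child = a[:cut] + b[cut:]  # one-point crossover
            if rng.random() < 0.2:  # occasional single-bit mutation
                i = rng.randrange(len(pool))
                child[i] = 1 - child[i]
            children.append(child)
        pop = parents + children
    best = max(pop, key=fitness)
    return [g for g, bit in zip(pool, best) if bit]

def r_m_ga(X, y, n_candidates, n_mrmr):
    """Chain the three stages: Relief ranking, mRMR pruning, GA wrapper."""
    w = relief_scores(X, y)
    candidates = sorted(range(len(w)), key=lambda g: w[g], reverse=True)[:n_candidates]
    pool = mrmr(X, y, candidates, n_mrmr)
    return ga_select(X, y, pool)
```

Here `X` is a list of samples (each a list of expression values) and `y` a list of integer class labels; a call such as `r_m_ga(X, y, n_candidates=50, n_mrmr=20)` returns the indices of the selected genes. The fitness tuple orders candidates by accuracy first and subset size second, which mirrors the stated goal of finding the smallest subset with the highest classification accuracy.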

Original language: English
Pages (from-to): 1034-1039
Number of pages: 6
Journal: Journal of Theoretical and Applied Information Technology
Volume: 46
Issue number: 2
Publication status: Published - 2012

Fingerprint

Gene Selection
Wrapper
Redundancy
Genes
Genetic algorithms
Genetic Algorithm
Filter
Gene
Classifiers
Classifier
Prostate Cancer
Subset
Supervised Classification
Fitness Function
Gene Expression Data
Relevance
Comparative Analysis
Gene expression
Classification Problems
Genomics

Keywords

  • Gene selection
  • Genetic algorithm
  • Microarray datasets
  • mRMR
  • Relief

ASJC Scopus subject areas

  • Computer Science(all)
  • Theoretical Computer Science

Cite this

Hybridizing ReliefF, mRMR filters and GA wrapper approaches for gene selection. / Shreem, Salam Salameh; Abdullah, Salwani; Ahmad Nazri, Mohd Zakree; Alzaqebah, Malek.

In: Journal of Theoretical and Applied Information Technology, Vol. 46, No. 2, 2012, p. 1034-1039.

@article{16eb993675634725aa4c26ccfa6e1991,
title = "Hybridizing {ReliefF}, {mRMR} filters and {GA} wrapper approaches for gene selection",
abstract = "Gene expression data comprise a huge number of genes but only a few samples that can be used to address supervised classification problems. This paper aims to identify a small set of genes that efficiently distinguishes various types of biological samples; to that end, we propose a three-stage gene selection algorithm for genomic data. The proposed approach, denoted R-m-GA, combines ReliefF, mRMR (Minimum Redundancy Maximum Relevance) and a GA (Genetic Algorithm). In the first stage, the candidate gene set is identified by applying ReliefF. The second stage minimizes redundancy using the mRMR method, which facilitates the selection of an effective gene subset from the candidate set. In the third stage, a GA with a classifier (used as the fitness function by the GA) is applied to choose the most discriminating genes. The proposed method is validated on tumor datasets such as CNS, DLBCL and Prostate cancer, using the IB1 classifier. A comparative analysis of R-m-GA against GA and ReliefF-GA shows that the proposed method is capable of finding the smallest gene subset that offers the highest classification accuracy.",
keywords = "Gene selection, Genetic algorithm, Microarray datasets, mRMR, Relief",
author = "Shreem, {Salam Salameh} and Salwani Abdullah and {Ahmad Nazri}, {Mohd Zakree} and Malek Alzaqebah",
year = "2012",
language = "English",
volume = "46",
pages = "1034--1039",
journal = "Journal of Theoretical and Applied Information Technology",
issn = "1992-8645",
publisher = "Asian Research Publishing Network (ARPN)",
number = "2",

}

TY - JOUR

T1 - Hybridizing ReliefF, mRMR filters and GA wrapper approaches for gene selection

AU - Shreem, Salam Salameh

AU - Abdullah, Salwani

AU - Ahmad Nazri, Mohd Zakree

AU - Alzaqebah, Malek

PY - 2012

Y1 - 2012

N2 - Gene expression data comprise a huge number of genes but only a few samples that can be used to address supervised classification problems. This paper aims to identify a small set of genes that efficiently distinguishes various types of biological samples; to that end, we propose a three-stage gene selection algorithm for genomic data. The proposed approach, denoted R-m-GA, combines ReliefF, mRMR (Minimum Redundancy Maximum Relevance) and a GA (Genetic Algorithm). In the first stage, the candidate gene set is identified by applying ReliefF. The second stage minimizes redundancy using the mRMR method, which facilitates the selection of an effective gene subset from the candidate set. In the third stage, a GA with a classifier (used as the fitness function by the GA) is applied to choose the most discriminating genes. The proposed method is validated on tumor datasets such as CNS, DLBCL and Prostate cancer, using the IB1 classifier. A comparative analysis of R-m-GA against GA and ReliefF-GA shows that the proposed method is capable of finding the smallest gene subset that offers the highest classification accuracy.

KW - Gene selection

KW - Genetic algorithm

KW - Microarray datasets

KW - mRMR

KW - Relief

UR - http://www.scopus.com/inward/record.url?scp=84872010628&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84872010628&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:84872010628

VL - 46

SP - 1034

EP - 1039

JO - Journal of Theoretical and Applied Information Technology

JF - Journal of Theoretical and Applied Information Technology

SN - 1992-8645

IS - 2

ER -