Metaheuristic approach for an enhanced mRMR filter method for classification using drug response microarray data

Nur Shazila Mohamed, Suhaila Zainudin, Zulaiha Ali Othman

Research output: Contribution to journal › Article

10 Citations (Scopus)

Abstract

Quality data mining analysis based on microarray gene expression data is a good approach for disease classification and other fields, such as pharmacology, as well as a useful tool for medical innovation. One of the challenges in classification is that microarrays involve high dimensionality and a large number of redundant and irrelevant features. Feature selection is the most popular method for determining the optimal number of features that will be used for classification. Feature selection is important to accelerate learning, which is represented only by the optimal feature subset. The current approach for microarray feature selection for the filter method is to simply select the top-ranked genes, i.e., keeping the 50 or 100 best-ranked genes. However, the current approach is determined by human intuition; it requires trial and error, and thus, is time-consuming. Accordingly, this study aims to propose a metaheuristic approach for selecting the top n relevant genes in drug microarray data to enhance the minimum redundancy–maximum relevance (mRMR) filter method. Three metaheuristics are applied, namely, particle swarm optimization (PSO), cuckoo search (CS), and artificial bee colony (ABC). Subsequently, k-nearest neighbor and support vector machine are used as classifiers to evaluate classification performance. The experiment used a microarray gene dataset of liver xenobiotic and pharmacological responses. Experimental results show that the metaheuristic approaches are more efficient and reduce the complexity of the classifier. Furthermore, the results show that mRMR-CS exhibits the best performance compared with mRMR-PSO and mRMR-ABC.
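The idea of searching for the number n of top-ranked genes, rather than fixing it by hand, can be sketched roughly as follows. This is only an illustration under stated assumptions and not the authors' implementation: the feature ranking is assumed to come from a filter such as mRMR (it is hard-coded here), a plain random search stands in for PSO, CS, or ABC, a leave-one-out 1-nearest-neighbor classifier serves as the fitness evaluator, and the names `loo_knn_accuracy`, `search_top_n`, and the `penalty` weight are hypothetical.

```python
import random


def loo_knn_accuracy(X, y, features, k=1):
    """Leave-one-out accuracy of a k-NN classifier restricted to `features`."""
    correct = 0
    for i, xi in enumerate(X):
        # Squared Euclidean distance to every other sample on the chosen features.
        dists = sorted(
            (sum((xi[f] - xj[f]) ** 2 for f in features), y[j])
            for j, xj in enumerate(X)
            if j != i
        )
        votes = [label for _, label in dists[:k]]
        predicted = max(set(votes), key=votes.count)
        correct += predicted == y[i]
    return correct / len(X)


def search_top_n(X, y, ranking, iterations=30, penalty=0.01, seed=0):
    """Random search over n, the number of top-ranked features to keep.

    Random search is a simple stand-in for a metaheuristic; the fitness
    rewards accuracy and (as an assumption) lightly penalises larger n.
    """
    rng = random.Random(seed)
    best_n, best_fit = None, float("-inf")
    for _ in range(iterations):
        n = rng.randint(1, len(ranking))
        accuracy = loo_knn_accuracy(X, y, ranking[:n])
        fitness = accuracy - penalty * n
        if fitness > best_fit:
            best_n, best_fit = n, fitness
    return best_n, best_fit


# Toy data (made up): feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 3.0], [0.1, 1.0], [0.2, 2.0], [1.0, 1.5], [1.1, 2.5], [1.2, 0.5]]
y = [0, 0, 0, 1, 1, 1]
ranking = [0, 1]  # assumed output of a filter ranking, best feature first

best_n, best_fit = search_top_n(X, y, ranking)
print(best_n, best_fit)
```

Swapping the random search for a population-based optimizer would change only the loop that proposes candidate values of n; the fitness evaluation stays the same.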

Original language: English
Pages (from-to): 224-231
Number of pages: 8
Journal: Expert Systems with Applications
Volume: 90
DOI: 10.1016/j.eswa.2017.08.026
Publication status: Published - 30 Dec 2017

Fingerprint

Microarrays
Genes
Feature extraction
Particle swarm optimization (PSO)
Classifiers
Gene expression
Liver
Support vector machines
Data mining
Innovation
Experiments

Keywords

  • Classification
  • Data mining
  • Feature selection
  • Filter
  • Microarray

ASJC Scopus subject areas

  • Engineering(all)
  • Computer Science Applications
  • Artificial Intelligence

Cite this

@article{e886f3072bec42c0a0b58653cf9a37c8,
title = "Metaheuristic approach for an enhanced mRMR filter method for classification using drug response microarray data",
abstract = "Quality data mining analysis based on microarray gene expression data is a good approach for disease classification and other fields, such as pharmacology, as well as a useful tool for medical innovation. One of the challenges in classification is that microarrays involve high dimensionality and a large number of redundant and irrelevant features. Feature selection is the most popular method for determining the optimal number of features that will be used for classification. Feature selection is important to accelerate learning, which is represented only by the optimal feature subset. The current approach for microarray feature selection for the filter method is to simply select the top-ranked genes, i.e., keeping the 50 or 100 best-ranked genes. However, the current approach is determined by human intuition; it requires trial and error, and thus, is time-consuming. Accordingly, this study aims to propose a metaheuristic approach for selecting the top n relevant genes in drug microarray data to enhance the minimum redundancy–maximum relevance (mRMR) filter method. Three metaheuristics are applied, namely, particle swarm optimization (PSO), cuckoo search (CS), and artificial bee colony (ABC). Subsequently, k-nearest neighbor and support vector machine are used as classifiers to evaluate classification performance. The experiment used a microarray gene dataset of liver xenobiotic and pharmacological responses. Experimental results show that the metaheuristic approaches are more efficient and reduce the complexity of the classifier. Furthermore, the results show that mRMR-CS exhibits the best performance compared with mRMR-PSO and mRMR-ABC.",
keywords = "Classification, Data mining, Feature selection, Filter, Microarray",
author = "Mohamed, {Nur Shazila} and Suhaila Zainudin and {Ali Othman}, Zulaiha",
year = "2017",
month = "12",
day = "30",
doi = "10.1016/j.eswa.2017.08.026",
language = "English",
volume = "90",
pages = "224--231",
journal = "Expert Systems with Applications",
issn = "0957-4174",
publisher = "Elsevier Limited",

}

TY - JOUR

T1 - Metaheuristic approach for an enhanced mRMR filter method for classification using drug response microarray data

AU - Mohamed, Nur Shazila

AU - Zainudin, Suhaila

AU - Ali Othman, Zulaiha

PY - 2017/12/30

Y1 - 2017/12/30

AB - Quality data mining analysis based on microarray gene expression data is a good approach for disease classification and other fields, such as pharmacology, as well as a useful tool for medical innovation. One of the challenges in classification is that microarrays involve high dimensionality and a large number of redundant and irrelevant features. Feature selection is the most popular method for determining the optimal number of features that will be used for classification. Feature selection is important to accelerate learning, which is represented only by the optimal feature subset. The current approach for microarray feature selection for the filter method is to simply select the top-ranked genes, i.e., keeping the 50 or 100 best-ranked genes. However, the current approach is determined by human intuition; it requires trial and error, and thus, is time-consuming. Accordingly, this study aims to propose a metaheuristic approach for selecting the top n relevant genes in drug microarray data to enhance the minimum redundancy–maximum relevance (mRMR) filter method. Three metaheuristics are applied, namely, particle swarm optimization (PSO), cuckoo search (CS), and artificial bee colony (ABC). Subsequently, k-nearest neighbor and support vector machine are used as classifiers to evaluate classification performance. The experiment used a microarray gene dataset of liver xenobiotic and pharmacological responses. Experimental results show that the metaheuristic approaches are more efficient and reduce the complexity of the classifier. Furthermore, the results show that mRMR-CS exhibits the best performance compared with mRMR-PSO and mRMR-ABC.

KW - Classification

KW - Data mining

KW - Feature selection

KW - Filter

KW - Microarray

UR - http://www.scopus.com/inward/record.url?scp=85028024950&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85028024950&partnerID=8YFLogxK

U2 - 10.1016/j.eswa.2017.08.026

DO - 10.1016/j.eswa.2017.08.026

M3 - Article

VL - 90

SP - 224

EP - 231

JO - Expert Systems with Applications

JF - Expert Systems with Applications

SN - 0957-4174

ER -