Classification techniques for drug response microarray data

Nur Shazila Mohamed, Suhaila Zainudin, Zulaiha Ali Othman

Research output: Contribution to journal > Article

Abstract

Understanding possible toxicity in the human body is crucial in early drug discovery to reduce the risk of unexpected side effects. Microarray data have been widely used in disease classification and diagnosis; however, their use in drug response studies is still minimal. The main challenge of drug response microarray data is its very high-dimensional feature space combined with a relatively small number of instances. Past research attempted to develop a drug response classification model using the Nearest Neighbor technique with Minimum Redundancy Maximum Relevance (mRMR) feature selection, but the accuracy was relatively low (63.9%). Therefore, this work applies several machine learning techniques, namely Nearest Neighbor (NN), Naïve Bayes, Decision Tree and Support Vector Machine (SVM), to drug response microarray data to determine the most efficient technique for classifying microarray drug responses. The experiments were conducted on two data sets, containing 9852 features and 141 features (selected by mRMR) respectively, using 600 instances. Further experiments were conducted with larger numbers of instances to assess the performance of the techniques. The results show that NN and SVM were the best classifiers for drug responses: SVM was the best technique for more complex feature sets, while NN gave the best results with fewer features. The results also show that higher accuracy is obtained with a larger number of instances.
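The comparison described in the abstract (the same classifier evaluated on a full feature set and on a reduced one) can be illustrated with a small sketch. This is not the authors' code or data: the synthetic data generator, the hold-out split, and the function names below are assumptions made for illustration, and the "reduced" set simply keeps the features known to be informative rather than running mRMR.

```python
import math
import random


def nearest_neighbor_predict(train_X, train_y, x):
    """Return the label of the training instance closest to x (1-NN, Euclidean)."""
    best_label, best_dist = None, math.inf
    for row, label in zip(train_X, train_y):
        dist = sum((a - b) ** 2 for a, b in zip(row, x))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def holdout_accuracy(X, y, n_train):
    """Train on the first n_train instances and score on the remainder."""
    train_X, train_y = X[:n_train], y[:n_train]
    test_X, test_y = X[n_train:], y[n_train:]
    hits = sum(
        nearest_neighbor_predict(train_X, train_y, x) == label
        for x, label in zip(test_X, test_y)
    )
    return hits / len(test_y)


def make_synthetic(n_instances, n_features, n_informative, seed=0):
    """Hypothetical stand-in for expression data: only the first n_informative
    features differ between the two classes; the rest are pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_instances):
        label = rng.randint(0, 1)
        row = [
            rng.gauss(2.0 * label if j < n_informative else 0.0, 1.0)
            for j in range(n_features)
        ]
        X.append(row)
        y.append(label)
    return X, y


# Many features, few instances: the shape of problem the abstract describes.
X, y = make_synthetic(n_instances=120, n_features=200, n_informative=5)
acc_all = holdout_accuracy(X, y, n_train=80)

# Oracle feature subset (assumption): keep only the informative columns.
X_reduced = [row[:5] for row in X]
acc_reduced = holdout_accuracy(X_reduced, y, n_train=80)

print(f"all features: {acc_all:.2f}, reduced features: {acc_reduced:.2f}")
```

A real study would replace the oracle subset with a feature selection method such as mRMR, fitted on the training portion only, and would use cross-validation rather than a single hold-out split.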

Original language: English
Pages (from-to): 13-24
Number of pages: 12
Journal: Journal of Next Generation Information Technology
Volume: 5
Issue number: 3
Publication status: Published - 2014

Fingerprint

Microarrays
Support vector machines
Redundancy
Decision trees
Toxicity
Learning systems
Feature extraction
Classifiers
Experiments

Keywords

  • Classification
  • Data mining
  • Drug discovery
  • Drug response
  • Microarray
  • Toxicity

ASJC Scopus subject areas

  • Computer Science (all)

Cite this

Classification techniques for drug response microarray data. / Mohamed, Nur Shazila; Zainudin, Suhaila; Ali Othman, Zulaiha.

In: Journal of Next Generation Information Technology, Vol. 5, No. 3, 2014, p. 13-24.


@article{83a1c431be9c422483def3b9648bae80,
title = "Classification techniques for drug response microarray data",
abstract = "Understanding possible toxicity in the human body is crucial in early drug discovery to reduce the risk of unexpected side effects. Microarray data have been widely used in disease classification and diagnosis; however, their use in drug response studies is still minimal. The main challenge of drug response microarray data is its very high-dimensional feature space combined with a relatively small number of instances. Past research attempted to develop a drug response classification model using the Nearest Neighbor technique with Minimum Redundancy Maximum Relevance (mRMR) feature selection, but the accuracy was relatively low (63.9{\%}). Therefore, this work applies several machine learning techniques, namely Nearest Neighbor (NN), Na{\"i}ve Bayes, Decision Tree and Support Vector Machine (SVM), to drug response microarray data to determine the most efficient technique for classifying microarray drug responses. The experiments were conducted on two data sets, containing 9852 features and 141 features (selected by mRMR) respectively, using 600 instances. Further experiments were conducted with larger numbers of instances to assess the performance of the techniques. The results show that NN and SVM were the best classifiers for drug responses: SVM was the best technique for more complex feature sets, while NN gave the best results with fewer features. The results also show that higher accuracy is obtained with a larger number of instances.",
keywords = "Classification, Data mining, Drug discovery, Drug response, Microarray, Toxicity",
author = "Mohamed, Nur Shazila and Zainudin, Suhaila and {Ali Othman}, Zulaiha",
year = "2014",
language = "English",
volume = "5",
pages = "13--24",
journal = "Journal of Next Generation Information Technology",
issn = "2092-8637",
publisher = "Advanced Institute of Convergence Information Technology Research Center",
number = "3",

}

TY - JOUR

T1 - Classification techniques for drug response microarray data

AU - Mohamed, Nur Shazila

AU - Zainudin, Suhaila

AU - Ali Othman, Zulaiha

PY - 2014

Y1 - 2014

N2 - Understanding possible toxicity in the human body is crucial in early drug discovery to reduce the risk of unexpected side effects. Microarray data have been widely used in disease classification and diagnosis; however, their use in drug response studies is still minimal. The main challenge of drug response microarray data is its very high-dimensional feature space combined with a relatively small number of instances. Past research attempted to develop a drug response classification model using the Nearest Neighbor technique with Minimum Redundancy Maximum Relevance (mRMR) feature selection, but the accuracy was relatively low (63.9%). Therefore, this work applies several machine learning techniques, namely Nearest Neighbor (NN), Naïve Bayes, Decision Tree and Support Vector Machine (SVM), to drug response microarray data to determine the most efficient technique for classifying microarray drug responses. The experiments were conducted on two data sets, containing 9852 features and 141 features (selected by mRMR) respectively, using 600 instances. Further experiments were conducted with larger numbers of instances to assess the performance of the techniques. The results show that NN and SVM were the best classifiers for drug responses: SVM was the best technique for more complex feature sets, while NN gave the best results with fewer features. The results also show that higher accuracy is obtained with a larger number of instances.

KW - Classification

KW - Data mining

KW - Drug discovery

KW - Drug response

KW - Microarray

KW - Toxicity

UR - http://www.scopus.com/inward/record.url?scp=84930038333&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84930038333&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:84930038333

VL - 5

SP - 13

EP - 24

JO - Journal of Next Generation Information Technology

JF - Journal of Next Generation Information Technology

SN - 2092-8637

IS - 3

ER -