Experimental study of different FSAs in classifying protein function

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

This paper addresses one of the challenges of machine learning: improving performance through feature selection algorithms (FSAs). The application of FSAs in the bioinformatics domain has become a necessity due to the enormous growth of public sequence databases. This paper provides an experimental framework for the use of Rough Set Theory (RST) as an FSA in finding minimal feature subsets for classifying protein function. In experimenting with RST, three different recent models are also explored: Correlation Feature Selection (CFS), Fast Correlation-Based Filter (FCBF), and Artificial Immune System (AIS). The experimental study of these FSAs is based on four criteria: the accuracy (AC), the area under the ROC curve (ROC), the length of the reducts (ARL), and the time taken (TT). Classification was performed on the reduced feature set using the Support Vector Machine algorithm. The results demonstrate that CFS and FCBF perform better if the main objectives are accuracy and ROC; however, in terms of duration and rule length, RST is a better choice.
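The abstract names FCBF as one of the compared FSAs but gives no implementation details. As a rough illustration only, the sketch below follows the FCBF idea as it is commonly described: rank features by their symmetrical uncertainty (SU) with the class, then drop any feature that is at least as correlated with an already selected feature as it is with the class. The function names, the threshold `delta=0.01`, and the toy data are assumptions made for this example and do not come from the paper. The SVM classification step is omitted, and only two of the four criteria (subset length and time taken) are measured.

```python
import math
import time
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a discrete sequence."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * IG(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0
    info_gain = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * info_gain / (hx + hy)


def fcbf(columns, labels, delta=0.01):
    """Return indices of selected features (FCBF-style sketch, discrete data)."""
    # Relevance step: keep features whose SU with the class reaches delta,
    # ranked from most to least relevant.
    su_class = [symmetrical_uncertainty(col, labels) for col in columns]
    ranked = sorted(
        (j for j in range(len(columns)) if su_class[j] >= delta),
        key=lambda j: -su_class[j],
    )
    # Redundancy step: keep a feature only if no already selected feature
    # is at least as correlated with it as it is with the class.
    selected = []
    for j in ranked:
        if all(
            symmetrical_uncertainty(columns[p], columns[j]) < su_class[j]
            for p in selected
        ):
            selected.append(j)
    return selected


# Hypothetical toy data: four classes encoded by two independent binary features.
labels = [0, 0, 1, 1, 2, 2, 3, 3]
columns = [
    [0, 0, 0, 0, 1, 1, 1, 1],  # feature 0: informative
    [0, 0, 0, 0, 1, 1, 1, 1],  # feature 1: exact copy of feature 0 (redundant)
    [0, 1, 0, 1, 0, 1, 0, 1],  # feature 2: independent of the class (irrelevant)
    [0, 0, 1, 1, 0, 0, 1, 1],  # feature 3: informative, independent of feature 0
]

start = time.perf_counter()
selected = fcbf(columns, labels, delta=0.01)
elapsed = time.perf_counter() - start

# On this toy data, selected == [0, 3]: the copy and the irrelevant feature are dropped.
print("selected features:", selected)
print("subset length:", len(selected))
print("time taken (s):", elapsed)
```

In a fuller comparison along the lines the abstract describes, the selected columns would then be passed to a classifier and scored on accuracy and ROC as well.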

Original language: English
Title of host publication: SoCPaR 2009 - Soft Computing and Pattern Recognition
Pages: 516-521
Number of pages: 6
DOI: https://doi.org/10.1109/SoCPaR.2009.104
Publication status: Published - 2009
Event: International Conference on Soft Computing and Pattern Recognition, SoCPaR 2009 - Malacca
Duration: 4 Dec 2009 to 7 Dec 2009

Fingerprint

Feature extraction
Proteins
Rough set theory
Immune system
Bioinformatics
Set theory
Support vector machines
Learning systems

Keywords

  • Classification
  • Feature selection algorithms
  • Protein function
  • Protein sequences

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Vision and Pattern Recognition
  • Software

Cite this

Rahman, S. A., Mohamed Hussein, Z. A., & Abu Bakar, A. (2009). Experimental study of different FSAs in classifying protein function. In SoCPaR 2009 - Soft Computing and Pattern Recognition (pp. 516-521). [5368659] https://doi.org/10.1109/SoCPaR.2009.104

@inproceedings{ce2279eb7a174d75946d786c92f30978,
title = "Experimental study of different FSAs in classifying protein function",
keywords = "Classification, Feature selection algorithms, Protein function, Protein sequences",
author = "Rahman, {Shuzlina Abdul} and {Mohamed Hussein}, {Zeti Azura} and {Abu Bakar}, Azuraliza",
year = "2009",
doi = "10.1109/SoCPaR.2009.104",
language = "English",
isbn = "9780769538792",
pages = "516--521",
booktitle = "SoCPaR 2009 - Soft Computing and Pattern Recognition",

}

TY - GEN
T1 - Experimental study of different FSAs in classifying protein function
AU - Rahman, Shuzlina Abdul
AU - Mohamed Hussein, Zeti Azura
AU - Abu Bakar, Azuraliza
PY - 2009
Y1 - 2009
KW - Classification
KW - Feature selection algorithms
KW - Protein function
KW - Protein sequences
UR - http://www.scopus.com/inward/record.url?scp=77649324873&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77649324873&partnerID=8YFLogxK
U2 - 10.1109/SoCPaR.2009.104
DO - 10.1109/SoCPaR.2009.104
M3 - Conference contribution
AN - SCOPUS:77649324873
SN - 9780769538792
SP - 516
EP - 521
BT - SoCPaR 2009 - Soft Computing and Pattern Recognition
ER -