Data mining framework for protein function prediction

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

Determining the functions of uncharacterized proteins from their sequences remains a challenge despite the growing number of prediction methods. This is due to the inherent limitations of current tools and databases and to the ambiguity in how function is defined. Additionally, standard methods of functional assignment, which rely on sequence alignment to a gene of known function, often fail to find significant matches. This paper proposes a machine learning framework for predicting protein function irrespective of sequence similarity. The framework aims to provide a workflow for predicting protein function that combines data mining and machine learning algorithms. It involves three main components: pre-processing, model development, and testing and evaluation. The study is expected to produce a new feature selection method for predicting protein functional classes and to complement the existing conventional methods of functional assignment.

Original language: English
Title of host publication: Proceedings - International Symposium on Information Technology 2008, ITSim
Volume: 2
DOIs: https://doi.org/10.1109/ITSIM.2008.4631683
Publication status: Published - 2008
Event: International Symposium on Information Technology 2008, ITSim - Kuala Lumpur
Duration: 26 Aug 2008 to 29 Aug 2008

Other

Other: International Symposium on Information Technology 2008, ITSim
City: Kuala Lumpur
Period: 26/8/08 to 29/8/08

Fingerprint

Data mining
Proteins
Learning systems
Learning algorithms
Feature extraction
Genes
Testing
Processing

ASJC Scopus subject areas

  • Artificial Intelligence
  • Information Systems
  • Control and Systems Engineering
  • Electrical and Electronic Engineering

Cite this

Rahman, S. A., Mohamed Hussein, Z. A., & Abu Bakar, A. (2008). Data mining framework for protein function prediction. In Proceedings - International Symposium on Information Technology 2008, ITSim (Vol. 2). [4631683] https://doi.org/10.1109/ITSIM.2008.4631683

@inproceedings{0abc8a487bf7464bb9b27380f4be7aa4,
title = "Data mining framework for protein function prediction",
author = "Rahman, {Shuzlina Abdul} and {Mohamed Hussein}, {Zeti Azura} and {Abu Bakar}, Azuraliza",
year = "2008",
doi = "10.1109/ITSIM.2008.4631683",
language = "English",
isbn = "9781424423286",
volume = "2",
booktitle = "Proceedings - International Symposium on Information Technology 2008, ITSim",

}

TY - GEN

T1 - Data mining framework for protein function prediction

AU - Rahman, Shuzlina Abdul

AU - Mohamed Hussein, Zeti Azura

AU - Abu Bakar, Azuraliza

PY - 2008

Y1 - 2008

UR - http://www.scopus.com/inward/record.url?scp=57349104125&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=57349104125&partnerID=8YFLogxK

U2 - 10.1109/ITSIM.2008.4631683

DO - 10.1109/ITSIM.2008.4631683

M3 - Conference contribution

AN - SCOPUS:57349104125

SN - 9781424423286

VL - 2

BT - Proceedings - International Symposium on Information Technology 2008, ITSim

ER -