Proteins of unknown function in the protein data bank (PDB): An inventory of true uncharacterized proteins and computational tools for their analysis

Nurul Nadzirin, Mohd Firdaus Mohd Raih

Research output: Contribution to journal › Article

25 Citations (Scopus)

Abstract

Proteins of uncharacterized function form a large part of many currently available biological databases, and this situation exists even in the Protein Data Bank (PDB). Our analysis of recent PDB data revealed that only 42.53% of PDB entries (1084 coordinate files) categorized under "unknown function" are true examples of proteins of unknown function at this point in time. The remaining 1465 entries also annotated as such could have their annotations re-assessed, based on the availability of direct functional characterization experiments for the protein itself, or for homologous sequences or structures, thus enabling computational function inference.
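
The two counts quoted in the abstract can be checked against the stated percentage. The sketch below assumes that the 1084 and 1465 entries are disjoint and together make up every entry annotated as "unknown function" (a total of 2549); the function name is hypothetical and is not taken from the paper.

```python
def unknown_function_share(true_unknown: int, reassessable: int) -> float:
    """Return the percentage of 'unknown function' entries that are truly uncharacterized.

    Assumes the two counts are disjoint and together cover all such entries.
    """
    total = true_unknown + reassessable
    if total == 0:
        raise ValueError("no entries to compare")
    return 100.0 * true_unknown / total


# Counts as quoted in the abstract.
share = unknown_function_share(1084, 1465)
print(f"{share:.2f}% of {1084 + 1465} entries")  # prints "42.53% of 2549 entries"
```

Under that assumption the figures are mutually consistent: 1084 of 2549 entries is 42.53% to two decimal places.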

Original language: English
Pages (from-to): 12761-12772
Number of pages: 12
Journal: International Journal of Molecular Sciences
Volume: 13
Issue number: 10
DOI: 10.3390/ijms131012761
Publication status: Published - Oct 2012

Fingerprint

Databases
Proteins
Equipment and Supplies
entry
annotations
Sequence Homology
files
inference
Availability
Experiments

Keywords

  • 3D motifs
  • Protein data bank
  • Proteins of uncharacterized function
  • Proteins of unknown function
  • Structural similarity

ASJC Scopus subject areas

  • Computer Science Applications
  • Molecular Biology
  • Catalysis
  • Inorganic Chemistry
  • Spectroscopy
  • Organic Chemistry
  • Physical and Theoretical Chemistry

Cite this

@article{9dcb88e70a2146c8814268c5df84c298,
title = "Proteins of unknown function in the protein data bank ({PDB}): An inventory of true uncharacterized proteins and computational tools for their analysis",
abstract = "Proteins of uncharacterized functions form a large part of many of the currently available biological databases and this situation exists even in the Protein Data Bank (PDB). Our analysis of recent PDB data revealed that only 42.53{\%} of PDB entries (1084 coordinate files) that were categorized under {"}unknown function{"} are true examples of proteins of unknown function at this point in time. The remainder 1465 entries also annotated as such appear to be able to have their annotations re-assessed, based on the availability of direct functional characterization experiments for the protein itself, or for homologous sequences or structures thus enabling computational function inference.",
keywords = "3D motifs, Protein data bank, Proteins of uncharacterized function, Proteins of unknown function, Structural similarity",
author = "Nurul Nadzirin and {Mohd Raih}, {Mohd Firdaus}",
year = "2012",
month = "10",
doi = "10.3390/ijms131012761",
language = "English",
volume = "13",
pages = "12761--12772",
journal = "International Journal of Molecular Sciences",
issn = "1661-6596",
publisher = "Multidisciplinary Digital Publishing Institute (MDPI)",
number = "10",
}

TY  - JOUR
T1  - Proteins of unknown function in the protein data bank (PDB)
T2  - An inventory of true uncharacterized proteins and computational tools for their analysis
AU  - Nadzirin, Nurul
AU  - Mohd Raih, Mohd Firdaus
PY  - 2012/10
Y1  - 2012/10
N2  - Proteins of uncharacterized functions form a large part of many of the currently available biological databases and this situation exists even in the Protein Data Bank (PDB). Our analysis of recent PDB data revealed that only 42.53% of PDB entries (1084 coordinate files) that were categorized under "unknown function" are true examples of proteins of unknown function at this point in time. The remainder 1465 entries also annotated as such appear to be able to have their annotations re-assessed, based on the availability of direct functional characterization experiments for the protein itself, or for homologous sequences or structures thus enabling computational function inference.
AB  - Proteins of uncharacterized functions form a large part of many of the currently available biological databases and this situation exists even in the Protein Data Bank (PDB). Our analysis of recent PDB data revealed that only 42.53% of PDB entries (1084 coordinate files) that were categorized under "unknown function" are true examples of proteins of unknown function at this point in time. The remainder 1465 entries also annotated as such appear to be able to have their annotations re-assessed, based on the availability of direct functional characterization experiments for the protein itself, or for homologous sequences or structures thus enabling computational function inference.
KW  - 3D motifs
KW  - Protein data bank
KW  - Proteins of uncharacterized function
KW  - Proteins of unknown function
KW  - Structural similarity
UR  - http://www.scopus.com/inward/record.url?scp=84867749014&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84867749014&partnerID=8YFLogxK
U2  - 10.3390/ijms131012761
DO  - 10.3390/ijms131012761
M3  - Article
C2  - 23202924
AN  - SCOPUS:84867749014
VL  - 13
SP  - 12761
EP  - 12772
JO  - International Journal of Molecular Sciences
JF  - International Journal of Molecular Sciences
SN  - 1661-6596
IS  - 10
ER  - 