Retrieval performance using different type of similarity coefficient for virtual screening

Shereena Arif, Noor Zeemah Shamsheh Khan, Nurul Malim, Suhaila Zainudin

Research output: Contribution to journal › Article

Abstract

Developing a new drug requires chemical databases as references for finding lead compounds. This study aims to determine the best similarity coefficient for virtual screening of chemical databases. We calculated the structural resemblance between each pair of chemical structures within the same activity class to obtain the Mean Pairwise Similarity (MPS) value, which characterizes the heterogeneity of each natural product and synthetic chemical database. Each structure was represented by a 2D descriptor, the ECFC4 fingerprint, and the Tanimoto coefficient was used to calculate the similarity score between each pair of chemical structures in the same activity class. The MPS for an activity class was obtained by averaging all similarity scores within that class. Next, three similarity coefficients were used to calculate the similarity score between a query structure and each database structure. The results indicate that the Tanimoto coefficient performs better than the Russell-Rao and Forbes coefficients in retrieval tasks on chemical databases. The Tanimoto coefficient is therefore recommended for virtual screening in drug development. Further work is needed to determine the combination of similarity coefficient and fingerprint type that gives optimal retrieval performance.
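The calculations described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the formulas, so the standard binary forms of the three coefficients are assumed, with a = bits set in both fingerprints, b and c = bits set in only one of them, and n = fingerprint length. Fingerprints are simplified to sets of "on" bit positions (ECFC4 itself is count-based), and the function names, the toy data and the 8-bit length are hypothetical.

```python
from itertools import combinations


def _counts(fp_a, fp_b, n_bits):
    """Return (a, b, c, d): bits in both, only in A, only in B, in neither."""
    a = len(fp_a & fp_b)
    b = len(fp_a - fp_b)
    c = len(fp_b - fp_a)
    d = n_bits - a - b - c
    return a, b, c, d


def tanimoto(fp_a, fp_b, n_bits):
    """Tanimoto coefficient: a / (a + b + c)."""
    a, b, c, _ = _counts(fp_a, fp_b, n_bits)
    denom = a + b + c
    return a / denom if denom else 0.0


def russell_rao(fp_a, fp_b, n_bits):
    """Russell-Rao coefficient: a / n."""
    a, _, _, _ = _counts(fp_a, fp_b, n_bits)
    return a / n_bits


def forbes(fp_a, fp_b, n_bits):
    """Forbes coefficient: n * a / ((a + b) * (a + c))."""
    a, b, c, _ = _counts(fp_a, fp_b, n_bits)
    denom = (a + b) * (a + c)
    return n_bits * a / denom if denom else 0.0


def mean_pairwise_similarity(fingerprints, n_bits, coefficient=tanimoto):
    """Average the coefficient over every unordered pair in one activity class."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        return 0.0
    return sum(coefficient(x, y, n_bits) for x, y in pairs) / len(pairs)


def rank_database(query, database, n_bits, coefficient=tanimoto):
    """Score every database structure against the query, best match first."""
    scored = [(coefficient(query, fp, n_bits), idx)
              for idx, fp in enumerate(database)]
    return sorted(scored, key=lambda item: item[0], reverse=True)


if __name__ == "__main__":
    N_BITS = 8  # hypothetical fingerprint length, chosen for readability
    toy_class = [{0, 1, 2}, {1, 2, 3}, {0, 1, 2}]  # hypothetical activity class
    print("MPS:", mean_pairwise_similarity(toy_class, N_BITS))
    for coeff in (tanimoto, russell_rao, forbes):
        print(coeff.__name__, rank_database({0, 1, 2}, toy_class, N_BITS, coeff))
```

Passing the coefficient as a parameter lets the same ranking routine be reused for all three measures, which mirrors the comparison the study describes. Note that Forbes is not bounded by 1, so its scores are comparable only as rankings, not as absolute values against the other two.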

Original language: English
Pages (from-to): 391-395
Number of pages: 5
Journal: Research Journal of Applied Sciences, Engineering and Technology
Volume: 9
Issue number: 5
Publication status: Published - 2015

Fingerprint

Screening
Lead compounds

Keywords

  • Chemoinformatics
  • Mean pairwise similarity
  • Retrieval
  • Similarity search
  • Virtual screening

ASJC Scopus subject areas

  • Engineering(all)
  • Computer Science(all)

Cite this

Retrieval performance using different type of similarity coefficient for virtual screening. / Arif, Shereena; Khan, Noor Zeemah Shamsheh; Malim, Nurul; Zainudin, Suhaila.

In: Research Journal of Applied Sciences, Engineering and Technology, Vol. 9, No. 5, 2015, p. 391-395.

@article{f9772a11c7bb41eb84df1d6e47a215af,
title = "Retrieval performance using different type of similarity coefficient for virtual screening",
abstract = "Development of a new drug needs chemical databases as references to find lead compounds. This study aims to determine the best similarity coefficient to be used for virtual screening task using chemical databases. We calculated the structural resemblance between each pair of chemical structures in their own activity class to get the Mean Pairwise Similarity (MPS) value to see the nature of heterogeneity for each natural product and synthetic chemical databases. The process involves the 2D descriptor of type ECFC4 fingerprint to represent each structure and Tanimoto coefficient to calculate the similarity score between each pair of chemical structures in the same activity class. MPS for an activity class was obtained by taking the average of all similarity scores within that class. Next, three types of similarity coefficients have been used to calculate the similarity score between a query structure and each of the database structure. The results indicate that Tanimoto coefficient shows better performance compared to Russell Rao and Forbes in retrieval task using chemical database. This implies that Tanimoto coefficient is recommended to carry out virtual screening in drug development. More work should be carried out to determine the best combination of similarity coefficient and fingerprint type to get optimal retrieval performance.",
keywords = "Chemoinformatics, Mean pairwise similarity, Retrieval, Similarity search, Virtual screening",
author = "Shereena Arif and Khan, {Noor Zeemah Shamsheh} and Nurul Malim and Suhaila Zainudin",
year = "2015",
language = "English",
volume = "9",
pages = "391--395",
journal = "Research Journal of Applied Sciences, Engineering and Technology",
issn = "2040-7459",
publisher = "Maxwell Scientific Publications",
number = "5",

}

TY - JOUR

T1 - Retrieval performance using different type of similarity coefficient for virtual screening

AU - Arif, Shereena

AU - Khan, Noor Zeemah Shamsheh

AU - Malim, Nurul

AU - Zainudin, Suhaila

PY - 2015

Y1 - 2015

AB - Development of a new drug needs chemical databases as references to find lead compounds. This study aims to determine the best similarity coefficient to be used for virtual screening task using chemical databases. We calculated the structural resemblance between each pair of chemical structures in their own activity class to get the Mean Pairwise Similarity (MPS) value to see the nature of heterogeneity for each natural product and synthetic chemical databases. The process involves the 2D descriptor of type ECFC4 fingerprint to represent each structure and Tanimoto coefficient to calculate the similarity score between each pair of chemical structures in the same activity class. MPS for an activity class was obtained by taking the average of all similarity scores within that class. Next, three types of similarity coefficients have been used to calculate the similarity score between a query structure and each of the database structure. The results indicate that Tanimoto coefficient shows better performance compared to Russell Rao and Forbes in retrieval task using chemical database. This implies that Tanimoto coefficient is recommended to carry out virtual screening in drug development. More work should be carried out to determine the best combination of similarity coefficient and fingerprint type to get optimal retrieval performance.

KW - Chemoinformatics

KW - Mean pairwise similarity

KW - Retrieval

KW - Similarity search

KW - Virtual screening

UR - http://www.scopus.com/inward/record.url?scp=84926453879&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84926453879&partnerID=8YFLogxK

M3 - Article

VL - 9

SP - 391

EP - 395

JO - Research Journal of Applied Sciences, Engineering and Technology

JF - Research Journal of Applied Sciences, Engineering and Technology

SN - 2040-7459

IS - 5

ER -