Inverse frequency weighting of fragments for similarity-based virtual screening

Shereena M. Arif, John D. Holliday, Peter Willett

    Research output: Contribution to journal > Article

    11 Citations (Scopus)

    Abstract

    This paper discusses the weighting of two-dimensional fingerprints for similarity-based virtual screening, specifically the use of weights that assign greatest importance to the substructural fragments that occur least frequently in the database that is being screened. Virtual screening experiments using the MDL Drug Data Report and World of Molecular Bioactivity databases show that the use of such inverse frequency weighting schemes can result, in some circumstances, in marked increases in screening effectiveness when compared with the use of conventional, unweighted fingerprints. Analysis of the characteristics of the various schemes demonstrates that such weights are best used to weight the fingerprint of the reference structure in a similarity search, with the database structures' fingerprints left unweighted. However, the increases in performance resulting from such weights are only observed with structurally homogeneous sets of active molecules; when the actives are diverse, the best results are obtained using conventional, unweighted fingerprints for both the reference structure and the database structures.
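
    The idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual weighting formulas or similarity coefficient, so everything below is an assumption made for illustration: the weight log(N / n_i) for a fragment set in n_i of N database fingerprints, the continuous form of the Tanimoto coefficient, and the function names `inverse_frequency_weights`, `tanimoto`, and `screen`. Only the reference fingerprint is weighted, which is the arrangement the abstract reports as working best.

```python
import math


def inverse_frequency_weights(database):
    """Return one weight per bit position: log(N / n_i), where n_i is the
    number of database fingerprints with bit i set (assumed scheme).
    Bits that are never set receive weight 0."""
    n_mols = len(database)
    n_bits = len(database[0])
    counts = [sum(fp[i] for fp in database) for i in range(n_bits)]
    return [math.log(n_mols / c) if c else 0.0 for c in counts]


def tanimoto(x, y):
    """Continuous Tanimoto coefficient; equals the usual binary
    Tanimoto coefficient when both vectors contain only 0s and 1s."""
    dot = sum(a * b for a, b in zip(x, y))
    denom = sum(a * a for a in x) + sum(b * b for b in y) - dot
    return dot / denom if denom else 0.0


def screen(reference, database, weights=None):
    """Rank database fingerprints by similarity to the reference.
    If weights are given, they are applied to the reference only;
    the database fingerprints stay unweighted."""
    ref = reference
    if weights is not None:
        ref = [w * b for w, b in zip(weights, reference)]
    scores = [(tanimoto(ref, fp), idx) for idx, fp in enumerate(database)]
    # Highest score first; ties broken by database index.
    return sorted(scores, key=lambda s: (-s[0], s[1]))


# Toy 4-bit fingerprints (hypothetical data, not taken from the paper).
toy_database = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 0],
]
toy_reference = [1, 1, 0, 1]

toy_weights = inverse_frequency_weights(toy_database)
unweighted_ranking = screen(toy_reference, toy_database)
weighted_ranking = screen(toy_reference, toy_database, toy_weights)
```

    In this toy data the first bit is set in every fingerprint, so it receives weight zero and stops contributing to the weighted search, while the rarest bits dominate the score. Whether that helps depends on the data set, as the abstract notes for homogeneous versus diverse sets of actives.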

    Original language: English
    Pages (from-to): 1340-1349
    Number of pages: 10
    Journal: Journal of Chemical Information and Modeling
    Volume: 50
    Issue number: 8
    DOI: 10.1021/ci1001235
    Publication status: Published - 23 Aug 2010

    Fingerprint

    weighting
    Screening
    Bioactivity
    drug
    Molecules
    experiment
    Pharmaceutical Preparations
    performance
    Experiments

    ASJC Scopus subject areas

    • Chemistry(all)
    • Chemical Engineering(all)
    • Computer Science Applications
    • Library and Information Sciences

    Cite this

    Inverse frequency weighting of fragments for similarity-based virtual screening. / Arif, Shereena M.; Holliday, John D.; Willett, Peter.

    In: Journal of Chemical Information and Modeling, Vol. 50, No. 8, 23.08.2010, p. 1340-1349.

    @article{a89839ef4da447c2af86e8e4cb6f2128,
    title = "Inverse frequency weighting of fragments for similarity-based virtual screening",
    author = "Arif, {Shereena M.} and Holliday, {John D.} and Willett, Peter",
    year = "2010",
    month = "8",
    day = "23",
    doi = "10.1021/ci1001235",
    language = "English",
    volume = "50",
    pages = "1340--1349",
    journal = "Journal of Chemical Information and Modeling",
    issn = "1549-9596",
    publisher = "American Chemical Society",
    number = "8",

    }

    TY - JOUR

    T1 - Inverse frequency weighting of fragments for similarity-based virtual screening

    AU - Arif, Shereena M.

    AU - Holliday, John D.

    AU - Willett, Peter

    PY - 2010/8/23

    Y1 - 2010/8/23

    UR - http://www.scopus.com/inward/record.url?scp=77956055372&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=77956055372&partnerID=8YFLogxK

    U2 - 10.1021/ci1001235

    DO - 10.1021/ci1001235

    M3 - Article

    VL - 50

    SP - 1340

    EP - 1349

    JO - Journal of Chemical Information and Modeling

    JF - Journal of Chemical Information and Modeling

    SN - 1549-9596

    IS - 8

    ER -