An adaptive updating topic specific web search system using T-Graph

Ahmed Patel

    Research output: Contribution to journal (Article)

    4 Citations (Scopus)

    Abstract

    Problem statement: The main goal of a topic-specific Web crawler is to collect documents relevant to the topic in which its search engine specializes. Such systems typically use the content of the whole document to predict the importance of an unvisited link. However, current research has shown that the content of the document an unvisited link points to is mainly reflected in the anchor text, which predicts it more accurately than the contents of the whole page. Approach: Between these two extremes, the Treasure Graph (T-Graph) was proposed as a more effective way to guide the Web crawler to topic-specific documents. The crawler identifies the topic boundary around the unvisited link, compares that text with all nodes of the T-Graph to obtain the matching node(s), and calculates the distance to the target documents, expressed as the number of documents that must be downloaded to reach them. Results: Web search systems based on this strategy allowed crawlers and robots to update their experiences more rapidly and intelligently, which can also offer advantages in speed of access and presentation. Conclusion/Recommendations: Updating a robot's experiences after visiting a link, based on the principles and usage of T-Graph, can be deployed in intelligent-knowledge Web crawlers, as shown by the proposed novel Web search system architecture.
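
The approach in the abstract (take the text around an unvisited link, match it against T-Graph nodes, read off a distance to the target documents) can be illustrated with a minimal sketch. This is not the paper's implementation. The `TGraphNode` class, the bag-of-words cosine similarity, the `threshold` value, and the function names are all assumptions introduced here for illustration; the abstract does not specify how nodes are stored or how text is compared.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class TGraphNode:
    """Hypothetical T-Graph node: sample text plus its distance to a target."""
    text: str
    distance: int  # documents that must be downloaded to reach a target document


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def estimate_distance(link_context, nodes, threshold=0.2):
    """Return (distance, similarity) of the best-matching node, or None.

    `link_context` stands in for the text inside the topic boundary
    around the unvisited link (an assumption of this sketch).
    """
    ctx = Counter(tokenize(link_context))
    best = None
    for node in nodes:
        sim = cosine(ctx, Counter(tokenize(node.text)))
        if sim >= threshold and (best is None or sim > best[1]):
            best = (node.distance, sim)
    return best


def rank_links(links, nodes):
    """Order (url, context) pairs: smallest estimated distance first,
    ties broken by higher similarity, unmatched links last."""
    def key(item):
        est = estimate_distance(item[1], nodes)
        if est is None:
            return (True, 0, 0.0)
        return (False, est[0], -est[1])
    return [url for url, _ in sorted(links, key=key)]
```

A crawler built along these lines would call `rank_links` on the links extracted from each downloaded page and visit the front of the resulting list first. How the T-Graph itself is built and updated is outside the scope of this sketch.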

    Original language: English
    Pages (from-to): 450-456
    Number of pages: 7
    Journal: Journal of Computer Science
    Volume: 6
    Issue number: 4
    Publication status: Published - 2010

    Fingerprint

    Robots
    Search engines
    Anchors
    Web crawler

    Keywords

    • DDC
    • T-graph
    • Topic specific search engines
    • Web crawling
    • Web robot

    ASJC Scopus subject areas

    • Software
    • Computer Networks and Communications
    • Artificial Intelligence

    Cite this

    An adaptive updating topic specific web search system using T-Graph. / Patel, Ahmed.

    In: Journal of Computer Science, Vol. 6, No. 4, 2010, p. 450-456.

    @article{bfc72a8bd35b452eac194a3cb4e0a274,
    title = "An adaptive updating topic specific web search system using T-Graph",
    keywords = "DDC, T-graph, Topic specific search engines, Web crawling, Web robot",
    author = "Ahmed Patel",
    year = "2010",
    language = "English",
    volume = "6",
    pages = "450--456",
    journal = "Journal of Computer Science",
    issn = "1549-3636",
    publisher = "Science Publications",
    number = "4",

    }

    TY - JOUR

    T1 - An adaptive updating topic specific web search system using T-Graph

    AU - Patel, Ahmed

    PY - 2010

    Y1 - 2010

    KW - DDC

    KW - T-graph

    KW - Topic specific search engines

    KW - Web crawling

    KW - Web robot

    UR - http://www.scopus.com/inward/record.url?scp=77952514809&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=77952514809&partnerID=8YFLogxK

    M3 - Article

    VL - 6

    SP - 450

    EP - 456

    JO - Journal of Computer Science

    JF - Journal of Computer Science

    SN - 1549-3636

    IS - 4

    ER -