A rule and template based stemming algorithm for Arabic language

Tengku Mohd T Sembok, Belal Mustafa Abu Ata, Zainab Abu Bakar

    Research output: Contribution to journal (Article)

    4 Citations (Scopus)

    Abstract

    Stemming is the conflation of all variations of a word into a single form called the root or stem. Stemming plays a vital role in natural language processing and understanding. As with other languages, an effective stemming algorithm is needed for Arabic words. Arabic has rich and complex morphological word structures and rules. An Arabic stemming algorithm based on morphological rules has been developed, and, to enhance its effectiveness, a dictionary of root words is used to determine the right stems. The Arabic stemming algorithm developed by Al-Omari is studied, and a new algorithm is proposed to improve on its performance. The improvements relate to the order in which the dictionary is looked up and the order in which the morphological rules are applied.
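As a rough illustration of the general idea in the abstract (affix rules, templates, and a root dictionary consulted in a fixed order), the sketch below shows one possible arrangement in Python. It is not the authors' algorithm and not Al-Omari's: the root list, affix lists, templates, and the function names `match_template`, `candidates`, and `stem` are all hypothetical, chosen only to make the example self-contained.

```python
# Toy resources (assumptions for illustration, not taken from the paper).
ROOTS = {"كتب", "درس", "علم"}
PREFIXES = ["ال", "و", "ب", "ي", "م"]
SUFFIXES = ["ون", "ات", "ة", "ها"]
# In a template, the letters ف ع ل mark root slots; any other letter
# must match the word literally.
TEMPLATES = ["فاعل", "مفعول", "فعال", "فعل"]
SLOTS = set("فعل")


def match_template(word, template):
    """Return the letters found in the root slots, or None if no match."""
    if len(word) != len(template):
        return None
    root = []
    for w, t in zip(word, template):
        if t in SLOTS:
            root.append(w)
        elif w != t:
            return None
    return "".join(root)


def candidates(word):
    """Yield the word, then forms with one prefix and/or one suffix removed."""
    bases = [word] + [word[len(p):] for p in PREFIXES if word.startswith(p)]
    for base in bases:
        yield base
        for suffix in SUFFIXES:
            if base.endswith(suffix):
                yield base[: -len(suffix)]


def stem(word):
    """Dictionary lookup first, then affix rules, then templates."""
    if word in ROOTS:
        return word
    for cand in candidates(word):
        if cand in ROOTS:
            return cand
        for template in TEMPLATES:
            root = match_template(cand, template)
            if root in ROOTS:
                return root
    return None  # no root confirmed by the dictionary


if __name__ == "__main__":
    for w in ["كاتب", "مكتوب", "الدارسون"]:
        print(w, "->", stem(w))
```

In this sketch the dictionary acts as the final check on every candidate, so the order of the lookup and of the rule lists determines which root is returned first; changing that order is the kind of adjustment the abstract refers to, although the specific ordering used in the paper is not stated here.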

    Original language: English
    Pages (from-to): 974-981
    Number of pages: 8
    Journal: International Journal of Mathematical Models and Methods in Applied Sciences
    Volume: 5
    Issue number: 5
    Publication status: Published - 2011

    Keywords

    • Indexing
    • Information retrieval
    • Natural language processing
    • Stemming

    ASJC Scopus subject areas

    • Applied Mathematics
    • Computational Mathematics
    • Mathematical Physics
    • Modelling and Simulation

    Cite this

    A rule and template based stemming algorithm for Arabic language. / Sembok, Tengku Mohd T; Ata, Belal Mustafa Abu; Bakar, Zainab Abu.

    In: International Journal of Mathematical Models and Methods in Applied Sciences, Vol. 5, No. 5, 2011, p. 974-981.

    @article{712b849f29b247d6ad46994737422c62,
    title = "A rule and template based stemming algorithm for Arabic language",
    keywords = "Indexing, Information retrieval, Natural language processing, Stemming",
    author = "Sembok, {Tengku Mohd T} and Ata, {Belal Mustafa Abu} and Bakar, {Zainab Abu}",
    year = "2011",
    language = "English",
    volume = "5",
    pages = "974--981",
    journal = "International Journal of Mathematical Models and Methods in Applied Sciences",
    issn = "1998-0140",
    publisher = "North Atlantic University Union NAUN",
    number = "5",

    }

    TY - JOUR

    T1 - A rule and template based stemming algorithm for Arabic language

    AU - Sembok, Tengku Mohd T

    AU - Ata, Belal Mustafa Abu

    AU - Bakar, Zainab Abu

    PY - 2011

    Y1 - 2011

    KW - Indexing

    KW - Information retrieval

    KW - Natural language processing

    KW - Stemming

    UR - http://www.scopus.com/inward/record.url?scp=79960365881&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=79960365881&partnerID=8YFLogxK

    M3 - Article

    VL - 5

    SP - 974

    EP - 981

    JO - International Journal of Mathematical Models and Methods in Applied Sciences

    JF - International Journal of Mathematical Models and Methods in Applied Sciences

    SN - 1998-0140

    IS - 5

    ER -