Arabic word stemming algorithms and retrieval effectiveness

Tengku Mohd T Sembok, Belal Abu Ata

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

Document retrieval in Information Retrieval Systems (IRS) is generally about retrieving relevant documents that match an information need. The better the system understands the contents of documents, the more effective the retrieval outcomes will be. But understanding the contents is a very complex task. Conventional IRS apply algorithms that can only approximate the meaning of document contents through a keyword approach using the vector space model. Keywords may be unstemmed or stemmed. When keywords are stemmed and conflated in the retrieval process, we are a step forward in applying semantic technology in IRS. Word stemming is a process in morphological analysis under natural language processing, performed before syntactic and semantic analysis. We have developed algorithms for Arabic stemming and incorporated them in our experimental system in order to measure retrieval effectiveness. The results show that retrieval effectiveness increases when stemming is used.
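
To illustrate how stemming and conflation can change a vector-space match, here is a minimal Python sketch. It is not the authors' algorithm: the affix lists, the `light_stem` function, and the use of raw term frequencies (rather than any particular weighting scheme) are all assumptions chosen only for illustration.

```python
import math
from collections import Counter

# Hypothetical affix lists for a toy light stemmer (an assumption for this
# sketch; the paper's own stemming rules are not given in the abstract).
PREFIXES = ("وال", "بال", "ال", "و")
SUFFIXES = ("ات", "ون", "ين", "ها", "ة")


def light_stem(word: str) -> str:
    """Strip at most one listed prefix and one listed suffix.

    An affix is removed only if at least three letters remain.
    """
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= 3:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[:-len(suffix)]
            break
    return word


def vectorize(text: str, stem: bool = True) -> Counter:
    """Build a raw term-frequency vector from whitespace-separated tokens."""
    tokens = text.split()
    if stem:
        tokens = [light_stem(token) for token in tokens]
    return Counter(tokens)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


query = "المكتبة"            # "the library"
document = "مكتبات الجامعة"  # "libraries of the university"

unstemmed_score = cosine(vectorize(query, stem=False), vectorize(document, stem=False))
stemmed_score = cosine(vectorize(query), vectorize(document))
print(unstemmed_score, stemmed_score)
```

In this toy case the unstemmed query shares no keyword with the document, while the stemmed forms of "the library" and "libraries" conflate to the same term, so the stemmed similarity is non-zero.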

Original language: English
Title of host publication: Lecture Notes in Engineering and Computer Science
Pages: 1577-1582
Number of pages: 6
Volume: 3 LNECS
Publication status: Published - 2013
Externally published: Yes
Event: 2013 World Congress on Engineering, WCE 2013 - London
Duration: 3 Jul 2013 to 5 Jul 2013

Fingerprint

Information retrieval systems
Semantics
Syntactics
Vector spaces
Processing

Keywords

  • Artificial intelligence
  • Information retrieval
  • Natural language processing

ASJC Scopus subject areas

  • Computer Science (miscellaneous)

Cite this

Sembok, T. M. T., & Ata, B. A. (2013). Arabic word stemming algorithms and retrieval effectiveness. In Lecture Notes in Engineering and Computer Science (Vol. 3 LNECS, pp. 1577-1582)

@inproceedings{ad3e85ef650c44a3b0add715681706b5,
title = "Arabic word stemming algorithms and retrieval effectiveness",
keywords = "Artificial intelligence, Information retrieval, Natural language processing",
author = "Sembok, {Tengku Mohd T} and Ata, {Belal Abu}",
year = "2013",
language = "English",
isbn = "9789881925299",
volume = "3 LNECS",
pages = "1577--1582",
booktitle = "Lecture Notes in Engineering and Computer Science"
}

TY - GEN
T1 - Arabic word stemming algorithms and retrieval effectiveness
AU - Sembok, Tengku Mohd T
AU - Ata, Belal Abu
PY - 2013
Y1 - 2013
KW - Artificial intelligence
KW - Information retrieval
KW - Natural language processing
UR - http://www.scopus.com/inward/record.url?scp=84887864770&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84887864770&partnerID=8YFLogxK
M3 - Conference contribution
SN - 9789881925299
VL - 3 LNECS
SP - 1577
EP - 1582
BT - Lecture Notes in Engineering and Computer Science
ER -