Semantically enhanced pseudo relevance feedback for Arabic information retrieval

Jaffar Atwan, Masnizah Mohd, Hasan Rashaideh, Ghassan Kanaan

Research output: Contribution to journal > Article

8 Citations (Scopus)

Abstract

The conventional information retrieval (IR) framework consists of four primary phases: pre-processing, indexing, querying and retrieving results. Some phases of the current Arabic IR (AIR) framework have several drawbacks. This research aims to enhance AIR by improving the processes of the conventional IR framework. We introduce an enhanced stop-word list at the pre-processing level and investigate several Arabic stemmers. In addition, the Arabic WordNet is used at the corpus and query expansion levels, and semantic information is adopted for pseudo relevance feedback. The enhanced AIR framework was built and evaluated using TREC 2001 data. Using the Arabic WordNet to build a semantic relationship between query and corpus at two levels, the corpus level and the query level, is a new technique. The enhanced AIR framework improved mean average precision by 49% and recall by 7.3% compared with the baseline framework.
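The two ideas the abstract combines, synonym-based query expansion and pseudo relevance feedback, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `SYNONYMS` dictionary is a toy stand-in for an Arabic WordNet lookup, the term-overlap `score` function is a placeholder for a real ranking model, and every function name and parameter (`top_k`, `n_terms`) is hypothetical. English tokens are used only to keep the example readable. The last two functions show how mean average precision, the measure reported above, is usually computed.

```python
from collections import Counter

# Toy stand-in for a WordNet-style synonym lookup (hypothetical data).
SYNONYMS = {"car": ["automobile"], "fast": ["quick"]}


def tokenize(text):
    """Lower-case whitespace tokenizer; a real system would also stem and drop stop words."""
    return text.lower().split()


def score(query_terms, doc_tokens):
    """Term-overlap score: total occurrences of the query terms in the document."""
    counts = Counter(doc_tokens)
    return sum(counts[t] for t in query_terms)


def rank(query_terms, docs):
    """Return document ids by descending score, ties broken by id."""
    scored = [(score(query_terms, tokenize(text)), doc_id) for doc_id, text in docs.items()]
    return [doc_id for _, doc_id in sorted(scored, key=lambda p: (-p[0], p[1]))]


def expand_with_synonyms(terms):
    """Append each term's synonyms to the query, skipping duplicates."""
    expanded = list(terms)
    for term in terms:
        for synonym in SYNONYMS.get(term, []):
            if synonym not in expanded:
                expanded.append(synonym)
    return expanded


def pseudo_relevance_feedback(query, docs, top_k=2, n_terms=2):
    """Expand the query, then add frequent terms from the top_k first-pass documents."""
    base = expand_with_synonyms(tokenize(query))
    assumed_relevant = rank(base, docs)[:top_k]  # no human judgement involved
    counts = Counter()
    for doc_id in assumed_relevant:
        counts.update(t for t in tokenize(docs[doc_id]) if t not in base)
    by_frequency = sorted(counts.items(), key=lambda p: (-p[1], p[0]))
    return base + [term for term, _ in by_frequency[:n_terms]]


def average_precision(ranking, relevant):
    """Mean of precision@k taken at each rank k where a relevant document appears."""
    hits, total = 0, 0.0
    for k, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / k
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """Average of average_precision over (ranking, relevant_set) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


docs = {
    "d1": "fast automobile engine",
    "d2": "quick car engine repair",
    "d3": "slow boat",
}
expanded_query = pseudo_relevance_feedback("fast car", docs)
```

In this toy run the expanded query picks up the synonyms first and then the feedback terms drawn from the two top-ranked documents. Whether feedback terms help or hurt in practice depends on how many of the top-ranked documents are actually relevant, which is why the feedback depth and the number of added terms are usually tuned.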

Original language: English
Pages (from-to): 246-260
Number of pages: 15
Journal: Journal of Information Science
Volume: 42
Issue number: 2
DOI: 10.1177/0165551515594722
Publication status: Published - 1 Apr 2016
Externally published: Yes

Fingerprint

Information retrieval
Feedback
Semantics
Processing
indexing

Keywords

  • Arabic
  • information retrieval
  • pseudo relevance feedback
  • query expansion
  • semantic

ASJC Scopus subject areas

  • Information Systems
  • Library and Information Sciences

Cite this

Semantically enhanced pseudo relevance feedback for Arabic information retrieval. / Atwan, Jaffar; Mohd, Masnizah; Rashaideh, Hasan; Kanaan, Ghassan.

In: Journal of Information Science, Vol. 42, No. 2, 01.04.2016, p. 246-260.

@article{4f080188064740998520ea5b28949075,
title = "Semantically enhanced pseudo relevance feedback for Arabic information retrieval",
abstract = "The conventional information retrieval (IR) framework consists of four primary phases, namely, pre-processing, indexing, querying and retrieving results. Some phases of the current Arabic IR (AIR) framework have several drawbacks. This research aims to enhance an AIR by improving the processes in a conventional IR framework. We introduce an enhanced stop-word list in the pre-processing level and investigate several Arabic stemmers. In addition, an Arabic WordNet was utilized in the corpus and query expansion levels. We also adopted semantic information for the Pseudo Relevance Feedback. The enhanced Arabic IR framework was built and evaluated using TREC 2001 data. The technique of using the Arabic WordNet to build a semantic relationship between query and corpus in two levels, that is, the corpus and query levels, is a new one. The enhanced AIR framework demonstrated an improvement by 49{\%} in terms of mean average precision, with an increase of 7.3{\%} in recall compared with the baseline framework.",
keywords = "Arabic, information retrieval, pseudo relevance feedback, query expansion, semantic",
author = "Jaffar Atwan and Masnizah Mohd and Hasan Rashaideh and Ghassan Kanaan",
year = "2016",
month = "4",
day = "1",
doi = "10.1177/0165551515594722",
language = "English",
volume = "42",
pages = "246--260",
journal = "Journal of Information Science",
issn = "0165-5515",
publisher = "SAGE Publications Ltd",
number = "2",

}

TY - JOUR

T1 - Semantically enhanced pseudo relevance feedback for Arabic information retrieval

AU - Atwan, Jaffar

AU - Mohd, Masnizah

AU - Rashaideh, Hasan

AU - Kanaan, Ghassan

PY - 2016/4/1

Y1 - 2016/4/1

N2 - The conventional information retrieval (IR) framework consists of four primary phases, namely, pre-processing, indexing, querying and retrieving results. Some phases of the current Arabic IR (AIR) framework have several drawbacks. This research aims to enhance an AIR by improving the processes in a conventional IR framework. We introduce an enhanced stop-word list in the pre-processing level and investigate several Arabic stemmers. In addition, an Arabic WordNet was utilized in the corpus and query expansion levels. We also adopted semantic information for the Pseudo Relevance Feedback. The enhanced Arabic IR framework was built and evaluated using TREC 2001 data. The technique of using the Arabic WordNet to build a semantic relationship between query and corpus in two levels, that is, the corpus and query levels, is a new one. The enhanced AIR framework demonstrated an improvement by 49% in terms of mean average precision, with an increase of 7.3% in recall compared with the baseline framework.

KW - Arabic

KW - information retrieval

KW - pseudo relevance feedback

KW - query expansion

KW - semantic

UR - http://www.scopus.com/inward/record.url?scp=84959360602&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84959360602&partnerID=8YFLogxK

U2 - 10.1177/0165551515594722

DO - 10.1177/0165551515594722

M3 - Article

AN - SCOPUS:84959360602

VL - 42

SP - 246

EP - 260

JO - Journal of Information Science

JF - Journal of Information Science

SN - 0165-5515

IS - 2

ER -