Pseudo relevance feedback technique and semantic similarity for corpus-based expansion

Masnizah Mohd, Jaffar Atwan, Kiyoaki Shirai

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Adapting a Query Expansion (QE) approach to Arabic documents may produce worse rankings or irrelevant results. We therefore introduce a technique that utilises the Arabic WordNet at both the corpus and query expansion levels. A Point-wise Mutual Information (PMI) corpus-based measure is used to semantically select synonyms from the WordNet. In addition, Automatic Query Expansion (AQE) and Pseudo Relevance Feedback (PRF) methods were explored to improve the performance of the Arabic information retrieval (AIR) system. The experimental results show that using the Arabic WordNet at the corpus and query levels together with AQE, and adapting PMI in the expansion process, successfully reduced ambiguity because these techniques select the most appropriate synonym. This enhanced knowledge discovery by addressing relevancy. The techniques also improved Mean Average Precision by 49% and increased recall by 7.3%.
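
The abstract does not give the exact PMI formulation or the selection rule, so the following is only a minimal sketch of how a corpus-based PMI score could rank candidate synonyms for a query term. It assumes document-level co-occurrence counts, uses positive PMI (negative or undefined scores clipped to zero), and averages the score over the other query terms; all of these are illustrative choices, not the authors' method. The function names (`build_counts`, `pmi`, `select_synonym`), the toy English corpus, and the candidate list standing in for Arabic WordNet synonyms are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def build_counts(docs):
    """Count, per term and per unordered term pair, the number of documents containing them."""
    term_df = Counter()
    pair_df = Counter()
    for doc in docs:
        terms = set(doc)
        term_df.update(terms)
        pair_df.update(frozenset(p) for p in combinations(sorted(terms), 2))
    return term_df, pair_df, len(docs)


def pmi(x, y, term_df, pair_df, n_docs):
    """PMI(x, y) = log2( P(x, y) / (P(x) * P(y)) ), estimated from document frequencies."""
    joint = pair_df[frozenset((x, y))]
    if joint == 0 or term_df[x] == 0 or term_df[y] == 0:
        return float("-inf")  # the terms never co-occur in this corpus
    return math.log2((joint * n_docs) / (term_df[x] * term_df[y]))


def select_synonym(query_terms, target, candidates, term_df, pair_df, n_docs):
    """Pick the candidate whose average positive PMI with the remaining query terms is highest."""
    context = [t for t in query_terms if t != target]
    if not context or not candidates:
        return None

    def score(candidate):
        # Assumption: clip negative/undefined PMI to zero (positive PMI variant).
        values = [max(0.0, pmi(candidate, t, term_df, pair_df, n_docs)) for t in context]
        return sum(values) / len(context)

    return max(candidates, key=score)


# Toy corpus (hypothetical); each document is a list of already-tokenised terms.
docs = [
    ["bank", "river", "shore", "water"],
    ["bank", "money", "loan", "institution"],
    ["river", "shore", "water", "fish"],
    ["money", "institution", "loan", "credit"],
]
term_df, pair_df, n_docs = build_counts(docs)

# Hypothetical synonym candidates for the ambiguous query term "bank".
best = select_synonym(["river", "bank"], "bank", ["shore", "institution"],
                      term_df, pair_df, n_docs)
print(best)
```

In this toy setting the candidate that co-occurs with the rest of the query is preferred, which is the sense in which a PMI filter can reduce ambiguity before the selected synonym is added to the query or the index.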

Original language: English
Title of host publication: KDIR
Publisher: SciTePress
Pages: 445-450
Number of pages: 6
Volume: 1
ISBN (Print): 9789897581588
Publication status: Published - 2015
Externally published: Yes
Event: 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, IC3K 2015 - Lisbon, Portugal
Duration: 12 Nov 2015 to 14 Nov 2015

Other

Other: 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, IC3K 2015
Country: Portugal
City: Lisbon
Period: 12/11/15 to 14/11/15

Fingerprint

Semantics
Feedback
Information retrieval systems
Information retrieval
Data mining

Keywords

  • Arabic
  • Information retrieval
  • Pseudo relevance feedback
  • Query expansion
  • Semantic

ASJC Scopus subject areas

  • Software

Cite this

Mohd, M., Atwan, J., & Shirai, K. (2015). Pseudo relevance feedback technique and semantic similarity for corpus-based expansion. In KDIR (Vol. 1, pp. 445-450). SciTePress.

@inproceedings{04807187addf4fa98aaa3a531f9e214c,
title = "Pseudo relevance feedback technique and semantic similarity for corpus-based expansion",
abstract = "Adapting a Query Expansion (QE) approach to Arabic documents may produce worse rankings or irrelevant results. We therefore introduce a technique that utilises the Arabic WordNet at both the corpus and query expansion levels. A Point-wise Mutual Information (PMI) corpus-based measure is used to semantically select synonyms from the WordNet. In addition, Automatic Query Expansion (AQE) and Pseudo Relevance Feedback (PRF) methods were explored to improve the performance of the Arabic information retrieval (AIR) system. The experimental results show that using the Arabic WordNet at the corpus and query levels together with AQE, and adapting PMI in the expansion process, successfully reduced ambiguity because these techniques select the most appropriate synonym. This enhanced knowledge discovery by addressing relevancy. The techniques also improved Mean Average Precision by 49{\%} and increased recall by 7.3{\%}.",
keywords = "Arabic, Information retrieval, Pseudo relevance feedback, Query expansion, Semantic",
author = "Masnizah Mohd and Jaffar Atwan and Kiyoaki Shirai",
year = "2015",
language = "English",
isbn = "9789897581588",
volume = "1",
pages = "445--450",
booktitle = "KDIR",
publisher = "SciTePress",

}

TY - GEN

T1 - Pseudo relevance feedback technique and semantic similarity for corpus-based expansion

AU - Mohd, Masnizah

AU - Atwan, Jaffar

AU - Shirai, Kiyoaki

PY - 2015

Y1 - 2015

AB - Adapting a Query Expansion (QE) approach to Arabic documents may produce worse rankings or irrelevant results. We therefore introduce a technique that utilises the Arabic WordNet at both the corpus and query expansion levels. A Point-wise Mutual Information (PMI) corpus-based measure is used to semantically select synonyms from the WordNet. In addition, Automatic Query Expansion (AQE) and Pseudo Relevance Feedback (PRF) methods were explored to improve the performance of the Arabic information retrieval (AIR) system. The experimental results show that using the Arabic WordNet at the corpus and query levels together with AQE, and adapting PMI in the expansion process, successfully reduced ambiguity because these techniques select the most appropriate synonym. This enhanced knowledge discovery by addressing relevancy. The techniques also improved Mean Average Precision by 49% and increased recall by 7.3%.

KW - Arabic

KW - Information retrieval

KW - Pseudo relevance feedback

KW - Query expansion

KW - Semantic

UR - http://www.scopus.com/inward/record.url?scp=84960919212&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84960919212&partnerID=8YFLogxK

M3 - Conference contribution

SN - 9789897581588

VL - 1

SP - 445

EP - 450

BT - KDIR

PB - SciTePress

ER -