Measuring the representativeness of index terms in literary texts

An experiment on the Quran

Hayati Abd Rahman, Shahrul Azman Mohd Noah, Hector Jimenez-Salazar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

A concept hierarchy is a hierarchically organized collection of domain concepts. It is useful in applications such as information retrieval, document browsing and document classification. An important task in constructing a concept hierarchy is identifying suitable terms while keeping the domain vocabulary at an appropriate size. One way of achieving such a size is term reduction. This paper examines how effectively term selection methods reduce the size of the vocabulary. An experiment was conducted on the Quran, which is assumed to be a literary text. The experiment compares the entropy method, the transition point method and a hybrid of the transition point and entropy methods with the Vector Space Model (VSM). Results indicate that the transition point method is more effective than the others at reducing the size of the vocabulary while preserving the important terms that exist in the literary documents.
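The abstract names the term selection methods without giving formulas. As a rough illustration only, the Python sketch below uses the transition point formula commonly given in the literature, TP = (sqrt(8 * I1 + 1) - 1) / 2, where I1 is the number of terms that occur exactly once, together with a plain Shannon entropy of a term's distribution over documents. The function names, the neighborhood parameter `width` and the toy data are assumptions made for this example; the paper's actual procedure and thresholds may differ.

```python
import math
from collections import Counter


def transition_point(freqs):
    """Return TP = (sqrt(8 * I1 + 1) - 1) / 2, where I1 counts terms with frequency 1."""
    i1 = sum(1 for f in freqs.values() if f == 1)
    return (math.sqrt(8 * i1 + 1) - 1) / 2


def select_by_transition_point(tokens, width=0.4):
    """Keep terms whose frequency lies within +/- width * TP of the transition point.

    `width` is a hypothetical neighborhood parameter chosen for illustration.
    """
    freqs = Counter(tokens)
    tp = transition_point(freqs)
    low, high = tp * (1 - width), tp * (1 + width)
    return {term for term, f in freqs.items() if low <= f <= high}


def term_entropy(term, docs):
    """Shannon entropy (base 2) of a term's occurrence distribution over documents."""
    counts = [doc.count(term) for doc in docs]
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


# Toy data (not from the paper): three terms occur once, so I1 = 3.
tokens = "a a a b b c d e".split()
selected = select_by_transition_point(tokens)

docs = [["a", "b"], ["a", "c"]]
entropy_a = term_entropy("a", docs)  # spread evenly over both documents
entropy_b = term_entropy("b", docs)  # concentrated in one document
```

In this sketch, mid-frequency terms near the transition point are kept, while very frequent and very rare terms are dropped. A term spread evenly across documents gets high entropy, and a term concentrated in one document gets entropy 0.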

Original language: English
Title of host publication: Proceedings - International Symposium on Information Technology 2008, ITSim
Volume: 2
DOIs: https://doi.org/10.1109/ITSIM.2008.4631699
Publication status: Published - 2008
Event: International Symposium on Information Technology 2008, ITSim - Kuala Lumpur
Duration: 26 Aug 2008 → 29 Aug 2008

Other

Other: International Symposium on Information Technology 2008, ITSim
City: Kuala Lumpur
Period: 26/8/08 → 29/8/08

Fingerprint

Entropy
Vector spaces
Information retrieval
Experiments

ASJC Scopus subject areas

  • Artificial Intelligence
  • Information Systems
  • Control and Systems Engineering
  • Electrical and Electronic Engineering

Cite this

Rahman, H. A., Mohd Noah, S. A., & Jimenez-Salazar, H. (2008). Measuring the representativeness of index terms in literary texts: An experiment on the Quran. In Proceedings - International Symposium on Information Technology 2008, ITSim (Vol. 2). [4631699] https://doi.org/10.1109/ITSIM.2008.4631699

@inproceedings{d746c8b20ffd4f69a7ae4b4c15a4ff34,
title = "Measuring the representativeness of index terms in literary texts: An experiment on the Quran",
author = "Rahman, {Hayati Abd} and {Mohd Noah}, {Shahrul Azman} and Hector Jimenez-Salazar",
year = "2008",
doi = "10.1109/ITSIM.2008.4631699",
language = "English",
isbn = "9781424423286",
volume = "2",
booktitle = "Proceedings - International Symposium on Information Technology 2008, ITSim",
}

TY - GEN

T1 - Measuring the representativeness of index terms in literary texts

T2 - An experiment on the Quran

AU - Rahman, Hayati Abd

AU - Mohd Noah, Shahrul Azman

AU - Jimenez-Salazar, Hector

PY - 2008

Y1 - 2008

UR - http://www.scopus.com/inward/record.url?scp=57349110054&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=57349110054&partnerID=8YFLogxK

U2 - 10.1109/ITSIM.2008.4631699

DO - 10.1109/ITSIM.2008.4631699

M3 - Conference contribution

SN - 9781424423286

VL - 2

BT - Proceedings - International Symposium on Information Technology 2008, ITSim

ER -