A comparative analysis of the entropy and transition point approach in representing index terms of literary text

Hayati Abd Rahman, Shahrul Azman Mohd Noah

Research output: Contribution to journal › Article

2 Citations (Scopus)

Abstract

Problem statement: A concept hierarchy is a hierarchically organized collection of domain concepts. It is particularly useful in applications such as information retrieval, document browsing and document classification. Approach: One of the important tasks in constructing a concept hierarchy is identifying suitable terms while keeping the domain vocabulary at an appropriate size. Results: One way of achieving such a size is term reduction. The aim of this study is to examine how effectively term selection methods reduce the size of the vocabulary for literary text. The experiment compares the entropy method, the transition point method and a hybrid of the transition point and entropy methods with the Vector Space Model (VSM). Conclusion/Recommendations: Results indicate that the Transition Point method is more effective than the others at reducing the size of the vocabulary while preserving the important terms that exist in the literary documents.

Original language: English
Pages (from-to): 1088-1093
Number of pages: 6
Journal: Journal of Computer Science
Volume: 7
Issue number: 7
DOIs: 10.3844/jcssp.2011.1088.1093
Publication status: Published - 2011

Fingerprint

Entropy
Vector spaces
Information retrieval
Experiments

Keywords

  • Concept hierarchy
  • Dominating set problem (DSP)
  • Information retrieval
  • Term reduction
  • Transition point (TP)
  • Vector space model (VSM)

ASJC Scopus subject areas

  • Software
  • Computer Networks and Communications
  • Artificial Intelligence

Cite this

A comparative analysis of the entropy and transition point approach in representing index terms of literary text. / Rahman, Hayati Abd; Mohd Noah, Shahrul Azman.

In: Journal of Computer Science, Vol. 7, No. 7, 2011, p. 1088-1093.

@article{089ec3b6bab9488ebf13dea79a0f5381,
title = "A comparative analysis of the entropy and transition point approach in representing index terms of literary text",
abstract = "Problem statement: Concept hierarchy is a hierarchically organized collection of domain concepts. It is particularly useful in many applications such as information retrieval, document browsing and document classification. Approach: One of the important tasks in construction of concept hierarchy is identification of suitable terms with appropriate size of domain vocabulary. Results: One way of achieving such a size is by using term reduction. The aim of this study is to examine the effectiveness of reduction approach to reduce size of vocabulary using term selection methods for literary text. The experiment compares entropy method, transition point method and hybrid of transition point and entropy methods with the Vector Space Model (VSM). Conclusion/Recommendations: Results indicate the effectiveness of Transition Point method as compared to the others in reducing size of vocabulary but at same time preserve those important terms that exist in the literary documents.",
keywords = "Concept hierarchy, Dominating set problem (DSP), Information retrieval, Term reduction, Transition point (TP), Vector space model (VSM)",
author = "Rahman, {Hayati Abd} and {Mohd Noah}, {Shahrul Azman}",
year = "2011",
doi = "10.3844/jcssp.2011.1088.1093",
language = "English",
volume = "7",
pages = "1088--1093",
journal = "Journal of Computer Science",
issn = "1549-3636",
publisher = "Science Publications",
number = "7"
}

TY  - JOUR
T1  - A comparative analysis of the entropy and transition point approach in representing index terms of literary text
AU  - Rahman, Hayati Abd
AU  - Mohd Noah, Shahrul Azman
PY  - 2011
Y1  - 2011
N2  - Problem statement: Concept hierarchy is a hierarchically organized collection of domain concepts. It is particularly useful in many applications such as information retrieval, document browsing and document classification. Approach: One of the important tasks in construction of concept hierarchy is identification of suitable terms with appropriate size of domain vocabulary. Results: One way of achieving such a size is by using term reduction. The aim of this study is to examine the effectiveness of reduction approach to reduce size of vocabulary using term selection methods for literary text. The experiment compares entropy method, transition point method and hybrid of transition point and entropy methods with the Vector Space Model (VSM). Conclusion/Recommendations: Results indicate the effectiveness of Transition Point method as compared to the others in reducing size of vocabulary but at same time preserve those important terms that exist in the literary documents.
AB  - Problem statement: Concept hierarchy is a hierarchically organized collection of domain concepts. It is particularly useful in many applications such as information retrieval, document browsing and document classification. Approach: One of the important tasks in construction of concept hierarchy is identification of suitable terms with appropriate size of domain vocabulary. Results: One way of achieving such a size is by using term reduction. The aim of this study is to examine the effectiveness of reduction approach to reduce size of vocabulary using term selection methods for literary text. The experiment compares entropy method, transition point method and hybrid of transition point and entropy methods with the Vector Space Model (VSM). Conclusion/Recommendations: Results indicate the effectiveness of Transition Point method as compared to the others in reducing size of vocabulary but at same time preserve those important terms that exist in the literary documents.
KW  - Concept hierarchy
KW  - Dominating set problem (DSP)
KW  - Information retrieval
KW  - Term reduction
KW  - Transition point (TP)
KW  - Vector space model (VSM)
UR  - http://www.scopus.com/inward/record.url?scp=80053091257&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=80053091257&partnerID=8YFLogxK
U2  - 10.3844/jcssp.2011.1088.1093
DO  - 10.3844/jcssp.2011.1088.1093
M3  - Article
AN  - SCOPUS:80053091257
VL  - 7
SP  - 1088
EP  - 1093
JO  - Journal of Computer Science
JF  - Journal of Computer Science
SN  - 1549-3636
IS  - 7
ER  -