Malay-English cross-language information retrieval: Compound words and proper names handling

Nurjannaton Hidayah Rais, Muhamad Taufik Abdullah, Abdul Kadir Rabiah

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Cross-language information retrieval (CLIR) deals with the use of queries in one language to access documents in another. A popular CLIR approach is to translate the query into the language of the documents being retrieved. The simplest yet most effective method for query translation is the bilingual dictionary approach. However, direct translation using a bilingual dictionary is prone to two problems: the translation of proper names and of compound words. In this study, a series of experiments was conducted to test and evaluate the effectiveness of Malay-English CLIR using proper names and compound words translation. We believe that concept-based indexing and translation make proper names and compound words translation possible. The best retrieval performance was obtained by combining the query translation approach that selects all translations listed in the dictionary, the alternative weighting scheme, and proper names identification and translation.
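
The dictionary-based query translation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy dictionary, the function name `translate_query`, the capitalisation heuristic for proper names, and the two-token limit on compounds are all assumptions made for illustration. The weighting scheme and concept-based indexing are not covered.

```python
from typing import Dict, List

# Toy Malay-English dictionary (illustrative entries only, not the paper's resource).
DICTIONARY: Dict[str, List[str]] = {
    "harga": ["price", "cost"],
    "tiket": ["ticket"],
    "kereta": ["car", "cart"],
    "api": ["fire"],
    "kereta api": ["train"],  # compound word: not the sum of its parts
    "ke": ["to"],
}

MAX_COMPOUND_LEN = 2  # assumption: compounds span at most two tokens


def translate_query(query: str) -> List[str]:
    """Translate a Malay query, keeping every translation the dictionary lists."""
    tokens = query.split()
    translated: List[str] = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        # Hypothetical proper-name heuristic: a capitalised token that the
        # dictionary does not know is copied to the output untranslated.
        if token[0].isupper() and token.lower() not in DICTIONARY:
            translated.append(token)
            i += 1
            continue
        # Try the longest phrase first so that the compound "kereta api"
        # is matched before the single words "kereta" and "api".
        for n in range(MAX_COMPOUND_LEN, 0, -1):
            window = tokens[i:i + n]
            phrase = " ".join(window).lower()
            if len(window) == n and phrase in DICTIONARY:
                translated.extend(DICTIONARY[phrase])  # select all translations
                i += n
                break
        else:
            # No dictionary entry at any length: keep the source token.
            translated.append(token)
            i += 1
    return translated


if __name__ == "__main__":
    print(translate_query("harga tiket kereta api ke Kuala Lumpur"))
```

In this sketch, word-by-word lookup would have rendered "kereta api" as "car", "cart" and "fire"; matching the longer phrase first yields "train" instead, and "Kuala Lumpur" passes through unchanged.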

Original language: English
Title of host publication: e-Technologies and Networks for Development - First International Conference, ICeND 2011, Proceedings
Pages: 309-317
Number of pages: 9
Volume: 171 CCIS
DOIs: https://doi.org/10.1007/978-3-642-22729-5_26
Publication status: Published - 2011
Externally published: Yes
Event: 1st International Conference on e-Technologies and Networks for Development, ICeND 2011 - Dar Es Salaam, Tanzania, United Republic of
Duration: 3 Aug 2011 to 5 Aug 2011

Publication series

Name: Communications in Computer and Information Science
Volume: 171 CCIS
ISSN (Print): 1865-0929

Other

Other: 1st International Conference on e-Technologies and Networks for Development, ICeND 2011
Country: Tanzania, United Republic of
City: Dar Es Salaam
Period: 3/8/11 to 5/8/11

Fingerprint

Query languages
Glossaries
Experiments

Keywords

  • Bilingual dictionary
  • Concept-based IR
  • Cross-language information retrieval
  • Proper names identification and translation
  • Query translation

ASJC Scopus subject areas

  • Computer Science(all)

Cite this

Hidayah Rais, N., Abdullah, M. T., & Rabiah, A. K. (2011). Malay-English cross-language information retrieval: Compound words and proper names handling. In e-Technologies and Networks for Development - First International Conference, ICeND 2011, Proceedings (Vol. 171 CCIS, pp. 309-317). (Communications in Computer and Information Science; Vol. 171 CCIS). https://doi.org/10.1007/978-3-642-22729-5_26

@inproceedings{20610cbe1360401f9f974da03fc52ce9,
title = "Malay-English cross-language information retrieval: Compound words and proper names handling",
abstract = "Cross-language information retrieval (CLIR) deals with the use of queries in one language to access documents in another. A popular CLIR approach is to translate the query into the language of the documents being retrieved. The simplest yet most effective method for query translation is the bilingual dictionary approach. However, direct translation using a bilingual dictionary is prone to two problems: the translation of proper names and of compound words. In this study, a series of experiments was conducted to test and evaluate the effectiveness of Malay-English CLIR using proper names and compound words translation. We believe that concept-based indexing and translation make proper names and compound words translation possible. The best retrieval performance was obtained by combining the query translation approach that selects all translations listed in the dictionary, the alternative weighting scheme, and proper names identification and translation.",
keywords = "Bilingual dictionary, Concept-based IR, Cross-language information retrieval, Proper names identification and translation, Query translation",
author = "{Hidayah Rais}, Nurjannaton and Abdullah, {Muhamad Taufik} and Rabiah, {Abdul Kadir}",
year = "2011",
doi = "10.1007/978-3-642-22729-5_26",
language = "English",
isbn = "9783642227288",
volume = "171 CCIS",
series = "Communications in Computer and Information Science",
pages = "309--317",
booktitle = "e-Technologies and Networks for Development - First International Conference, ICeND 2011, Proceedings",

}

TY - GEN

T1 - Malay-English cross-language information retrieval

T2 - Compound words and proper names handling

AU - Hidayah Rais, Nurjannaton

AU - Abdullah, Muhamad Taufik

AU - Rabiah, Abdul Kadir

PY - 2011

Y1 - 2011

AB - Cross-language information retrieval (CLIR) deals with the use of queries in one language to access documents in another. A popular CLIR approach is to translate the query into the language of the documents being retrieved. The simplest yet most effective method for query translation is the bilingual dictionary approach. However, direct translation using a bilingual dictionary is prone to two problems: the translation of proper names and of compound words. In this study, a series of experiments was conducted to test and evaluate the effectiveness of Malay-English CLIR using proper names and compound words translation. We believe that concept-based indexing and translation make proper names and compound words translation possible. The best retrieval performance was obtained by combining the query translation approach that selects all translations listed in the dictionary, the alternative weighting scheme, and proper names identification and translation.

KW - Bilingual dictionary

KW - Concept-based IR

KW - Cross-language information retrieval

KW - Proper names identification and translation

KW - Query translation

UR - http://www.scopus.com/inward/record.url?scp=80051571574&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=80051571574&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-22729-5_26

DO - 10.1007/978-3-642-22729-5_26

M3 - Conference contribution

AN - SCOPUS:80051571574

SN - 9783642227288

VL - 171 CCIS

T3 - Communications in Computer and Information Science

SP - 309

EP - 317

BT - e-Technologies and Networks for Development - First International Conference, ICeND 2011, Proceedings

ER -