Arabic term extraction using combined approach on Islamic document

Ali Mashaan Abed, Sabrina Tiun, Mohammed Albared

Research output: Contribution to journal › Article

11 Citations (Scopus)

Abstract

While a wide range of methods has been applied to English terminology extraction, relatively few studies have addressed Arabic term extraction from Islamic corpora. In this paper, we present an efficient approach for the automatic extraction of Arabic terminology, covering both single-word terms (SWTs) and multi-word terms (MWTs). The approach relies on two main filtering steps: a linguistic filter, in which a simple part-of-speech (POS) tagger is used to extract candidate MWTs matching given syntactic patterns, and a statistical filter, in which several statistical methods (PMI, Kappa, chi-square, t-test, Piatetsky-Shapiro and rank aggregation) are used to rank the candidate MWTs, while TF.IDF is used to rank the candidate SWTs. The approach extracts bi-gram MWT candidates of Islamic terms from the corpus, and the association measures (for SWTs and MWTs) are evaluated using the n-best evaluation method.
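
As a rough illustration of the statistical filter and the n-best evaluation described in the abstract, the sketch below scores adjacent word pairs with two of the named measures (PMI and the t-test) and computes precision over the top n ranked candidates. It is not the authors' implementation: the function names `bigram_scores` and `n_best_precision`, the toy token list, and the gold set are all hypothetical, the formulas are the common textbook forms of these measures, and the linguistic (POS-pattern) filter is omitted, so every adjacent pair is treated as a candidate.

```python
import math
from collections import Counter


def bigram_scores(tokens):
    """Return {bigram: (pmi, t_score)} for every adjacent token pair.

    Assumption: textbook definitions of PMI and the t-score; the paper's
    exact formulas are not given in the abstract.
    """
    total = len(tokens)
    n_pairs = total - 1  # number of adjacent positions
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    scores = {}
    for (w1, w2), observed in bigrams.items():
        p1 = unigrams[w1] / total
        p2 = unigrams[w2] / total
        p12 = observed / n_pairs
        # PMI: log ratio of the joint probability to the independence estimate.
        pmi = math.log2(p12 / (p1 * p2))
        # t-score: (observed - expected) / sqrt(observed).
        expected = p1 * p2 * n_pairs
        t_score = (observed - expected) / math.sqrt(observed)
        scores[(w1, w2)] = (pmi, t_score)
    return scores


def n_best_precision(ranked, gold, n):
    """Fraction of the top-n ranked candidates that appear in the gold set."""
    return sum(1 for cand in ranked[:n] if cand in gold) / n


if __name__ == "__main__":
    # Hypothetical toy corpus; placeholder tokens stand in for Arabic words.
    tokens = ["a", "b", "a", "b", "c", "d"]
    scores = bigram_scores(tokens)
    by_pmi = sorted(scores, key=lambda bg: scores[bg][0], reverse=True)
    by_t = sorted(scores, key=lambda bg: scores[bg][1], reverse=True)
    gold = {("a", "b")}  # hypothetical reference term list
    print("PMI ranking:", by_pmi)
    print("t-score ranking:", by_t)
    print("precision@1 (t-score):", n_best_precision(by_t, gold, 1))
```

On this toy input the two measures disagree: PMI favours the rare pair, while the t-score favours the more frequent one. Differences of this kind are one reason to compare several measures, or to combine their rankings, before choosing one.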

Original language: English
Pages (from-to): 601-608
Number of pages: 8
Journal: Journal of Theoretical and Applied Information Technology
Volume: 58
Issue number: 3
Publication status: Published - 2013

Fingerprint

Terminology
Term
Rank Aggregation
Association Measure
Filter
Syntactics
Evaluation Method
Linguistics
Statistical method
Statistical methods
Agglomeration
Filtering
Range of data
Corpus
Speech
Syntax

Keywords

  • Association measures
  • MWTs
  • n-best evaluation
  • SWTs
  • Term extraction

ASJC Scopus subject areas

  • Computer Science(all)
  • Theoretical Computer Science

Cite this

Arabic term extraction using combined approach on Islamic document. / Abed, Ali Mashaan; Tiun, Sabrina; Albared, Mohammed.

In: Journal of Theoretical and Applied Information Technology, Vol. 58, No. 3, 2013, p. 601-608.

@article{5901460b4e7848d9a582ea4019defc4a,
title = "Arabic term extraction using combined approach on Islamic document",
abstract = "While a wide range of methods has been applied to English terminology extraction, relatively few studies have addressed Arabic term extraction from Islamic corpora. In this paper, we present an efficient approach for the automatic extraction of Arabic terminology, covering both single-word terms (SWTs) and multi-word terms (MWTs). The approach relies on two main filtering steps: a linguistic filter, in which a simple part-of-speech (POS) tagger is used to extract candidate MWTs matching given syntactic patterns, and a statistical filter, in which several statistical methods (PMI, Kappa, chi-square, t-test, Piatetsky-Shapiro and rank aggregation) are used to rank the candidate MWTs, while TF.IDF is used to rank the candidate SWTs. The approach extracts bi-gram MWT candidates of Islamic terms from the corpus, and the association measures (for SWTs and MWTs) are evaluated using the n-best evaluation method.",
keywords = "Association measures, MWTs, n-best evaluation, SWTs, Term extraction",
author = "Abed, Ali Mashaan and Tiun, Sabrina and Albared, Mohammed",
year = "2013",
url = "http://www.jatit.org/volumes/Vol58No3/15Vol58No3.pdf",
language = "English",
volume = "58",
pages = "601--608",
journal = "Journal of Theoretical and Applied Information Technology",
issn = "1992-8645",
publisher = "Asian Research Publishing Network (ARPN)",
number = "3",

}

TY - JOUR
T1 - Arabic term extraction using combined approach on Islamic document
AU - Abed, Ali Mashaan
AU - Tiun, Sabrina
AU - Albared, Mohammed
PY - 2013
Y1 - 2013
AB - While a wide range of methods has been applied to English terminology extraction, relatively few studies have addressed Arabic term extraction from Islamic corpora. In this paper, we present an efficient approach for the automatic extraction of Arabic terminology, covering both single-word terms (SWTs) and multi-word terms (MWTs). The approach relies on two main filtering steps: a linguistic filter, in which a simple part-of-speech (POS) tagger is used to extract candidate MWTs matching given syntactic patterns, and a statistical filter, in which several statistical methods (PMI, Kappa, chi-square, t-test, Piatetsky-Shapiro and rank aggregation) are used to rank the candidate MWTs, while TF.IDF is used to rank the candidate SWTs. The approach extracts bi-gram MWT candidates of Islamic terms from the corpus, and the association measures (for SWTs and MWTs) are evaluated using the n-best evaluation method.
KW - Association measures
KW - MWTs
KW - n-best evaluation
KW - SWTs
KW - Term extraction
UR - http://www.scopus.com/inward/record.url?scp=84891672667&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84891672667&partnerID=8YFLogxK
UR - http://www.jatit.org/volumes/Vol58No3/15Vol58No3.pdf
M3 - Article
VL - 58
SP - 601
EP - 608
JO - Journal of Theoretical and Applied Information Technology
JF - Journal of Theoretical and Applied Information Technology
SN - 1992-8645
IS - 3
ER -