NER in English translation of Hadith documents using classifiers combination

Mohanad Jasim Jaber, Saidah Saad

Research output: Contribution to journal › Article

2 Citations (Scopus)

Abstract

There is a need to retrieve and extract important information in order to fully understand the ever-increasing volume of English-translated Islamic documents available on the web. Research on Named Entity Recognition (NER) for Islamic translations is limited, even though NER has received widespread attention in other languages. Translated named entities have their own characteristics, and the available annotated English corpora do not cover all the transliterated Arabic names, which makes NER on translations difficult in the Islamic domain. This research addressed the use of NER in English translations of Hadith texts. The objective was to design and develop a model able to extract named entities from English translations of Hadith texts. The research used supervised machine learning approaches, such as Support Vector Machine (SVM), Maximum Entropy (ME) and Naive Bayes (NB) classifiers, which were then combined via a majority voting algorithm to identify named entities in Hadith texts. In the results, the voting combination outperformed the single classifiers, with an overall F-measure of 95.3% in identifying named entities. The results indicated that combined models paired with suitable features were better suited to recognizing named entities in translated Hadith texts than baseline models.
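The majority-voting step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each classifier has already produced one label per token. The function name `majority_vote`, the labels `PER`, `LOC` and `O`, the sample tokens, and the tie-breaking rule (fall back to one chosen classifier) are all assumptions made for illustration; the abstract does not specify the tag set or how ties were resolved.

```python
from collections import Counter


def majority_vote(predictions, fallback_index=0):
    """Combine per-token labels from several classifiers by majority vote.

    predictions: list of label sequences, one per classifier, all the same length.
    fallback_index: classifier whose label is used when no label wins a strict
    majority (an assumed tie-breaking rule, not taken from the paper).
    """
    if not predictions:
        raise ValueError("at least one classifier output is required")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all classifiers must label the same number of tokens")

    combined = []
    for votes in zip(*predictions):
        # Count how many classifiers proposed each label for this token.
        top_label, top_count = Counter(votes).most_common(1)[0]
        if top_count > len(votes) / 2:
            combined.append(top_label)
        else:
            combined.append(votes[fallback_index])
    return combined


# Hypothetical outputs of three classifiers for five tokens.
tokens = ["Abu", "Huraira", "reported", "in", "Medina"]
svm_labels = ["PER", "PER", "O", "O", "LOC"]
me_labels = ["PER", "O", "O", "O", "LOC"]
nb_labels = ["PER", "PER", "O", "O", "O"]

combined_labels = majority_vote([svm_labels, me_labels, nb_labels])
for token, label in zip(tokens, combined_labels):
    print(token, label)
```

With three voters, a label needs at least two votes to win; when all three disagree, the sketch keeps the label from the classifier at `fallback_index`.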

Original language: English
Pages (from-to): 348-354
Number of pages: 7
Journal: Journal of Theoretical and Applied Information Technology
Volume: 84
Issue number: 3
Publication status: Published - 29 Feb 2016

Keywords

  • Hadith text
  • Named entity recognition
  • Supervised machine learning

ASJC Scopus subject areas

  • Computer Science(all)
  • Theoretical Computer Science

Cite this

NER in English translation of Hadith documents using classifiers combination. / Jaber, Mohanad Jasim; Saad, Saidah.

In: Journal of Theoretical and Applied Information Technology, Vol. 84, No. 3, 29.02.2016, p. 348-354.


@article{d6329a7bed1f4e3bbfe1452d47d5a92e,
title = "{NER} in {E}nglish translation of {H}adith documents using classifiers combination",
abstract = "There is a need to retrieve and extract important information in order to fully understand the ever-increasing volume of English-translated Islamic documents available on the web. Research on Named Entity Recognition (NER) for Islamic translations is limited, even though NER has received widespread attention in other languages. Translated named entities have their own characteristics, and the available annotated English corpora do not cover all the transliterated Arabic names, which makes NER on translations difficult in the Islamic domain. This research addressed the use of NER in English translations of Hadith texts. The objective was to design and develop a model able to extract named entities from English translations of Hadith texts. The research used supervised machine learning approaches, such as Support Vector Machine (SVM), Maximum Entropy (ME) and Naive Bayes (NB) classifiers, which were then combined via a majority voting algorithm to identify named entities in Hadith texts. In the results, the voting combination outperformed the single classifiers, with an overall F-measure of 95.3{\%} in identifying named entities. The results indicated that combined models paired with suitable features were better suited to recognizing named entities in translated Hadith texts than baseline models.",
keywords = "Hadith text, Named entity recognition, Supervised machine learning",
author = "Jaber, Mohanad Jasim and Saad, Saidah",
year = "2016",
month = "2",
day = "29",
language = "English",
volume = "84",
pages = "348--354",
journal = "Journal of Theoretical and Applied Information Technology",
issn = "1992-8645",
publisher = "Asian Research Publishing Network (ARPN)",
number = "3",

}

TY - JOUR

T1 - NER in English translation of Hadith documents using classifiers combination

AU - Jaber, Mohanad Jasim

AU - Saad, Saidah

PY - 2016/2/29

Y1 - 2016/2/29

AB - There is a need to retrieve and extract important information in order to fully understand the ever-increasing volume of English-translated Islamic documents available on the web. Research on Named Entity Recognition (NER) for Islamic translations is limited, even though NER has received widespread attention in other languages. Translated named entities have their own characteristics, and the available annotated English corpora do not cover all the transliterated Arabic names, which makes NER on translations difficult in the Islamic domain. This research addressed the use of NER in English translations of Hadith texts. The objective was to design and develop a model able to extract named entities from English translations of Hadith texts. The research used supervised machine learning approaches, such as Support Vector Machine (SVM), Maximum Entropy (ME) and Naive Bayes (NB) classifiers, which were then combined via a majority voting algorithm to identify named entities in Hadith texts. In the results, the voting combination outperformed the single classifiers, with an overall F-measure of 95.3% in identifying named entities. The results indicated that combined models paired with suitable features were better suited to recognizing named entities in translated Hadith texts than baseline models.

KW - Hadith text

KW - Named entity recognition

KW - Supervised machine learning

UR - http://www.scopus.com/inward/record.url?scp=84959308753&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84959308753&partnerID=8YFLogxK

M3 - Article

VL - 84

SP - 348

EP - 354

JO - Journal of Theoretical and Applied Information Technology

JF - Journal of Theoretical and Applied Information Technology

SN - 1992-8645

IS - 3

ER -