A hybrid method of rule-based approach and statistical measures for recognizing narrators name in hadith

Soad Saleh Balgasem, Lailatul Qadri Zakaria

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

2 Citations (Scopus)

Abstract

Hadith is one of the two fundamental resources for Muslims and contains a collection of sayings of the Prophet Mohammed. Two main factors determine the strength of a given hadith: the context (the content of the hadith itself) and the narrators (the persons who narrated it). Identifying narrators' names therefore plays an essential role in validating a specific hadith. In this research, we use Named Entity Recognition, a subfield of Natural Language Processing. A person's name may span multiple words even when it denotes only a first name, such as Abdullah in Arabic. Recognizing Arabic names therefore requires treating them as multi-word expressions. Hence, this research proposes a method to recognize Arabic names in Hadith by combining rule-based and statistical methods. The study consists of six phases: dataset, transformation, pre-processing, Part-Of-Speech tagging, rule-based method and statistical methods. The rule-based method relies on a set of keywords that trigger the start and end positions of a narrator's name candidate. Once a narrator's name candidate is identified, it is submitted to statistical analysis to evaluate how likely the candidate is to be a narrator's name. The statistical measures used are Log-likelihood Ratio (LLR), Point-wise Mutual Information (PMI), S-cost, R-cost and U-cost. The experimental results show an F-measure of 86% for the rule-based method, while LLR outperformed the other statistical measures with a precision of 85%. In conclusion, the hybrid approach of rule-based and statistical methods provides better results than relying on the rule-based method alone for recognizing narrators' names in hadith.
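The two-stage idea in the abstract (keyword-triggered candidates, then statistical scoring) might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the transliterated trigger keywords, the function names and the toy token sequence are all hypothetical, and the real system works on POS-tagged Arabic text. PMI and LLR are given in their standard bigram forms; S-cost, R-cost and U-cost are omitted because the abstract does not define them.

```python
import math

# Hypothetical transliterated trigger keywords (assumed for illustration only;
# the paper's actual keyword list is not given in the abstract).
START_TRIGGERS = {"haddathana", "akhbarana", "an"}
END_TRIGGERS = {"haddathana", "akhbarana", "an", "qala"}


def extract_candidates(tokens):
    """Collect token runs lying between a start trigger and the next end trigger."""
    candidates = []
    current = None  # None means we are not inside a candidate
    for tok in tokens:
        if tok in END_TRIGGERS and current:
            candidates.append(tuple(current))  # close the open candidate
        if tok in START_TRIGGERS:
            current = []  # this trigger also opens a new candidate
        elif tok in END_TRIGGERS:
            current = None  # end-only trigger: stop collecting
        elif current is not None:
            current.append(tok)
    return candidates


def pmi(c_xy, c_x, c_y, n):
    """Point-wise mutual information of a word pair from raw counts (log base 2)."""
    return math.log2(c_xy * n / (c_x * c_y))


def llr(c_xy, c_x, c_y, n):
    """Log-likelihood ratio (G^2) of a word pair from a 2x2 contingency table."""
    observed = [
        [c_xy, c_x - c_xy],
        [c_y - c_xy, n - c_x - c_y + c_xy],
    ]
    rows = [sum(r) for r in observed]
    cols = [observed[0][j] + observed[1][j] for j in range(2)]
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # empty cells contribute nothing
                g2 += o * math.log(o * n / (rows[i] * cols[j]))
    return 2.0 * g2


# Toy transliterated chain of narration (made up for this sketch).
tokens = "haddathana malik an nafi an abdullah bin umar qala innama".split()
candidates = extract_candidates(tokens)
```

In a hybrid setup along these lines, multi-word candidates such as the last one would be scored with `pmi` or `llr` using corpus counts, and only candidates above a chosen threshold would be accepted as narrator names; the threshold and the counting scheme are design choices this sketch leaves open.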

Original language: English
Title of host publication: Proceedings of the 2017 6th International Conference on Electrical Engineering and Informatics
Subtitle of host publication: Sustainable Society Through Digital Innovation, ICEEI 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1-5
Number of pages: 5
Volume: 2017-November
ISBN (Electronic): 9781538604755
DOI: https://doi.org/10.1109/ICEEI.2017.8312417
Publication status: Published - 9 Mar 2018
Event: 6th International Conference on Electrical Engineering and Informatics, ICEEI 2017 - Langkawi, Malaysia
Duration: 25 Nov 2017 to 27 Nov 2017

Other

Other: 6th International Conference on Electrical Engineering and Informatics, ICEEI 2017
Country: Malaysia
City: Langkawi
Period: 25/11/17 to 27/11/17

Fingerprint

Hybrid Method
Names
Statistical methods
Statistical method
Log-likelihood Ratio
Person
Costs
Named Entity Recognition
Costs and Cost Analysis
Processing
Subfield
Tagging
Hybrid Approach
Mutual Information
Trigger
Natural Language Processing
Natural Language
Statistical Analysis
Preprocessing
Islam

Keywords

  • Classical Arabic
  • multi word expression
  • named entity recognition
  • Person name recognition

ASJC Scopus subject areas

  • Artificial Intelligence
  • Control and Optimization
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
  • Information Systems
  • Software
  • Electrical and Electronic Engineering
  • Health Informatics

Cite this

Balgasem, S. S., & Zakaria, L. Q. (2018). A hybrid method of rule-based approach and statistical measures for recognizing narrators name in hadith. In Proceedings of the 2017 6th International Conference on Electrical Engineering and Informatics: Sustainable Society Through Digital Innovation, ICEEI 2017 (Vol. 2017-November, pp. 1-5). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICEEI.2017.8312417

@inproceedings{6e53d3bf53be4760a215d41d583c9faa,
title = "A hybrid method of rule-based approach and statistical measures for recognizing narrators name in hadith",
keywords = "Classical Arabic, multi word expression, named entity recognition, Person name recognition",
author = "Balgasem, {Soad Saleh} and Zakaria, {Lailatul Qadri}",
year = "2018",
month = "3",
day = "9",
doi = "10.1109/ICEEI.2017.8312417",
language = "English",
volume = "2017-November",
pages = "1--5",
booktitle = "Proceedings of the 2017 6th International Conference on Electrical Engineering and Informatics",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - GEN

T1 - A hybrid method of rule-based approach and statistical measures for recognizing narrators name in hadith

AU - Balgasem, Soad Saleh

AU - Zakaria, Lailatul Qadri

PY - 2018/3/9

Y1 - 2018/3/9

KW - Classical Arabic

KW - multi word expression

KW - named entity recognition

KW - Person name recognition

UR - http://www.scopus.com/inward/record.url?scp=85050746929&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85050746929&partnerID=8YFLogxK

U2 - 10.1109/ICEEI.2017.8312417

DO - 10.1109/ICEEI.2017.8312417

M3 - Conference contribution

VL - 2017-November

SP - 1

EP - 5

BT - Proceedings of the 2017 6th International Conference on Electrical Engineering and Informatics

PB - Institute of Electrical and Electronics Engineers Inc.

ER -