Abstract
Hadith is one of the two main fundamental sources for Muslims and contains a collection of sayings of the Prophet Mohammed. Two main factors determine the strength of a given hadith: the context (the content of the hadith itself) and the narrators (the persons who narrated the hadith). Identifying narrators' names therefore plays an essential role in validating a specific hadith. In this research, we use Named Entity Recognition, a subfield of Natural Language Processing. A person's name may consist of multiple words even for the first name alone, such as Abdullah, which is written as two words in Arabic. This means that Arabic names need to be treated as multi-word expressions. Hence, this research presents a method to recognize Arabic names in Hadith using a combination of rule-based and statistical methods. The study consists of six phases: dataset, transformation, pre-processing, Part-Of-Speech tagging, the rule-based method and the statistical methods. The rule-based method relies on a set of keywords that trigger the start and end positions of a narrator's name candidate. After a narrator's name candidate is identified, it is submitted to statistical analysis to evaluate the likelihood that the candidate is a narrator's name. The statistical measures used are Log-likelihood Ratio (LLR), Point-wise Mutual Information (PMI), S-cost, R-cost and U-cost. In the experiments, the rule-based method achieved an F-measure of 86%, while LLR outperformed the other statistical measures with a precision of 85%. In conclusion, the hybrid approach of rule-based and statistical methods provides better results than relying on the rule-based method alone for recognizing narrators' names in hadith.
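
The two stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the transliterated trigger keywords, the function names and the toy token sequence are all assumptions made for illustration, and only PMI is shown (LLR, S-cost, R-cost and U-cost are omitted). The sketch first collects name candidates between trigger keywords, then scores a two-word pair with point-wise mutual information, PMI(x, y) = log2(P(x, y) / (P(x) P(y))).

```python
import math
from collections import Counter

# Hypothetical, transliterated trigger keywords (assumed for illustration;
# the paper's actual Arabic keyword list is not given in the abstract).
START_TRIGGERS = {"haddathana", "akhbarana", "an"}
END_TRIGGERS = {"qala"}


def extract_candidates(tokens):
    """Collect the token runs that follow a start trigger, up to the next trigger."""
    candidates, current, inside = [], [], False
    for tok in tokens:
        if tok in START_TRIGGERS or tok in END_TRIGGERS:
            # Any trigger closes the candidate that is currently open.
            if current:
                candidates.append(tuple(current))
            current = []
            # Only a start trigger opens a new candidate.
            inside = tok in START_TRIGGERS
        elif inside:
            current.append(tok)
    if current:
        candidates.append(tuple(current))
    return candidates


def pmi(pair, tokens):
    """Point-wise mutual information (base 2) of an adjacent word pair in tokens."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_tokens = len(tokens)
    n_bigrams = sum(bigrams.values())
    if bigrams[pair] == 0:
        return float("-inf")  # the pair never occurs adjacently
    p_xy = bigrams[pair] / n_bigrams
    p_x = unigrams[pair[0]] / n_tokens
    p_y = unigrams[pair[1]] / n_tokens
    return math.log2(p_xy / (p_x * p_y))


if __name__ == "__main__":
    # Toy transliterated chain of narration (assumed example, not from the paper).
    text = ("haddathana abdullah ibn yusuf qala akhbarana malik "
            "an nafi an abdullah ibn umar")
    tokens = text.split()
    for cand in extract_candidates(tokens):
        print(" ".join(cand))
    print(round(pmi(("abdullah", "ibn"), tokens), 3))
```

In a hybrid setup along the lines the abstract describes, candidates whose association score falls below some threshold would be rejected; how the threshold is chosen is not stated in the abstract and is left open here.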
Original language | English |
---|---|
Title of host publication | Proceedings of the 2017 6th International Conference on Electrical Engineering and Informatics |
Subtitle of host publication | Sustainable Society Through Digital Innovation, ICEEI 2017 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 1-5 |
Number of pages | 5 |
Volume | 2017-November |
ISBN (Electronic) | 9781538604755 |
DOIs | 10.1109/ICEEI.2017.8312417 |
Publication status | Published - 9 Mar 2018 |
Event | 6th International Conference on Electrical Engineering and Informatics, ICEEI 2017 - Langkawi, Malaysia. Duration: 25 Nov 2017 → 27 Nov 2017 |
Other
Other | 6th International Conference on Electrical Engineering and Informatics, ICEEI 2017 |
---|---|
Country | Malaysia |
City | Langkawi |
Period | 25/11/17 → 27/11/17 |
Keywords
- Classical Arabic
- multi word expression
- named entity recognition
- Person name recognition
ASJC Scopus subject areas
- Artificial Intelligence
- Control and Optimization
- Computer Networks and Communications
- Computer Vision and Pattern Recognition
- Information Systems
- Software
- Electrical and Electronic Engineering
- Health Informatics
Cite this
A hybrid method of rule-based approach and statistical measures for recognizing narrators name in hadith. / Balgasem, Soad Saleh; Zakaria, Lailatul Qadri.
Proceedings of the 2017 6th International Conference on Electrical Engineering and Informatics: Sustainable Society Through Digital Innovation, ICEEI 2017. Vol. 2017-November. Institute of Electrical and Electronics Engineers Inc., 2018. p. 1-5.
Research output: Chapter in Book/Report/Conference proceeding › Conference contribution
TY - GEN
T1 - A hybrid method of rule-based approach and statistical measures for recognizing narrators name in hadith
AU - Balgasem, Soad Saleh
AU - Zakaria, Lailatul Qadri
PY - 2018/3/9
Y1 - 2018/3/9
N2 - Hadith is one of the two main fundamental sources for Muslims and contains a collection of sayings of the Prophet Mohammed. Two main factors determine the strength of a given hadith: the context (the content of the hadith itself) and the narrators (the persons who narrated the hadith). Identifying narrators' names therefore plays an essential role in validating a specific hadith. In this research, we use Named Entity Recognition, a subfield of Natural Language Processing. A person's name may consist of multiple words even for the first name alone, such as Abdullah, which is written as two words in Arabic. This means that Arabic names need to be treated as multi-word expressions. Hence, this research presents a method to recognize Arabic names in Hadith using a combination of rule-based and statistical methods. The study consists of six phases: dataset, transformation, pre-processing, Part-Of-Speech tagging, the rule-based method and the statistical methods. The rule-based method relies on a set of keywords that trigger the start and end positions of a narrator's name candidate. After a narrator's name candidate is identified, it is submitted to statistical analysis to evaluate the likelihood that the candidate is a narrator's name. The statistical measures used are Log-likelihood Ratio (LLR), Point-wise Mutual Information (PMI), S-cost, R-cost and U-cost. In the experiments, the rule-based method achieved an F-measure of 86%, while LLR outperformed the other statistical measures with a precision of 85%. In conclusion, the hybrid approach of rule-based and statistical methods provides better results than relying on the rule-based method alone for recognizing narrators' names in hadith.
KW - Classical Arabic
KW - multi word expression
KW - named entity recognition
KW - Person name recognition
UR - http://www.scopus.com/inward/record.url?scp=85050746929&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85050746929&partnerID=8YFLogxK
U2 - 10.1109/ICEEI.2017.8312417
DO - 10.1109/ICEEI.2017.8312417
M3 - Conference contribution
AN - SCOPUS:85050746929
VL - 2017-November
SP - 1
EP - 5
BT - Proceedings of the 2017 6th International Conference on Electrical Engineering and Informatics
PB - Institute of Electrical and Electronics Engineers Inc.
ER -