Arabic person names recognition by using a rule based approach

Mohammed Aboaoga, Mohd Juzaiddin Ab Aziz

Research output: Contribution to journal › Article

16 Citations (Scopus)

Abstract

Named entity recognition is a very important task in many natural language processing applications, such as machine translation, question answering, information extraction, text summarization, semantic applications and word sense disambiguation. The rule-based approach is one of the techniques used in named entity recognition to identify named entities such as person names, location names and organization names. Recent rule-based methods have been applied to recognize person names in the political domain. They ignored the recognition of other named entity types such as locations and organizations. We have used the rule-based approach to recognize one named entity type (person names) in Arabic. We have developed four rules for identifying person names depending on the position of the name. We have used an in-house Arabic corpus collected from newspaper archives. The evaluation compares the results of the system with manually annotated text in order to compute precision, recall and f-measure. In the experiment of this study, the average f-measure for recognizing person names is 92.66%, 92.04% and 90.43% in the sport, economic and political domains respectively. The experimental results showed that our rule-based method achieved its highest f-measure in the sport domain, compared with the political and economic domains.
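The evaluation described in the abstract, comparing system output with manually annotated text, can be illustrated with a minimal Python sketch. It assumes exact-match scoring over name spans and the balanced F-measure (F1); the abstract does not state either choice, and the function name `precision_recall_f1`, the `Span` type and the sample spans are hypothetical, not taken from the paper.

```python
from typing import Set, Tuple

# Hypothetical representation: (start_token, end_token) of one person name.
Span = Tuple[int, int]


def precision_recall_f1(predicted: Set[Span], gold: Set[Span]) -> Tuple[float, float, float]:
    """Score predicted name spans against gold spans using exact matching.

    Assumption: a prediction counts as correct only if both boundaries
    match a manually annotated span exactly.
    """
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Balanced F-measure: harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up spans for illustration only; these are not data from the study.
gold = {(0, 2), (5, 6), (9, 11), (14, 15)}
predicted = {(0, 2), (5, 6), (9, 10)}

p, r, f = precision_recall_f1(predicted, gold)
print(f"precision={p:.2%} recall={r:.2%} f-measure={f:.2%}")
```

A per-domain average such as those reported above would be obtained by running a scorer like this on each domain's documents separately; how the authors averaged their scores is not specified in the abstract.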

Original language: English
Pages (from-to): 922-927
Number of pages: 6
Journal: Journal of Computer Science
Volume: 9
Issue number: 7
DOI: 10.3844/jcssp.2013.922.927
Publication status: Published - 2013

Fingerprint

Sports
Economics
Semantics
Processing
Experiments

Keywords

  • Arabic morphological analyzer
  • Named entity
  • Named entity recognition
  • Rule-based approach

ASJC Scopus subject areas

  • Software
  • Computer Networks and Communications
  • Artificial Intelligence

Cite this

Arabic person names recognition by using a rule based approach. / Aboaoga, Mohammed; Ab Aziz, Mohd Juzaiddin.

In: Journal of Computer Science, Vol. 9, No. 7, 2013, p. 922-927.

@article{2489b74312f6470ea41503eb92f37a57,
title = "Arabic person names recognition by using a rule based approach",
abstract = "Named entity recognition is a very important task in many natural language processing applications, such as machine translation, question answering, information extraction, text summarization, semantic applications and word sense disambiguation. The rule-based approach is one of the techniques used in named entity recognition to identify named entities such as person names, location names and organization names. Recent rule-based methods have been applied to recognize person names in the political domain. They ignored the recognition of other named entity types such as locations and organizations. We have used the rule-based approach to recognize one named entity type (person names) in Arabic. We have developed four rules for identifying person names depending on the position of the name. We have used an in-house Arabic corpus collected from newspaper archives. The evaluation compares the results of the system with manually annotated text in order to compute precision, recall and f-measure. In the experiment of this study, the average f-measure for recognizing person names is 92.66{\%}, 92.04{\%} and 90.43{\%} in the sport, economic and political domains respectively. The experimental results showed that our rule-based method achieved its highest f-measure in the sport domain, compared with the political and economic domains.",
keywords = "Arabic morphological analyzer, Named entity, Named entity recognition, Rule-based approach",
author = "Mohammed Aboaoga and {Ab Aziz}, {Mohd Juzaiddin}",
year = "2013",
doi = "10.3844/jcssp.2013.922.927",
language = "English",
volume = "9",
pages = "922--927",
journal = "Journal of Computer Science",
issn = "1549-3636",
publisher = "Science Publications",
number = "7",

}

TY - JOUR

T1 - Arabic person names recognition by using a rule based approach

AU - Aboaoga, Mohammed

AU - Ab Aziz, Mohd Juzaiddin

PY - 2013

Y1 - 2013

N2 - Named entity recognition is a very important task in many natural language processing applications, such as machine translation, question answering, information extraction, text summarization, semantic applications and word sense disambiguation. The rule-based approach is one of the techniques used in named entity recognition to identify named entities such as person names, location names and organization names. Recent rule-based methods have been applied to recognize person names in the political domain. They ignored the recognition of other named entity types such as locations and organizations. We have used the rule-based approach to recognize one named entity type (person names) in Arabic. We have developed four rules for identifying person names depending on the position of the name. We have used an in-house Arabic corpus collected from newspaper archives. The evaluation compares the results of the system with manually annotated text in order to compute precision, recall and f-measure. In the experiment of this study, the average f-measure for recognizing person names is 92.66%, 92.04% and 90.43% in the sport, economic and political domains respectively. The experimental results showed that our rule-based method achieved its highest f-measure in the sport domain, compared with the political and economic domains.

KW - Arabic morphological analyzer

KW - Named entity

KW - Named entity recognition

KW - Rule-based approach

UR - http://www.scopus.com/inward/record.url?scp=84880154043&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84880154043&partnerID=8YFLogxK

U2 - 10.3844/jcssp.2013.922.927

DO - 10.3844/jcssp.2013.922.927

M3 - Article

AN - SCOPUS:84880154043

VL - 9

SP - 922

EP - 927

JO - Journal of Computer Science

JF - Journal of Computer Science

SN - 1549-3636

IS - 7

ER -