Efficacy of Arabic named-entity recognition

Suhad Al-Shoukry, Nazlia Omar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Named-entity recognition (NER) is a relatively new research field for the Arabic language, although it has reached a mature stage for other languages. As Arabic has more speech sounds than many other languages, there is some lack of uniformity in Arabic writing styles. Transcription can become ambiguous, and the same word can be written in several different ways. Spelling mistakes can arise from the same phenomenon. Arabic also has both long and short vowels, which can lead to further ambiguity. In the Arabic world, NER research has typically been of limited capacity or coverage. With this in mind, we propose a method for analysing the structure of Arabic named-entity recognition and sentence object recognition by combining prior information and conditional random fields. The proposed method leads to a 2.67% performance improvement per sentence compared with existing methods.
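
The spelling variation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' method (the record does not describe their preprocessing); it shows one widely used normalisation step, assumed here purely for illustration, that removes short-vowel diacritics and maps hamza/madda forms of alef to a bare alef so that variant spellings of a name compare equal. The function name `normalize_arabic` is hypothetical.

```python
import re

# Arabic diacritics (tanwin, short vowels, shadda, sukun): U+064B..U+0652
DIACRITICS = re.compile("[\u064B-\u0652]")
# Tatweel (kashida), a purely typographic elongation character
TATWEEL = "\u0640"
# Alef with hamza above, hamza below, or madda -> bare alef (U+0627)
ALEF_VARIANTS = str.maketrans({
    "\u0623": "\u0627",
    "\u0625": "\u0627",
    "\u0622": "\u0627",
})


def normalize_arabic(text: str) -> str:
    """Collapse common orthographic variants (illustrative assumption only)."""
    text = DIACRITICS.sub("", text)      # drop short-vowel marks
    text = text.replace(TATWEEL, "")     # drop elongation characters
    return text.translate(ALEF_VARIANTS)  # unify alef forms


if __name__ == "__main__":
    with_hamza = "\u0623\u062D\u0645\u062F"     # "Ahmad" written with hamza on alef
    without_hamza = "\u0627\u062D\u0645\u062F"  # same name written with bare alef
    print(normalize_arabic(with_hamza) == normalize_arabic(without_hamza))
```

Normalisation of this kind trades some information (a diacritic can disambiguate a word) for consistency, so whether it helps a given NER system is an empirical question.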

Original language: English
Title of host publication: Proceedings - 5th International Conference on Electrical Engineering and Informatics: Bridging the Knowledge between Academic, Industry, and Community, ICEEI 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 506-510
Number of pages: 5
ISBN (Print): 9781467373197
DOI: https://doi.org/10.1109/ICEEI.2015.7352553
Publication status: Published - 10 Dec 2015
Event: 5th International Conference on Electrical Engineering and Informatics, ICEEI 2015 - Legian-Bali, Indonesia
Duration: 10 Aug 2015 to 11 Aug 2015

Other

Other: 5th International Conference on Electrical Engineering and Informatics, ICEEI 2015
Country: Indonesia
City: Legian-Bali
Period: 10/8/15 to 11/8/15

Fingerprint

Object recognition
Transcription
Acoustic waves

Keywords

  • Arabic language
  • CRF (conditional random fields)
  • named-entity recognition (NER)

ASJC Scopus subject areas

  • Computer Science Applications
  • Information Systems
  • Signal Processing
  • Electrical and Electronic Engineering

Cite this

Al-Shoukry, S., & Omar, N. (2015). Efficacy of Arabic named-entity recognition. In Proceedings - 5th International Conference on Electrical Engineering and Informatics: Bridging the Knowledge between Academic, Industry, and Community, ICEEI 2015 (pp. 506-510). [7352553] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICEEI.2015.7352553

Efficacy of Arabic named-entity recognition. / Al-Shoukry, Suhad; Omar, Nazlia.

Proceedings - 5th International Conference on Electrical Engineering and Informatics: Bridging the Knowledge between Academic, Industry, and Community, ICEEI 2015. Institute of Electrical and Electronics Engineers Inc., 2015. p. 506-510 7352553.

Al-Shoukry, S & Omar, N 2015, Efficacy of Arabic named-entity recognition. in Proceedings - 5th International Conference on Electrical Engineering and Informatics: Bridging the Knowledge between Academic, Industry, and Community, ICEEI 2015., 7352553, Institute of Electrical and Electronics Engineers Inc., pp. 506-510, 5th International Conference on Electrical Engineering and Informatics, ICEEI 2015, Legian-Bali, Indonesia, 10/8/15. https://doi.org/10.1109/ICEEI.2015.7352553
Al-Shoukry S, Omar N. Efficacy of Arabic named-entity recognition. In Proceedings - 5th International Conference on Electrical Engineering and Informatics: Bridging the Knowledge between Academic, Industry, and Community, ICEEI 2015. Institute of Electrical and Electronics Engineers Inc. 2015. p. 506-510. 7352553 https://doi.org/10.1109/ICEEI.2015.7352553
Al-Shoukry, Suhad ; Omar, Nazlia. / Efficacy of Arabic named-entity recognition. Proceedings - 5th International Conference on Electrical Engineering and Informatics: Bridging the Knowledge between Academic, Industry, and Community, ICEEI 2015. Institute of Electrical and Electronics Engineers Inc., 2015. pp. 506-510
@inproceedings{4045b0a709e543abae74e266d73fa6fd,
title = "Efficacy of Arabic named-entity recognition",
abstract = "Named-entity recognition (NER) is a relatively new research field for the Arabic language, although it has reached a mature stage for other languages. As Arabic has more speech sounds than many other languages, there is some lack of uniformity in Arabic writing styles. Transcription can become ambiguous, and the same word can be written in several different ways. Spelling mistakes can arise from the same phenomenon. Arabic also has both long and short vowels, which can lead to further ambiguity. In the Arabic world, NER research has typically been of limited capacity or coverage. With this in mind, we propose a method for analysing the structure of Arabic named-entity recognition and sentence object recognition by combining prior information and conditional random fields. The proposed method leads to a 2.67{\%} performance improvement per sentence compared with existing methods.",
keywords = "Arabic language, CRF (conditional random fields), named-entity recognition (NER)",
author = "Suhad Al-Shoukry and Nazlia Omar",
year = "2015",
month = "12",
day = "10",
doi = "10.1109/ICEEI.2015.7352553",
language = "English",
isbn = "9781467373197",
pages = "506--510",
booktitle = "Proceedings - 5th International Conference on Electrical Engineering and Informatics: Bridging the Knowledge between Academic, Industry, and Community, ICEEI 2015",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - GEN

T1 - Efficacy of Arabic named-entity recognition

AU - Al-Shoukry, Suhad

AU - Omar, Nazlia

PY - 2015/12/10

Y1 - 2015/12/10

AB - Named-entity recognition (NER) is a relatively new research field for the Arabic language, although it has reached a mature stage for other languages. As Arabic has more speech sounds than many other languages, there is some lack of uniformity in Arabic writing styles. Transcription can become ambiguous, and the same word can be written in several different ways. Spelling mistakes can arise from the same phenomenon. Arabic also has both long and short vowels, which can lead to further ambiguity. In the Arabic world, NER research has typically been of limited capacity or coverage. With this in mind, we propose a method for analysing the structure of Arabic named-entity recognition and sentence object recognition by combining prior information and conditional random fields. The proposed method leads to a 2.67% performance improvement per sentence compared with existing methods.

KW - Arabic language

KW - CRF (conditional random fields)

KW - named-entity recognition (NER)

UR - http://www.scopus.com/inward/record.url?scp=84961737546&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84961737546&partnerID=8YFLogxK

U2 - 10.1109/ICEEI.2015.7352553

DO - 10.1109/ICEEI.2015.7352553

M3 - Conference contribution

SN - 9781467373197

SP - 506

EP - 510

BT - Proceedings - 5th International Conference on Electrical Engineering and Informatics: Bridging the Knowledge between Academic, Industry, and Community, ICEEI 2015

PB - Institute of Electrical and Electronics Engineers Inc.

ER -