Segmentation of Arabic characters: A comprehensive survey

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

The cursive nature of Arabic writing is the main challenge facing developers of Arabic Optical Character Recognition systems. Methods have been proposed to segment Arabic words into characters, and this paper provides a comprehensive review of them. The segmentation methods are grouped into nine categories according to the techniques used, and the advantages and drawbacks of each are presented and discussed. Most researchers did not report segmentation accuracy; instead, they reported the overall recognition rate, which does not reflect the influence of each sub-stage on the final result. The training and testing data sets were also too small for the results to be generalized. The field of Arabic Character Recognition (ACR) needs a standard set of test documents in both image and character formats, together with ground truth and a set of performance evaluation tools, so that the performance of different algorithms can be compared. As each method has its own strengths, a hybrid segmentation approach is promising. The paper concludes that there is still no perfect segmentation method for ACR and that much opportunity for research remains in this area.

Original language: English
Title of host publication: Technology Diffusion and Adoption: Global Complexity, Global Innovation
Publisher: IGI Global
Pages: 251-288
Number of pages: 38
ISBN (Print): 9781466627925, 1466627913, 9781466627918
DOIs: https://doi.org/10.4018/978-1-4666-2791-8.ch0161
Publication status: Published - 31 Jan 2013

Fingerprint

Optical character recognition
Character recognition
Testing
segmentation
performance
evaluation

ASJC Scopus subject areas

  • Engineering(all)
  • Computer Science(all)
  • Social Sciences(all)

Cite this

Zeki, A. M., Zakaria, M. S., & Liong, C. Y. (2013). Segmentation of Arabic characters: A comprehensive survey. In Technology Diffusion and Adoption: Global Complexity, Global Innovation (pp. 251-288). IGI Global. https://doi.org/10.4018/978-1-4666-2791-8.ch0161

Segmentation of Arabic characters: A comprehensive survey / Zeki, Ahmed M.; Zakaria, Mohamad Shanudin; Liong, Choong Yeun.

Technology Diffusion and Adoption: Global Complexity, Global Innovation. IGI Global, 2013. p. 251-288.

@inbook{e6ce4e3c2d174197bf0a2e3b19e46053,
title = "Segmentation of Arabic characters: A comprehensive survey",
author = "Zeki, {Ahmed M.} and Zakaria, {Mohamad Shanudin} and Liong, {Choong Yeun}",
year = "2013",
month = "1",
day = "31",
doi = "10.4018/978-1-4666-2791-8.ch0161",
language = "English",
isbn = "9781466627925",
pages = "251--288",
booktitle = "Technology Diffusion and Adoption: Global Complexity, Global Innovation",
publisher = "IGI Global",

}

TY - CHAP

T1 - Segmentation of Arabic characters

T2 - A comprehensive survey

AU - Zeki, Ahmed M.

AU - Zakaria, Mohamad Shanudin

AU - Liong, Choong Yeun

PY - 2013/1/31

Y1 - 2013/1/31

UR - http://www.scopus.com/inward/record.url?scp=84949423690&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84949423690&partnerID=8YFLogxK

U2 - 10.4018/978-1-4666-2791-8.ch0161

DO - 10.4018/978-1-4666-2791-8.ch0161

M3 - Chapter

AN - SCOPUS:84949423690

SN - 9781466627925

SN - 1466627913

SN - 9781466627918

SP - 251

EP - 288

BT - Technology Diffusion and Adoption: Global Complexity, Global Innovation

PB - IGI Global

ER -