A Farsi/Arabic Word Spotting Approach for Printed Document Images

Authors

  • Yaghoub POURASAD
  • Houshang HASSIBI
  • Azam GHORBANI

Keywords:

Farsi document image, word spotting, word searching, word image retrieval

Abstract

Word spotting is the task of finding and locating instances of a query word in a dataset of document images. Many papers address English (Latin)
script and some address Arabic, but none addresses Farsi word spotting; this paper is the first to do so. The proposed approach uses
characteristics of Farsi script together with font-size-independent features (the number of sub-words and their aspect ratios, and the numbers of
holes, dots, ascenders and descenders) and a multi-level matching process to find instances of a query word in document images. The approach was
applied to a dataset of 400 Farsi document images in 4 font faces with font sizes from 8 to 22, and achieved a precision of 88.7% at a recall of
78.5%. Because the features it uses are font-size independent, the approach itself is font-size independent. It is also applicable to Arabic and
Urdu scripts.
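
The abstract names the features and a multi-level matching process but does not specify the levels, thresholds, or how the features are extracted. As a rough illustration only, the sketch below assumes the features have already been extracted for each word image and shows one plausible way to order the comparisons from cheapest to most detailed. The `WordFeatures` class, the `matches` and `spot` functions, the three-level ordering, and the `ratio_tol` tolerance are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class WordFeatures:
    """Hypothetical font-size-independent descriptor of one word image."""
    aspect_ratios: Tuple[float, ...]  # width / height of each sub-word, in reading order
    holes: int
    dots: int
    ascenders: int
    descenders: int

    @property
    def num_subwords(self) -> int:
        return len(self.aspect_ratios)


def matches(query: WordFeatures, candidate: WordFeatures, ratio_tol: float = 0.15) -> bool:
    """Assumed three-level match; rejects as early as possible."""
    # Level 1: the number of sub-words must agree.
    if query.num_subwords != candidate.num_subwords:
        return False
    # Level 2: the counts of holes, dots, ascenders and descenders must agree.
    if (query.holes, query.dots, query.ascenders, query.descenders) != (
        candidate.holes, candidate.dots, candidate.ascenders, candidate.descenders
    ):
        return False
    # Level 3: each sub-word's aspect ratio must agree within a relative tolerance.
    return all(
        abs(q - c) <= ratio_tol * q
        for q, c in zip(query.aspect_ratios, candidate.aspect_ratios)
    )


def spot(query: WordFeatures, candidates: Sequence[WordFeatures]) -> List[int]:
    """Return the indices of candidate words that match the query."""
    return [i for i, c in enumerate(candidates) if matches(query, c)]


# Made-up feature values for illustration.
query = WordFeatures((1.8, 0.9), holes=1, dots=2, ascenders=1, descenders=0)
page = [
    WordFeatures((1.75, 0.95), holes=1, dots=2, ascenders=1, descenders=0),      # similar shape
    WordFeatures((1.8, 0.9, 1.2), holes=1, dots=2, ascenders=1, descenders=0),   # extra sub-word
    WordFeatures((1.8, 0.9), holes=0, dots=3, ascenders=1, descenders=0),        # different counts
]
print(spot(query, page))  # [0]
```

Counts and aspect ratios do not change when a word is rendered at a different point size, which is why a comparison of this kind would not need size normalization; how tolerant the ratio check should be across font faces is left open here.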

Published

2019-06-02

How to Cite

POURASAD, Y., HASSIBI, H., & GHORBANI, A. (2019). A Farsi/Arabic Word Spotting Approach for Printed Document Images. International Journal of Natural and Engineering Sciences, 6(1), 15–18. Retrieved from https://ijnes.org/index.php/ijnes/article/view/84

Section

Articles