Combined cosine-linear regression model similarity with application to handwritten word spotting

Youssef Elfakir, Ghizlane Khaissidi, Mostafa Mrabti, Driss Chenouni, Manal Boualam

Abstract


Similarity and distance measures are widely used to quantify the similarity or dissimilarity between vector sequences. In document image analysis, which deals with image information, both kinds of measure play an important role in matching and pattern recognition. Many similarity measures exist; in this paper we survey the distance measures used in image matching and explain the limitations of the existing distances. We then introduce the concept of a floating distance, which describes how the selected threshold varies for each word in the decision-making process and is based on a combination of linear regression and cosine distance. Experiments are carried out on handwritten Arabic document images from the Gallica library. They show that the proposed floating distance outperforms the traditional distance in a word spotting system.
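As a rough illustration of the two ingredients named in the abstract, the following Python sketch computes a cosine distance between feature vectors, fits an ordinary least-squares line, and applies a threshold that is passed in per query word. The abstract does not state how the paper combines linear regression with the cosine distance, so the sketch keeps them separate; the function names `cosine_distance`, `fit_line` and `spot`, and the idea of supplying the threshold as an argument, are assumptions made here for illustration and are not the authors' formulation.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cos(angle between u and v); 0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def spot(query, candidates, threshold):
    """Indices of candidates within `threshold` cosine distance of `query`.

    `threshold` is supplied per query word; this is a hypothetical
    stand-in for the floating threshold, not the paper's rule.
    """
    return [
        i for i, c in enumerate(candidates)
        if cosine_distance(query, c) <= threshold
    ]
```

A fixed-threshold system would call `spot` with the same `threshold` for every query word, whereas a floating scheme would pass a different value for each word.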

Keywords


Handwritten Arabic documents; Similarity distance; Bag-of-visual-words; Feature extraction; Floating threshold


DOI: http://doi.org/10.11591/ijece.v10i3.pp2367-2374

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578

This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).