Recognition of compound characters in Kannada language

Sridevi Tumkur Narasimhaiah, Lalitha Rangarajan

Abstract


Recognition of degraded printed compound Kannada characters is a challenging research problem. Experiments confirm that noise removal is an essential preprocessing step. Two methods are proposed for recognizing degraded Kannada characters. Method 1 uses the conventional histogram of oriented gradients (HOG) features for character recognition. The extracted features are transformed and reduced using principal component analysis (PCA) and then classified; several classifiers are evaluated. This method classifies simple compound characters satisfactorily (more than 98% accuracy) but does not perform well on the other two compound types. Method 2 uses a deep convolutional neural network (CNN) model for classification, and it outperforms classification based on HOG features. Its highest classification accuracy for simple compound characters is 98.8%, and its performance on the other two compound types is far better than that of Method 1. The deep CNN also turns out to be better for pooled character classes.
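
As an illustration of the feature-extraction step in Method 1, the following is a minimal sketch of a simplified HOG-style descriptor. It is not the authors' implementation: the function name `hog_descriptor`, the 4-pixel cell size, the 9 orientation bins, and the single global L2 normalization (standard HOG normalizes over overlapping blocks) are all assumptions made for the example. The PCA reduction and the classifier are not shown.

```python
import math

def hog_descriptor(image, cell=4, bins=9):
    """Simplified HOG-style descriptor (hypothetical helper, not from the paper).

    image: 2D list of grey-level values. Returns a flat list holding one
    unsigned-orientation histogram per cell, L2-normalized as a whole.
    """
    h, w = len(image), len(image[0])
    cells_y, cells_x = h // cell, w // cell
    hist = [[[0.0] * bins for _ in range(cells_x)] for _ in range(cells_y)]
    for y in range(cells_y * cell):
        for x in range(cells_x * cell):
            # Central differences, clamped at the image border.
            gx = image[y][min(x + 1, w - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, h - 1)][x] - image[max(y - 1, 0)][x]
            mag = math.hypot(gx, gy)
            if mag == 0.0:
                continue
            # Unsigned gradient orientation in [0, 180) degrees.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            b = min(int(angle / (180.0 / bins)), bins - 1)
            # Each pixel votes for its orientation bin, weighted by magnitude.
            hist[y // cell][x // cell][b] += mag
    flat = [v for row in hist for c in row for v in c]
    norm = math.sqrt(sum(v * v for v in flat)) or 1.0
    return [v / norm for v in flat]

# Toy 8x8 image with a single vertical edge (left half dark, right half bright).
edge = [[0] * 4 + [1] * 4 for _ in range(8)]
features = hog_descriptor(edge)
```

In a pipeline of the kind the abstract describes, descriptors like `features` would be computed for every character image, reduced with PCA, and passed to a classifier.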


Keywords


Deep convolutional neural networks classifier; Degraded character recognition; Histogram of oriented gradients; Old Kannada documents; Optical character recognition; Principal component analysis dimensionality reduction


DOI: http://doi.org/10.11591/ijece.v12i6.pp6103-6113

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578