Classification of heterogeneous Malayalam documents based on structural features using deep learning models

Bipin Nair Balakrishnan Jayakumari, Amel Thomas Kavana

Abstract


This work presents a comparative study of the performance of several pretrained deep learning models for classifying Malayalam documents such as agreement documents, notebook images, and palm leaves. The documents are classified based on their visual and structural features. The dataset was collected manually from different sources. The method proceeds through preprocessing, feature extraction, and classification. Three fine-tuned deep learning models are evaluated: visual geometry group-16 (VGG-16), a convolutional neural network (CNN), and AlexNet. The models attained accuracies of 99.7%, 96%, and 95%, respectively. Of the three, the fine-tuned VGG-16 model performed best on the dataset. As future work, methods that classify the documents based on content as well as spectral features can be developed.
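The comparison reported above rests on classification accuracy, that is, the fraction of documents assigned to the correct class. The following is a minimal sketch of how such a comparison could be computed, not the authors' evaluation code. The function names, the model names `model_a` to `model_c`, and the toy labels are all hypothetical and do not reproduce the paper's data or results.

```python
from typing import Dict, List, Tuple


def accuracy(y_true: List[str], y_pred: List[str]) -> float:
    """Fraction of samples whose predicted class matches the true class."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def rank_models(
    y_true: List[str], predictions: Dict[str, List[str]]
) -> List[Tuple[str, float]]:
    """Return (model name, accuracy) pairs sorted from best to worst."""
    scores = [(name, accuracy(y_true, preds)) for name, preds in predictions.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical toy labels for the three document classes named in the abstract.
# These are NOT the paper's data or results.
y_true = ["agreement", "notebook", "palm_leaf", "notebook", "palm_leaf", "agreement"]

toy_predictions = {
    # All six predictions correct.
    "model_a": ["agreement", "notebook", "palm_leaf", "notebook", "palm_leaf", "agreement"],
    # One mistake (last sample).
    "model_b": ["agreement", "notebook", "palm_leaf", "notebook", "palm_leaf", "notebook"],
    # Two mistakes (first and last samples).
    "model_c": ["notebook", "notebook", "palm_leaf", "notebook", "palm_leaf", "palm_leaf"],
}

ranking = rank_models(y_true, toy_predictions)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In practice the prediction lists would come from running each fine-tuned model on a held-out test set. Accuracy alone can hide per-class errors, so a per-class breakdown may also be worth inspecting.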

Keywords


Classification; Deep learning; Documents; AlexNet; Preprocessing


DOI: http://doi.org/10.11591/ijece.v13i1.pp894-901

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578

This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).