The Selection of Useful Visual Words for Class-Imbalanced Data in Image Classification

Sutasinee Chimlek, Part Pramokchon, Punpiti Piamsa-nga

Abstract


The bag of visual words (BOVW) model has recently been used for image classification in large datasets. A major problem of BOVW-based image classification is high dimensionality: most features are usually irrelevant, and multi-view images within the same class produce different BOVW representations. Selecting significant visual words for the multi-view images in each class is therefore essential for reducing the size of the BOVW while retaining high classification performance. Many ranking-based feature scores yield low classification performance when the class distribution is imbalanced and each class contains multiple views. We propose a feature score based on the t-test, a statistical evaluation of the difference between two sample means, to assess the discriminating power of each individual feature. The multi-class image classification performance of the proposed feature score is compared with that of four modern feature scores: Document Frequency (DF), Mutual Information (MI), Pointwise Mutual Information (PMI), and Chi-square statistics (CHI). The results show that the average F1-measure using the proposed feature score is 92% on the Paris dataset and 94% on the SUN397 dataset, while none of the other feature scores exceeds 80%.
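As a rough illustration of how a t-test-based feature score might rank visual words, the sketch below computes a one-vs-rest Welch t-statistic for every (class, visual word) pair and keeps the top-scoring words per class. The abstract does not give the exact formulation, so the Welch variant, the one-vs-rest split, and the names `t_scores` and `select_top_k` are assumptions made for illustration only, not the authors' method.

```python
import numpy as np

def t_scores(X, y, eps=1e-12):
    """One-vs-rest Welch t-statistic for each (class, feature) pair.

    X: (n_images, n_visual_words) BOVW histogram matrix.
    y: (n_images,) class labels; each class needs at least 2 images.
    Assumption: the Welch form is used here; the paper's exact
    statistic is not stated in the abstract.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    scores = np.zeros((classes.size, X.shape[1]))
    for i, c in enumerate(classes):
        pos, neg = X[y == c], X[y != c]
        # Difference between the two sample means for every feature.
        mean_diff = pos.mean(axis=0) - neg.mean(axis=0)
        # Standard error with per-group variances, so a small class
        # is not swamped by a large one.
        se = np.sqrt(pos.var(axis=0, ddof=1) / len(pos)
                     + neg.var(axis=0, ddof=1) / len(neg))
        scores[i] = mean_diff / (se + eps)
    return classes, scores

def select_top_k(scores, k):
    """Union of the k highest-scoring feature indices of every class."""
    top = np.argsort(-scores, axis=1)[:, :k]
    return np.unique(top)

# Toy, imbalanced example: 4 images of class 0, 2 images of class 1.
X = np.array([[5, 0, 1], [6, 1, 1], [5, 1, 2], [6, 0, 2],
              [0, 5, 1], [1, 6, 2]])
y = np.array([0, 0, 0, 0, 1, 1])
classes, scores = t_scores(X, y)
selected = select_top_k(scores, k=1)
```

In this toy data, word 0 is frequent only in class 0 and word 1 only in class 1, while word 2 has the same mean in both classes and so receives a score of zero. Selecting per class rather than globally is one way to keep words that matter to a minority class.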

Keywords


Image Classification; Feature Selection; Bag of Visual Words

DOI: http://doi.org/10.11591/ijece.v6i1.pp307-319

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578

This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).