The pertinent single-attribute-based classifier for small datasets classification
Abstract
Classifying a dataset with machine learning algorithms can be challenging when the dataset is small. The OneR classifier is well suited to such cases because of its simplicity and efficiency. In this paper, we revealed the power of a single attribute by introducing the pertinent single-attribute-based-heterogeneity-ratio classifier (SAB-HR), which used one pertinent attribute to classify small datasets. SAB-HR selected that attribute with a feature selection method based on the Heterogeneity-Ratio (H-Ratio) measure, which identified the most homogeneous attribute in the set. Our empirical results on 12 benchmark datasets from the UCI Machine Learning Repository showed that the SAB-HR classifier significantly outperformed the classical OneR classifier on small datasets. In addition, using the H-Ratio as the criterion for selecting the single attribute was more effective than traditional criteria such as Information Gain (IG) and Gain Ratio (GR).
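As an illustration of the single-attribute idea, the following is a minimal Python sketch of a OneR-style classifier with a pluggable attribute-selection criterion. It is not the authors' implementation: the H-Ratio measure is not defined in this abstract, so the sketch only shows the classical OneR criterion (training error) and Information Gain. All function names and the toy data are assumptions made for this example.

```python
from collections import Counter, defaultdict
from math import log2


def build_rule(rows, labels, attr):
    """Map each observed value of one attribute to its majority class."""
    counts = defaultdict(Counter)
    for row, label in zip(rows, labels):
        counts[row[attr]][label] += 1
    return {value: c.most_common(1)[0][0] for value, c in counts.items()}


def error_rate(rows, labels, attr):
    """Classical OneR criterion: training error of the attribute's rule."""
    rule = build_rule(rows, labels, attr)
    wrong = sum(rule[row[attr]] != label for row, label in zip(rows, labels))
    return wrong / len(labels)


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting on one attribute."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[attr]].append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def fit_single_attribute(rows, labels, score, higher_is_better):
    """Pick the best attribute under `score` and build its value-to-class rule."""
    pick = max if higher_is_better else min
    best = pick(range(len(rows[0])), key=lambda a: score(rows, labels, a))
    default = Counter(labels).most_common(1)[0][0]  # fallback for unseen values
    return best, build_rule(rows, labels, best), default


def predict(model, row):
    """Classify one row using only the selected attribute."""
    attr, rule, default = model
    return rule.get(row[attr], default)


# Hypothetical toy data: (outlook, temperature) -> play
rows = [
    ("sunny", "hot"), ("sunny", "mild"),
    ("rainy", "hot"), ("rainy", "mild"),
    ("overcast", "hot"), ("overcast", "mild"),
]
labels = ["no", "no", "yes", "yes", "yes", "yes"]

oner_model = fit_single_attribute(rows, labels, error_rate, higher_is_better=False)
ig_model = fit_single_attribute(rows, labels, information_gain, higher_is_better=True)
```

On this toy data both criteria select the first attribute, since it separates the classes perfectly. A different measure, such as the H-Ratio described in the paper, could be passed as `score` in the same way once its definition is available.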
Keywords
Classification; Feature selection; OneR classifier; Single-attribute-based classifier; Small dataset
DOI: http://doi.org/10.11591/ijece.v10i3.pp3227-3234
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).