Feature selection based on chi-square and ant colony optimization for multi-label classification

Joan Angelina Widians, Retantyo Wardoyo, Sri Hartati

Abstract


Text classification is widely used in organizations with large databases and digital documents. Text data typically contains many features, most of which are redundant, and this high dimensionality degrades multi-label classification performance. Feature selection is a data processing technique that can overcome this problem. Feature selection techniques follow two major approaches: filter and wrapper. This paper proposes a hybrid filter-wrapper technique that combines two algorithms: chi-square (CS) and ant colony optimization (ACO). In the first stage, CS is used to reduce the number of irrelevant features. In the second stage, ACO is applied to select efficient features and improve classifier performance. The experimental results show that, with a multinomial naïve Bayes classifier, CS-ACO, CS-grey wolf optimizer (GWO), CS alone, and no feature selection (FS) achieve micro F1-scores of 80%, 79.75%, 79.64%, and 77.78%, respectively. The results indicate that the CS-ACO algorithm is suitable for solving multi-label classification problems.
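
The two-stage idea described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`chi2_score`, `chi2_filter`, `aco_select`), the binary term-presence encoding, the max-over-labels aggregation of chi-square scores, the simplified pheromone update (no heuristic term), and the toy data are all assumptions made for illustration. The `fitness` callable is a hypothetical stand-in for a classifier score such as the micro F1-score of a multinomial naïve Bayes model, and is assumed to be non-negative.

```python
import random

def chi2_score(feature_col, label_col):
    """Chi-square statistic of a binary feature vs. a binary label (2x2 table)."""
    n = len(feature_col)
    a = sum(1 for f, y in zip(feature_col, label_col) if f and y)
    b = sum(1 for f, y in zip(feature_col, label_col) if f and not y)
    c = sum(1 for f, y in zip(feature_col, label_col) if not f and y)
    d = n - a - b - c
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom

def chi2_filter(X, Y, k):
    """Stage 1 (filter): keep the k features with the highest chi-square score.

    Assumption: a feature's score is its maximum score over all labels.
    """
    n_features, n_labels = len(X[0]), len(Y[0])
    scores = []
    for j in range(n_features):
        col = [row[j] for row in X]
        scores.append(max(chi2_score(col, [y[l] for y in Y])
                          for l in range(n_labels)))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])

def aco_select(candidates, fitness, subset_size=2, n_ants=10, n_iters=20,
               rho=0.1, seed=0):
    """Stage 2 (wrapper): ants sample subsets in proportion to pheromone."""
    rng = random.Random(seed)
    pheromone = {j: 1.0 for j in candidates}
    best_subset, best_fit = None, float("-inf")
    for _ in range(n_iters):
        for _ in range(n_ants):
            pool, subset = list(candidates), []
            for _ in range(min(subset_size, len(pool))):
                weights = [pheromone[j] for j in pool]
                pick = rng.choices(pool, weights=weights, k=1)[0]
                subset.append(pick)
                pool.remove(pick)
            fit = fitness(subset)
            if fit > best_fit:
                best_subset, best_fit = sorted(subset), fit
        for j in pheromone:            # evaporation
            pheromone[j] *= 1.0 - rho
        for j in best_subset:          # reinforce the best subset found so far
            pheromone[j] += best_fit
    return best_subset, best_fit

# Toy data (hypothetical): 6 documents, 4 binary term features, 2 labels.
# Feature 0 mirrors label 0 and feature 1 mirrors label 1; 2 and 3 are noise.
X = [[1, 0, 1, 0], [1, 0, 0, 1], [1, 1, 1, 1],
     [0, 1, 0, 0], [0, 1, 1, 0], [0, 0, 0, 1]]
Y = [[1, 0], [1, 0], [1, 1], [0, 1], [0, 1], [0, 0]]

def toy_fitness(subset):
    """Stand-in for a classifier metric: rewards the two informative features."""
    return len(set(subset) & {0, 1}) / 2

candidates = chi2_filter(X, Y, k=3)
best_subset, best_fit = aco_select(candidates, toy_fitness)
```

In practice, `toy_fitness` would be replaced by a function that trains the chosen classifier on the selected columns and returns a validation metric; the filter stage keeps that wrapper search small by discarding low-scoring features first.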

Keywords


Ant colony optimization; Chi-square; Feature selection; Machine learning; Multi-label classification



DOI: http://doi.org/10.11591/ijece.v14i3.pp3303-3312

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578

This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).