Breast cancer identification using a hybrid machine learning system

Toni Arifin, Ignatius Wiseto Prasetyo Agung, Erfian Junianto, Dari Dianata Agustin, Ilham Rachmat Wibowo, Rizal Rachman

Abstract


Breast cancer remains one of the most prevalent malignancies among women and is frequently diagnosed at an advanced stage. Early detection is critical to improving patient prognosis and survival rates. Messenger ribonucleic acid (mRNA) gene expression data, which captures the molecular alterations in cancer cells, offers a promising avenue for enhancing diagnostic accuracy. The objective of this study is to develop a machine learning-based model for breast cancer detection using mRNA gene expression profiles. To achieve this, we implemented a hybrid machine learning system (HMLS) that integrates classification algorithms with feature selection and extraction techniques. This approach handles heterogeneous, high-dimensional genomic data, such as mRNA expression datasets, while reducing dimensionality without sacrificing critical information. The classification algorithms applied in this study are support vector machine (SVM), random forest (RF), naïve Bayes (NB), k-nearest neighbors (KNN), extra trees classifier (ETC), and logistic regression (LR). Feature selection was conducted using analysis of variance (ANOVA), mutual information (MI), ETC, and LR, whereas principal component analysis (PCA) was employed for feature extraction. The performance of the proposed model was evaluated using standard metrics, including recall, F1-score, and accuracy. Experimental results demonstrate that the combination of the SVM classifier with MI feature selection outperformed the other configurations and conventional machine learning approaches, achieving a classification accuracy of 99.4%.
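
The general idea of pairing a feature selection step with a classifier can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the data are synthetic stand-ins for expression profiles, the MI estimate uses simple equal-width binning, and KNN is substituted for SVM to keep the example short. All function names (`mutual_information`, `select_top_k`, `knn_predict`, `make_synthetic`) and all parameter values are hypothetical choices made for illustration.

```python
import math
import random
from collections import Counter


def mutual_information(feature, labels, bins=4):
    """Estimate MI (in nats) between one continuous feature and the class
    labels, after equal-width binning of the feature (an assumed estimator)."""
    lo, hi = min(feature), max(feature)
    width = (hi - lo) / bins or 1.0  # guard against a constant feature
    binned = [min(int((v - lo) / width), bins - 1) for v in feature]
    n = len(labels)
    joint = Counter(zip(binned, labels))
    px, py = Counter(binned), Counter(labels)
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def select_top_k(X, y, k):
    """Return the indices of the k features with the highest MI score."""
    scores = [
        mutual_information([row[j] for row in X], y)
        for j in range(len(X[0]))
    ]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training samples."""
    nearest = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = Counter(train_y[i] for i in nearest[:k])
    return votes.most_common(1)[0][0]


def make_synthetic(n_samples=200, n_features=50, n_informative=3, seed=0):
    """Synthetic data: only the first n_informative features depend on the label."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        label = rng.randint(0, 1)
        X.append([
            rng.gauss(3.0 * label if j < n_informative else 0.0, 1.0)
            for j in range(n_features)
        ])
        y.append(label)
    return X, y


X, y = make_synthetic()
X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]

# Select features on the training split only, to avoid information leakage.
selected = select_top_k(X_train, y_train, k=3)
train_sel = [[row[j] for j in selected] for row in X_train]
test_sel = [[row[j] for j in selected] for row in X_test]

predictions = [knn_predict(train_sel, y_train, x) for x in test_sel]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
print("selected features:", sorted(selected), "accuracy:", round(accuracy, 3))
```

Scoring the features on the training split alone keeps the held-out accuracy an honest estimate. A real analysis would normally replace these helpers with a tested library implementation and add cross-validation.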

Keywords


Breast cancer; Feature extraction; Feature selection; Gene expression; Identification


DOI: http://doi.org/10.11591/ijece.v15i4.pp3928-3937

Copyright (c) 2025 Toni Arifin, Ignatius Wiseto Prasetyo Agung, Erfian Junianto, Dari Dianata Agustin, Ilham Rachmat Wibowo, Rizal Rachman

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578

This journal is published by the Institute of Advanced Engineering and Science (IAES).