Feature selection of unbalanced breast cancer data using particle swarm optimization
Abstract
Breast cancer is one of the leading causes of death among women worldwide. High accuracy in cancer prediction models is therefore vital to improving patients' treatment quality and survival rate. In this work, we present a new method, the improved balancing particle swarm optimization (IBPSO) algorithm, to predict the stage of breast cancer using unbalanced Surveillance, Epidemiology, and End Results (USEER) data. The work contributes in two directions. First, we design and implement an improved particle swarm optimization (IPSO) algorithm that avoids local minima while reducing the dimensionality of the USEER data. The improvement comes primarily from employing the cross-over ability of the genetic algorithm as a fitness function, while a correlation-based function guides the selection task toward a minimal feature subset of USEER that is sufficient to describe the universe. Second, we develop an improved synthetic minority over-sampling technique (ISMOTE) that avoids the overfitting problem while efficiently balancing USEER. ISMOTE generates each new object as the average of the two objects with the smallest and largest distance from the centroid object of the minority class. The experiments and analysis show that the proposed IBPSO is feasible and effective, outperforming other state-of-the-art methods in minimizing the number of features while achieving an accuracy of 98.45%.
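The ISMOTE generation rule described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `centroid` and `synthetic_sample` are hypothetical, the centroid is assumed to be the component-wise mean of the minority class, and distance is assumed to be Euclidean. The abstract does not say how the rule is repeated until the classes are balanced, so the sketch produces a single synthetic object only.

```python
import math


def centroid(points):
    """Component-wise mean of equal-length feature vectors (assumed centroid)."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def synthetic_sample(minority):
    """Return one synthetic object: the average of the minority objects
    nearest to and farthest from the minority-class centroid.

    Euclidean distance is an assumption; the abstract does not name a metric.
    """
    c = centroid(minority)
    nearest = min(minority, key=lambda p: math.dist(p, c))
    farthest = max(minority, key=lambda p: math.dist(p, c))
    return [(a + b) / 2 for a, b in zip(nearest, farthest)]


# Toy minority class with two features (made-up values, not USEER data).
minority = [[0.0, 0.0], [2.0, 0.0], [10.0, 0.0]]
new_object = synthetic_sample(minority)
print(new_object)
```

In this toy case the centroid is (4, 0), so the nearest object is (2, 0) and the farthest is (10, 0); the synthetic object is their midpoint. Because the new object is built from two real minority objects rather than from random interpolation between neighbours, it always lies between the core and the edge of the minority class.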
DOI: http://doi.org/10.11591/ijece.v12i5.pp4951-4959
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).