Automatic BIRCH thresholding with features transformation for hierarchical breast cancer clustering

Ahmad Alzu'bi, Maysarah Barham


Breast cancer is one of the most commonly diagnosed diseases in women worldwide. Balanced iterative reducing and clustering using hierarchies (BIRCH) has been widely used in many applications. However, clustering patient records and selecting an optimal threshold for the hierarchical clusters is still a challenging task. In addition, the existing BIRCH algorithm is sensitive to the order of data records and is influenced by many numerical and functional parameters. Therefore, this paper proposes a unique BIRCH-based algorithm for breast cancer clustering. We aim to transform the medical records, described by breast screening features, into sub-clusters that group the subject cases into malignant or benign clusters. The basic BIRCH clustering is first fed a set of normalized features; we then automate the threshold initialization to enhance the tree-based sub-clustering procedure. Additionally, we present a thorough analysis of the performance impact of tuning BIRCH with various relevant linkage functions and similarity measures. Two datasets from the standard Breast Cancer Wisconsin (BCW) benchmarking collection are used to evaluate our algorithm. The experimental results show a clustering accuracy of 97.7% in only 0.0004 seconds, thereby confirming the efficiency of the proposed method in clustering patient records and supporting timely decisions.
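
The pipeline outlined in the abstract (normalize the features, pick a threshold automatically, then absorb records into sub-clusters) can be illustrated with a minimal sketch. It is not the authors' algorithm: it keeps a flat list of clustering features instead of a full CF tree, and the threshold rule (mean nearest-neighbour distance) is an assumed stand-in for the paper's initialization, which the abstract does not specify. All function names and the toy records are hypothetical.

```python
import math


def normalize(rows):
    """Min-max scale every feature column to the range [0, 1]."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]
        for row in rows
    ]


class CF:
    """BIRCH clustering feature: count N, linear sum LS, squared sum SS."""

    def __init__(self, point):
        self.n = 1
        self.ls = list(point)
        self.ss = sum(v * v for v in point)

    def centroid(self):
        return [s / self.n for s in self.ls]

    def radius_if_added(self, point):
        """Radius (RMS distance to the centroid) after absorbing `point`."""
        n = self.n + 1
        ls = [s + v for s, v in zip(self.ls, point)]
        ss = self.ss + sum(v * v for v in point)
        variance = ss / n - sum((s / n) ** 2 for s in ls)
        return math.sqrt(max(variance, 0.0))

    def add(self, point):
        """CF additivity: absorbing a point only updates N, LS and SS."""
        self.n += 1
        self.ls = [s + v for s, v in zip(self.ls, point)]
        self.ss += sum(v * v for v in point)


def auto_threshold(points):
    """ASSUMED heuristic (not from the paper): mean nearest-neighbour distance."""
    total = 0.0
    for i, p in enumerate(points):
        total += min(math.dist(p, q) for j, q in enumerate(points) if j != i)
    return total / len(points)


def flat_birch(points, threshold):
    """Single-level simplification of BIRCH insertion (no CF tree).

    Each point joins the closest sub-cluster if the resulting radius stays
    within `threshold`; otherwise it starts a new sub-cluster.
    """
    subclusters = []
    for p in points:
        best = min(
            subclusters,
            key=lambda c: math.dist(c.centroid(), p),
            default=None,
        )
        if best is not None and best.radius_if_added(p) <= threshold:
            best.add(p)
        else:
            subclusters.append(CF(p))
    return subclusters


# Toy records with two features each (made-up values, not BCW data).
records = [[1.0, 2.0], [1.2, 2.1], [0.9, 1.9],
           [8.0, 9.0], [8.2, 9.1], [7.9, 8.8]]
scaled = normalize(records)
threshold = auto_threshold(scaled)
clusters = flat_birch(scaled, threshold)
print(threshold, [c.n for c in clusters])
```

Because the points are inserted one at a time, the result depends on their order, which is the sensitivity the abstract mentions. A larger threshold yields fewer, coarser sub-clusters, which is why its initial value matters.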


automatic thresholding; balanced iterative reducing and clustering using hierarchies; breast cancer; computer-aided diagnosis; hierarchical clustering

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578