Breast cancer detection using ensemble of convolutional neural networks

ABSTRACT


INTRODUCTION
Breast cancer is the second leading cause of cancer deaths in women, after lung cancer [1]. According to American Cancer Society records, 281,550 new cases of breast cancer are detected every year in women in the United States [2]. Current screening techniques have led to a 2.6% decline in the breast cancer death rate, even though the number of breast cancer patients has increased across the world. Detection of tumors is essential for choosing the best possible line of treatment. Small tumors in the breast might not produce any symptoms [3]. Consequently, early detection through screening plays a crucial role in medical evaluation.
Mammography has long been considered the gold standard for breast cancer screening [4]. Mammography is a low-dose X-ray imaging technique, whose radiation may increase the risk of developing breast cancer [5]. The manual process of detecting malignant masses is time consuming and tedious, and its interpretation varies widely among radiologists.
The convolutional neural network (CNN) is a branch of artificial intelligence that has been gaining popularity over the last decade. CNNs had been proposed as early as 1993. Unfortunately, these ventures remained at the experimental stage due to the unavailability of computational resources [6]. Recently, CNN performance has improved owing to the availability of powerful graphics processing units (GPU). Shen et al. [7] obtained promising results by training on the digital database for screening mammography (DDSM) and transferring the training to the INbreast database. Thulkar et al. [8] implemented rotation augmentation and vertical flipping to improve performance. Wang et al. [9] integrated the deep features extracted from a CNN with shape, density and texture features to generate a fusion deep feature set. The authors applied a support vector machine (SVM) classifier to the fusion deep feature set and obtained an accuracy of 86.5%.
Ragab et al. [10] replaced the last fully connected layer of the CNN with an SVM to enhance the performance. The authors manually cropped rectangular regions of interest (ROI) from the mammogram and achieved an accuracy of 73.6%. Zhao et al. [11] used active learning in mammographic image classification and achieved an accuracy of 90.5%. Bekker et al. [12] implemented a classification technique consisting of two neural networks followed by a single neural layer. The authors obtained an accuracy of 89% on the DDSM dataset. Jadoon et al. [13] proposed a model in which a CNN combined with discrete wavelet and curvelet transforms is trained using a SoftMax layer and an SVM, thereby achieving an accuracy of 83.11%. Agnes et al. [14] developed a classification method that fuses information from multiscale filters in a CNN and obtained an accuracy of 83%. El-Kenawy et al. [15] discussed that ensemble learning may be used to improve the performance. The rest of the paper is organized as follows: Section 2 describes the proposed method. Section 3 presents the results and discussion. Section 4 summarizes the conclusion.

METHOD
In traditional machine learning, the features must be handcrafted, whereas in deep learning, the need for manual feature extraction is eliminated. Usage of deep learning has increased rapidly, as deep learning models can be developed with minimal knowledge of the domain. Deep learning has been widely adopted for the detection and classification of lesions, since it has significant potential to learn highly distinguishing features as compared with traditional machine learning. The issue of large variability in interpretation is controlled using deep learning. Promising results have been achieved by deep CNN models in the detection of breast cancer, as the models are able to extract features that are not seen by humans.

Datasets
Our research was carried out on two publicly available datasets, namely the mammographic image analysis society (MIAS) database and the digital database for screening mammography (DDSM). These datasets were chosen based on the survey by Nahid and Kong [16]. In their survey of the Springer, Elsevier and IEEE websites, they concluded that researchers predominantly used the MIAS and DDSM databases for breast image classification.

MIAS: mini mammographic database
The mammographic image analysis society (MIAS) is a United Kingdom research group, formed with the intention of understanding mammograms and providing an organized database for research. The MIAS database is popular among researchers because its images can be acquired easily. MIAS provides a database with ground truth labeling for researchers [17]. Uniformity is maintained, as each image is available at a size of 1024×1024 pixels.

Digital database for screening mammography
The digital database for screening mammography (DDSM), maintained at the University of South Florida, is the largest online database of mammographic images [17]. The curated breast imaging subset of DDSM (CBIS-DDSM) is a subset of DDSM containing images that are systematically labeled by a trained mammographer. There are around 2,500 medical records. Each record has images of both breasts, together with patient details and the corresponding anomaly.

Merged dataset
To increase the robustness of the models, the images from the MIAS and CBIS-DDSM databases were merged and applied as input. By merging the datasets, we mean that images from the two diverse datasets were combined to form a new dataset. Using images from only one database would have trained the models only for images of that particular type. Providing inputs from multiple sources trains the parameters to capture the intricate nonlinear relation between the inputs and outputs. The databases were merged such that an equal number of images of each class was selected from each database, so as to avoid the problem of class imbalance. Hence, the generalization of the system was increased. A total of 120 unique images were applied as input to the system.
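The balanced merge described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `balanced_merge`, the file identifiers, and the choice of 30 images per class per source (which yields 120 images for two sources and two classes) are assumptions made for the example.

```python
import random
from collections import defaultdict

def balanced_merge(datasets, per_class_per_source, seed=0):
    """Pick the same number of images per class from every source dataset.

    datasets: dict mapping a source name to a list of (image_id, label) pairs.
    Returns a list of (source, image_id, label) tuples.
    """
    rng = random.Random(seed)
    merged = []
    for source, items in datasets.items():
        by_class = defaultdict(list)
        for image_id, label in items:
            by_class[label].append(image_id)
        for label, ids in sorted(by_class.items()):
            if len(ids) < per_class_per_source:
                raise ValueError(f"{source} has too few '{label}' images")
            for image_id in rng.sample(ids, per_class_per_source):
                merged.append((source, image_id, label))
    return merged

# Hypothetical file lists; the identifiers are placeholders, not real file names.
mias = [(f"mias_{i}", "benign" if i % 2 else "malignant") for i in range(80)]
ddsm = [(f"ddsm_{i}", "benign" if i % 3 else "malignant") for i in range(150)]

# Assumed split: 30 images per class from each of the two sources.
merged = balanced_merge({"MIAS": mias, "CBIS-DDSM": ddsm}, per_class_per_source=30)
```

Sampling per class within each source, rather than from the pooled images, is what keeps both the class ratio and the source ratio equal in the merged set.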

Block diagram of proposed system
The schematic diagram of our proposed methodology is depicted in Figure 1. After preprocessing, the raw mammographic images from the merged dataset were applied as inputs to the CNN models. The various CNN models were trained and tested. After evaluation, the most suitable CNN models were selected to form an ensemble. A detailed description of our proposed methodology is given below.

Preprocessing
A median filter was employed to remove the noise present in the raw mammographic images. To further enhance the images, contrast limited adaptive histogram equalization (CLAHE) was used. Images from the datasets were resized as per the requirements of the pretrained networks. In this research, data augmentation was applied to overcome the drawback of the small dataset size, which could otherwise have led to overfitting. Data augmentation was performed by reflection and translation. The original database of 120 images was increased 8 times by data augmentation, leading to a total of 960 images. An image datastore was employed to manage the collection of image files, where each individual image fits in memory but the entire collection may not.
Figure 1. The schematic diagram of the approach of our proposed methodology

Individual CNN classifiers
The standard CNN architecture is made up of a sequence of layers to extract the distinguishing features. The important layers of a CNN are the input layer, the convolutional layer, the pooling layer, the activation layer, and the fully connected layer. In the initial stage, the CNN models were used as feature extractors and individual classifiers. DenseNet, from the multi-path-based group, established maximal data flow within network layers. The CNN models from the width-based, feature-map-based, channel-boosting-based and attention-based groups were avoided as they led to heavy computational load. The CNN architecture from the dimension-based group was avoided as it required high computational time. YOLO achieved enhanced accuracy with the usage of many features. The CNN architectures were chosen so that the proposed model would be computationally light and would require less computational time when implemented using a single GPU.

a. AlexNet
Krizhevsky et al. [19] proposed a CNN architecture, namely AlexNet, containing five convolutional layers and three fully connected layers, which was the winner of the ImageNet large scale visual recognition challenge 2012 (ImageNet ILSVRC challenge 2012). AlexNet has three max pooling layers and two normalization layers. The inputs needed to be resized to 227×227.

b. VGG-19
The Visual Geometry Group (VGG) developed VGG-19, which contains 19 layers. A kernel of size 3×3 and max pooling of 2×2 were proposed throughout the network, leading to a high number of trained parameters. VGGNet has architectural simplicity but a high computational cost [20].

c. DenseNet-201
DenseNet-201 reuses features by connecting the output of every layer to all subsequent layers. The preceding layer's feature maps and the feature maps of all the previous layers are applied as input to the next layer in the DenseNet in order to overcome the vanishing gradient issue [21]. Low- to mid-level features offer the advantage of detecting lesions of varied sizes.

d. Inception-V3
Inception-V3 has 48 layers. In Inception-V3, the commonly used 7×7 convolution is factorized into three 3×3 convolutions, thus reducing the number of network parameters as compared with AlexNet [22]. The input size for Inception-V3 is 299×299. Usage of the Inception block permits the kernel size to be variable, thereby increasing the efficiency.

e. MatConvNet
Building the model using MatConvNet provides greater simplicity and flexibility [23]. Easy-to-use MATLAB functions enable the realization of CNN building blocks. New CNN models can be quickly prototyped using MatConvNet. Training of complex architectures is possible because of the MatConvNet environment.

f. You only look once (YOLO)
YOLO is a one-stage detector used in deep learning, in which object detection is treated as a regression problem by developing a single network to find the class probabilities and bounding box locations simultaneously. This enables the YOLO CNN to arrive at a faster detection. Huang et al. [24] implemented the YOLO architecture using the K-means clustering algorithm to determine the size of the bounding box. In our research, imageLabeler was launched for labeling the ground truth interactively. The labeling was done for rectangular regions of interest (ROI) for tumors.
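The parameter saving from factorizing a 7×7 convolution into three 3×3 convolutions, mentioned for Inception-V3 above, can be checked with a short calculation. The sketch below assumes stride-1 convolutions, ignores biases, and uses a hypothetical channel count of 64; none of these values come from the source.

```python
def conv_weights(kernel, in_channels, out_channels):
    """Weights in one kernel x kernel convolution layer (biases ignored)."""
    return kernel * kernel * in_channels * out_channels

def stacked_receptive_field(kernels):
    """Receptive field of stride-1 convolutions applied one after another."""
    field = 1
    for k in kernels:
        field += k - 1
    return field

channels = 64  # hypothetical channel count, chosen only for illustration

single = conv_weights(7, channels, channels)       # one 7x7 layer
stacked = 3 * conv_weights(3, channels, channels)  # three 3x3 layers
field = stacked_receptive_field([3, 3, 3])         # same 7x7 field of view

print(single, stacked, field, round(stacked / single, 3))
```

Per input and output channel pair, the single layer needs 49 weights and the stacked layers need 27, so the stack covers the same 7×7 receptive field with roughly 55% of the weights.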

Hyper parameters
Transfer learning is implemented in our research. In transfer learning, the pretrained initial layers are frozen and their weights are not changed. Only the additional layers are trained on the dataset. One of the most significant tasks in creating a robust model is determining the ideal values of the hyperparameters. The optimal values of the hyperparameters chosen and used in our experimentation are listed in Table 1.

Ensemble learning
Ensemble classifiers combine the diversity of the individual learners in an optimum way [25]. To improve our results, ensemble learning with three classifiers was employed. VGG-19 was excluded as it required a very high training time. Accuracy, sensitivity, and specificity were chosen as the criteria for selecting the classifiers. The selection of individual CNN models was based on Hungarian optimization. The weighted majority algorithm (WMA) is used in our proposed ensemble approach to build a classifier from a pool of CNN models. The algorithm is based on the assumption that one or more of the models in the pool will perform well.
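The two ensemble steps described above, model selection by optimal assignment and combination by weighted majority, can be sketched as below. Several points are assumptions and not taken from the source: the score matrix is taken to have candidate models as rows and the three selection metrics as columns; the scores and model order are hypothetical; the optimum is found by exhaustive search rather than by the Hungarian algorithm itself (for matrices this small both give the same optimum); and the penalty factor `beta = 0.5` is a common textbook choice.

```python
from itertools import permutations

def best_assignment(scores):
    """Assign one distinct model (row) to each metric (column), maximising the total.

    Exhaustive search; for small matrices it returns the same optimum that the
    Hungarian algorithm would find.
    """
    n_models, n_metrics = len(scores), len(scores[0])
    best_total, best_rows = float("-inf"), None
    for rows in permutations(range(n_models), n_metrics):
        total = sum(scores[r][c] for c, r in enumerate(rows))
        if total > best_total:
            best_total, best_rows = total, rows
    return best_rows, best_total

def weighted_majority(predictions, truths, beta=0.5):
    """Weighted majority algorithm over binary predictions.

    predictions: one list of 0/1 labels per model, all of the same length.
    Each model starts at weight 1; a wrong model is multiplied by beta.
    Returns the ensemble decisions and the final weights.
    """
    weights = [1.0] * len(predictions)
    decisions = []
    for i, truth in enumerate(truths):
        votes = [0.0, 0.0]
        for w, model_preds in zip(weights, predictions):
            votes[model_preds[i]] += w
        decisions.append(1 if votes[1] >= votes[0] else 0)
        weights = [
            w * beta if model_preds[i] != truth else w
            for w, model_preds in zip(weights, predictions)
        ]
    return decisions, weights

# Hypothetical scores: rows are four candidate models, columns are
# accuracy, sensitivity and specificity. These are not the paper's results.
scores = [
    [0.90, 0.95, 0.85],
    [0.94, 0.88, 0.99],
    [0.93, 1.00, 0.87],
    [0.80, 0.82, 0.78],
]
selected_rows, total = best_assignment(scores)

# Hypothetical 0/1 predictions of three selected models on four labeled images.
predictions = [[1, 0, 1, 1], [1, 1, 0, 1], [0, 1, 1, 0]]
decisions, weights = weighted_majority(predictions, truths=[1, 1, 1, 1])
```

In this toy run every model errs at least once, yet the weighted vote is correct on all four images, which illustrates why the algorithm only needs some models in the pool to perform well.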

RESULTS AND DISCUSSION
Experimentation was carried out by applying the inputs from the merged database to various CNN models. The experimentation was repeated several times and the results obtained were averaged. The performance is evaluated using performance metrics, so that a comparison can be made. In this section, the performance metrics achieved by our research are presented, followed by a comparison with other existing systems.

Performance metrics
The performance metrics used were accuracy, sensitivity, specificity and F1-score. Figure 2 shows the performance analysis results of the various architectures on the merged dataset, formed using the MiniMIAS dataset and the CBIS-DDSM dataset. Accuracy is the fraction of correctly classified samples out of the total samples. Sensitivity is the ratio of the correctly classified malignant samples to the total malignant samples. Specificity is the ratio of the correctly classified benign samples to the total benign samples. F1-score is a better metric than accuracy when the classes are imbalanced. The training time required for the individual CNN models is shown in Table 2. From Figure 2, it is observed that the MatConvNet and YOLO architectures achieved the highest accuracies of 94.2% and 93.3%, respectively. YOLO attained an F1-score of 94%. The sensitivity and specificity obtained by the YOLO architecture are 100% and 87%, respectively. Hence it is observed that the YOLO architecture has outperformed all other CNN models.
The optimal assignment for Hungarian optimization is given in Table 3. The optimal value equals 2.912. Even though it was simple and quick, the Hungarian optimization produced the best assignment. The implementation of the Hungarian optimization led to the optimum selection of CNN models for the formation of the ensemble. Thus, YOLO, MatConvNet, and DenseNet-201 were selected for our proposed ensemble architecture. The testing time required for the proposed system is 1 min 4 sec.
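The metric definitions above can be written out directly from confusion-matrix counts, with malignant as the positive class. The counts in the example are hypothetical: they assume a balanced test set of 100 malignant and 100 benign images and are chosen to mirror the 100% sensitivity and 87% specificity reported for YOLO. They are not the paper's actual test counts.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, specificity and F1 with malignant as the positive class."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)   # correctly classified malignant / all malignant
    specificity = tn / (tn + fp)   # correctly classified benign / all benign
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }

# Hypothetical counts for a balanced test set (assumption, see the text above).
metrics = classification_metrics(tp=100, fn=0, tn=87, fp=13)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under this balanced-set assumption the sketch gives an accuracy of 0.935 and an F1-score of about 0.939, which is close to the 93.3% and 94% quoted for YOLO.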
Table 4 compares the results of our proposed system with those of existing systems. The systems which worked with online datasets such as DDSM, MIAS and INbreast were taken into consideration, along with a few which used private data. It is observed that our proposed system has achieved improved performance over the existing mammography classification systems.

CONCLUSION
This paper proposes a new technique for performing classification on merged mammographic datasets. The experimentation was carried out using two publicly available datasets, namely the MIAS database and DDSM. The best performing model, when employed alone, was MatConvNet, which achieved an accuracy of 0.942. To improve the classification performance, an ensemble of models has been proposed. The main contribution of our work is the use of the Hungarian optimization algorithm to select the individual CNN models for the ensemble. The ensemble of CNN models has resulted in a remarkable increase in performance over existing systems. Combining the YOLO, MatConvNet and DenseNet-201 architectures into an ensemble achieved an accuracy of 0.957.


ISSN: 2088-8708, Int J Elec & Comp Eng, Vol. 14, No. 1, February 2024: 1041-1047

The selection of CNN models was based on the survey by Bhatt et al. [18]. The authors categorized the CNN architectures into groups such as spatial exploitation models, depth-based models, multi-path-based, width-based, feature-map-based, channel-boosting-based, attention-based and dimension-based CNN models. The CNN architectures used in our experimentation were AlexNet, VGG-19, DenseNet-201, Inception-V3, MatConvNet, and YOLO, which belonged to various groups and had unique advantages. AlexNet, from the spatial exploitation models group, introduced regularization in the CNN architecture. VGG, from the spatial exploitation models group, made use of a homogeneous topology. Inception-V3, from the depth-based models category, incorporated nonlinear mappings.

Figure 2. The performance of individual CNN models

Experimentation was carried out in MATLAB R2022a. The CNN models were implemented using the deep learning toolbox. The National Biomedical Imaging Archive (NBIA) data retriever version 4.3 was installed to download the images from The Cancer Imaging Archive (TCIA). This research was conducted on an Intel Core i7 processor with 8 GB RAM, a 1 TB hard disk and an Nvidia GeForce GTX 1650 Ti GPU running on Windows 11.

Table 1. Hyperparameter values for training

Table 2. Training time of individual CNN models

Table 3. The optimal assignment in the original cost matrix

Table 4. Comparison of proposed method with existing mammography classification methods

Breast cancer detection using ensemble of convolutional neural networks (Swati Nadkarni)