Image processing and machine learning techniques used in computer-aided detection system for mammogram screening-A review

Received Aug 21, 2019; Revised Nov 11, 2019; Accepted Nov 25, 2019
This paper reviews previously developed computer-aided detection (CAD) systems for mammogram screening. The rising death rate among women due to breast cancer is a global medical issue, and it can be controlled only through early detection by regular screening. Mammography remains the most widely used breast imaging modality. Radiologists have adopted CAD systems to increase the accuracy of breast cancer diagnosis by reducing human error and experience-related issues. This review finds that, in spite of the high accuracy reported for earlier CAD systems for breast cancer diagnosis, they are not fully automated. Moreover, the number of false-positive mammogram screening cases is high, and over-diagnosis of breast cancer exposes patients to harmful overtreatment on which a huge amount of money is wasted. In addition, it has been reported that mammogram screening results with and without CAD systems do not differ noticeably, whereas the number of cancer cases missed by CAD systems is increasing. Thus, future research is required to improve the performance of CAD systems for mammogram screening and to make them completely automated.


INTRODUCTION
Breast cancer accounts for 1 in 4 of all cancer cases in women [1], which itself expresses the severity of the disease. The disease is not only a concern for women; it can also occur in men, although the number of cases is limited [2]. Since the death rate due to breast cancer is high and early symptoms are rarely found, regular screening is the only option to save lives. There are two ways of detecting breast cancer, namely imaging and clinical laboratory evaluation. Imaging diagnosis is hypothetical and involves the interpretation of different medical images, either by radiologists or by computer-aided detection (CAD) systems. Laboratory tests, in contrast, involve nipple aspirate fluid (NAF) analysis, breast biopsy, and genetic testing. These biological tests are costly, invasive, and risky, and can cause patient discomfort during the procedure. Hence, image screening is performed to detect the presence of carcinoma in breast tissues before an individual is referred for invasive biological diagnosis.
Detection of abnormal tissues in medical images is the basis of non-invasive imaging diagnosis. Several methods are available for imaging the breast, such as mammography, ultrasound, magnetic resonance imaging (MRI), computed tomography (CT), positron-emission tomography (PET), and microwave imaging, as illustrated in Figure 1. A CAD system first reads a medical image and then sequentially performs pre-processing, segmentation, feature extraction, and classification [3] on that image to identify normal and abnormal tissues. The CAD system must also be able to distinguish malignant tumors among the abnormal cases. The working procedure of mammogram screening through a CAD system is depicted in Figure 2. CAD systems are highly preferred for automatic image analysis to avoid misdiagnosis due to a radiologist's lack of experience. In addition, they were expected to save money by replacing double reading by radiologists with a single reading by a CAD system. Several studies have already been conducted on different CAD systems for breast cancer diagnosis. However, it has been reported [4] that the false-positive rate of mammographic screening has increased substantially compared with past years, which in turn raises the over-diagnosis rate for breast cancer. Moreover, it was also revealed [5] that the sensitivity and specificity obtained when screening mammograms with and without CAD systems are nearly the same, and that inaccurate breast cancer diagnosis by a CAD system increases the number of false-negative cases [5]. Altogether, a huge amount of money is wasted every year, even though recent CAD systems are more sensitive in diagnosing breast cancer. Therefore, further research is required to propose an improved CAD system for mammogram screening. The objective of this study is to review past research on proposed CAD systems for mammogram screening in order to identify areas of improvement for future research.
This paper is divided into several sections. It first introduces the terminology of breast cancer, then extensively reviews the different types of breast imaging systems and the stages of a CAD system, followed by a discussion. The last section concludes the paper.

CONSTRUCTION ELEMENTS OF BREAST AND DIFFERENT TUMORS
Lobules, ducts, and connecting tissues are the main constituent elements of the breast. Milk is produced in the lobules, generally known as milk glands, and is carried to the nipple through the ducts, which are tiny tubes. Different fibrous and fatty tissues are responsible for the size and shape of the breast and keep the other tissues in place. In most cases, cancer initiates in either the ductal or the lobular tissues of a woman's breast due to the uncontrollable growth of breast cells, which ultimately generates tumors or lumps [6]. Two types of breast tissue can be identified during diagnosis, namely normal and abnormal. Among abnormal tissues, tumors are of two types: benign and malignant. While normal tissues do not contain any tumor, the presence of malignant cells differentiates cancerous tumors from benign ones [7]. A lump at any point of the breast is the main indication of breast cancer. Other symptoms, such as swelling of any part of the breast, discharge from the nipple, redness of the nipple, and pain in the breast or nipple, may also indicate breast cancer. The risk factors associated with breast cancer are breast density, age, personal history, family history, first menstrual cycle, pregnancy history, being overweight, and the habit of alcohol consumption. In addition, the use of combined hormone therapy, oral contraceptives, and previous chest radiation exposure also increase the chance of having breast cancer. However, the mechanism by which these factors contribute to the development of breast cancer remains unknown [6]. Calcifications and masses are identified as two types of breast tumors [3], as can be seen in Figures 3 (a) and (b). A mass is a space-occupying lesion with features such as location, density, and margin. Benign masses are generally round with smooth, well-defined margins and low density. High-density masses of stellate or spiculated shape with ill-defined margins are usually found to be malignant.
Architectural distortion and bilateral asymmetry are other aspects of cancerous masses.
Figure 3. (a) Spiculated mass, (b) Microcalcifications, as referred from MIAS dataset [8]
Minute calcium deposits in the breast appear as tiny bright spots in a mammogram and are known as calcifications. Depending on their size, they are classified as macrocalcifications or microcalcifications. The main concern is with the latter, as their probability of malignancy is high. Microcalcifications are generally around 0.3 mm in size and need not be associated with a mass. Benign calcifications are usually identical, large in size (diameter around 1-4 mm), coarse, round or oval-shaped, and dispersed or diffused. Microscopic, stellate-shaped, innumerable (more than 5 in number) microcalcifications of different sizes and shapes, clustered in branches, are found to be malignant [3].

MEDICAL IMAGES USED FOR BREAST CANCER
All medical images contain information about the human body and its composition or characteristics. They are formed by signals that penetrate tissues to different degrees or by the re-emission of energy from the tissues, and these signals may not all be of the same type. The information depicted by an image varies with the contrast between different types of tissue. The target location of these images may be inside the body, even several centimeters below the accessible surface. Electromagnetic signals with frequencies ranging from a few hertz to exahertz are capable of penetration and are used accordingly in medical imaging systems. Previous studies mainly considered two key objectives in developing these imaging modalities: location specificity and lesion detection. Different techniques for breast imaging are discussed below.
A mammogram is a special type of X-ray image of breast tissue. Low-dose X-rays with frequencies ranging from 30 petahertz to 30 exahertz are used to obtain two- or three-dimensional (2D or 3D) mammography images [9]. Mammograms are of two types: film and digital. Film mammography has long been considered a powerful tool for breast cancer screening [10]. However, it has drawbacks, such as lower sensitivity for dense breasts, limited contrast characteristics, longer processing time, and grain effects. In digital mammography, the contrast can be manipulated so that the presence of a lesion becomes visible. Moreover, the processing time is shorter and better sensitivity can be obtained for dense breasts. Another limitation of mammography is that patients are exposed to ionizing X-ray radiation.

Sound waves ranging from 2-20 MHz [9] are used in ultrasound imaging to produce images of a single plane. This technique provides better lesion detection in dense breasts and can be used in real time. In addition, tissue elasticity can also be determined for classification purposes, as elaborated in [10]. Nevertheless, it depends mostly on the expertise of the operator, since real-time tuning of gain, pressure, focal zones, patient positioning, and dynamic range is required along with recognition of the peculiarities of the lesion.
A Magnetic Resonance Imaging (MRI) system is built with RF coils along with a large magnet (3)(4)(5). An intravenous injection of gadolinium is given to the patient before 3D images are captured through MRI. It can detect minute abnormalities of breast tissue as well as ductal carcinoma in situ in dense breasts, along with its spread to the chest wall [10]. This is largely because it has better temporal and spatial resolution [9]. Nonetheless, MRI cannot be used for those with a medical history of kidney disease, as the injection can cause nephrogenic systemic fibrosis [9]. Moreover, patients with a pacemaker or any metal implant are also not suitable for MRI due to its magnetic effect. Additionally, it is time-consuming and generates blurred images [9]. Therefore, incorrect reading of an MRI image may require a patient to go through the same process several times.
Computed Tomography (CT) uses high-dose X-ray radiation to generate detailed scans or images of the inside of the body. In most cases, CT machines generate continuous pictures in a helical (or spiral) fashion rather than producing a series of pictures of individual slices of the body. Helical CT has several advantages: it is fast, it produces better 3D images, and it has better sensitivity in detecting small abnormalities [11]. The newest CT scanners, called multislice CT or multidetector CT scanners, allow more slices to be imaged in a shorter period of time. Sometimes, contrast agents such as iodine and barium are injected into the blood or given by mouth or enema for the CT scan. However, because it exposes the patient to relatively larger amounts of ionizing radiation than a standard X-ray procedure, it is the least favorable option as a regular screening method.
In a Positron Emission Tomography (PET) imaging system, a radioactive substance is injected into the blood to identify the most active body cells, especially cancerous tissues. A PET scan can be combined with computed tomography (CT) so that both anatomical and functional views of the suspected cells can be observed. PET is not restricted by breast density and is useful in identifying axillary nodes and distant metastases [10]. However, it has poor sensitivity in detecting small tumors.
Electromagnetic waves with wavelengths ranging from a millimeter to a meter can penetrate many optically opaque media, such as living tissue. This penetration depends on the presence of ionized molecules from a variety of dissolved substances, such as sugar, and the permittivity of any tissue is strongly dependent on its water content [12]. This principle is utilized in microwave imaging, either by using a contrast agent or by utilizing radar [13], and the technique is quite new to biomedical engineering. Microwave signals scatter significantly from malignant breast tissues due to their water content, and these scattered signals are captured by the microwave imaging system [14]. Microwave imaging requires considerably less time, but its heavy computational load is its main drawback [13].
Ultrasound imaging and MRI are used along with mammography to increase the screening specificity [9]. Other than mammography, ultrasound, and microwave imaging, the imaging systems discussed above are costly for regular screening. The use of microwave imaging as a regular screening tool is still subject to further study and is under trial, whereas the accuracy of ultrasound imaging is fully dependent on the expertise of the operator. Despite all its limitations, mammography is to date the most widely accepted imaging method. The radiation issue can be controlled by increasing the interval between two consecutive screenings. Therefore, the following subsections are devoted only to digital mammograms, mainly due to their easy availability, ease of image manipulation, and fast screening time.

STAGES OF CAD SYSTEMS
Each stage of a CAD system has its own objective that contributes to the final result, and each objective can be achieved by applying different techniques. For example, cropping, noise removal, and enhancement are performed during the pre-processing stage. Likewise, image segmentation is significant in separating the image background and in identifying and partitioning the area of interest (AOI), because different breast tissues have different resolutions. The different stages and the various methods used to perform the activity of each stage are shown in Figure 4.

Pre-processing
Noise, uneven illumination, and low contrast are the main drawbacks of mammograms, and they make AOI identification and feature extraction difficult. To negate the effects of these defects, cropping, de-noising, and enhancement of images are performed at the pre-processing stage before segmentation and feature extraction. Unwanted labels, artifacts, and image portions without information can be removed by cropping. During the acquisition of a digital image, noise such as readout and shot noise may be introduced. Several types of noise and the possible de-noising methods were discussed in earlier work [15]. De-noising not only removes the noise but also smooths the signal. Based on the histogram of an image, the enhancement procedure improves its contrast level so that the features are more identifiable.
Detection of masses is far more complicated than detection of microcalcifications, as the traits of masses are hard to perceive and they sometimes appear like normal breast tissue [16]. Since microcalcifications have higher contrast than the rest of the region and correspond to high-frequency components, they can be detected easily through image enhancement and de-noising, as was done in [17] using dyadic wavelet processing. Masses, meanwhile, have low contrast, varying densities, spiculated structures, and low-frequency components. The implementation of Contrast Limited Adaptive Histogram Equalization (CLAHE) along with median filtering provided a sensitivity and specificity of 96.2% and 94.4%, respectively, for the detection of masses [18].
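The de-noising and enhancement steps described in this subsection can be illustrated with a short Python sketch. It is only a minimal illustration under stated assumptions: it uses a plain 3x3 median filter and global histogram equalization as a simpler stand-in for CLAHE, it assumes an 8-bit image stored as a list of lists, and the function names `median_filter` and `equalize_histogram` are introduced here for illustration and do not come from any of the reviewed systems.

```python
from statistics import median_low


def median_filter(image, radius=1):
    """Replace each pixel by the (low) median of its neighbourhood.

    Border pixels use only the neighbours that lie inside the image.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            window = [
                image[y][x]
                for y in range(max(0, i - radius), min(rows, i + radius + 1))
                for x in range(max(0, j - radius), min(cols, j + radius + 1))
            ]
            out[i][j] = median_low(window)
    return out


def equalize_histogram(image, levels=256):
    """Global histogram equalization (assumed stand-in, not CLAHE)."""
    hist = [0] * levels
    for row in image:
        for value in row:
            hist[value] += 1
    # Cumulative distribution of gray levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    total = cdf[-1]
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:  # constant image: nothing to stretch
        return [row[:] for row in image]
    scale = (levels - 1) / (total - cdf_min)
    return [[round((cdf[v] - cdf_min) * scale) for v in row] for row in image]
```

Applying `median_filter` first and `equalize_histogram` second mirrors the order in which de-noising and enhancement are described in the text; a production system would more likely rely on a tested library implementation of CLAHE.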

Segmentation
The removal of image background and the selection of AOI are the vital tasks in image processing, as is required in the segmentation stage. The common procedures used in image segmentation include thresholding, boundary-based segmentation, region-based segmentation, and template matching as illustrated in Figure 5.

Thresholding
This is a very common method of partitioning an image, in which the image background that does not carry any essential information is removed. The threshold value is selected from the gray-level histogram, and the difference between the pixel intensities of the useful region and the background segments the image [19]. It is fast and simple to implement but does not guarantee object coherency, so post-processing by other operators may be required. When only one threshold, $T$, is set on the basis of the entire image $x(i, j)$, it is called a global threshold. If an image is divided into sub-regions and $T$ is selected for each sub-region depending on both $x(i, j)$ and some local image property $L(i, j)$, it is known as a local threshold. Thresholding is classified as bi-level or multilevel. Multilevel thresholding can be expressed as $T = [T_1, T_2, \ldots, T_N]$, so that all pixels with $T_k \le x(i, j) < T_{k+1}$ are assigned to sub-region $k$, where $k = 0, 1, \ldots, N$ and $T_0$ and $T_{N+1}$ bound the gray-level range. Therefore, $(N + 1)$ sub-regions are generated. In bi-level thresholding, an image is divided into two parts: the useful region, denoted by white, and the background, denoted by black. Multilevel thresholding is required for images with different surface characteristics [20]. The maximum entropy method, the minimum error method, and Otsu's method are among the classic thresholding techniques [21]. Otsu's thresholding is sensitive to salt-and-pepper noise; hence, de-noising is required to smooth the image before it is applied. Researchers in [22] used thresholding to segment a mammogram at multiple levels, and a set of features was computed from each of the segmented regions. Their study achieved 80% sensitivity with an average rate of 0.32 false positives per image. Another study [23] proposed a probabilistic adaptive thresholding technique based on texture information and its probability to obtain the most feasible threshold values for specific parts of the mammogram. In this adaptive thresholding method, the threshold values were calculated neither from the histogram nor from the shape of the region. This was done to eliminate the issues related to non-uniform intensities in the background region of a mammogram, for which global threshold-based methods may fail. In [24], a three-class threshold method was implemented along with an edge detection algorithm for segmentation. Hybrid image segmentation along with Otsu's thresholding was used in [25] for accurate detection of a breast tumor and its size. Thresholding is simple to implement even in real-time applications. It is fast and computationally inexpensive. Moreover, no prior information about the image is required. Nonetheless, its performance is poor for noisy images and for images whose histograms have no clear peaks or have broad, flat valleys. The main drawback of thresholding is that it ignores the spatial data of an image and thus provides no information about the contiguity of the segmented areas. Furthermore, only correct threshold selection can avoid under- or over-segmentation. Thresholding combined with other methods can provide a better output, as can be seen from the works in [22][23][24][25].
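As an illustration of global bi-level thresholding, the following Python sketch selects a threshold with Otsu's method by maximizing the between-class variance of the gray-level histogram. It is a minimal sketch, assuming 8-bit integer pixel values; the function names `otsu_threshold` and `bilevel_segment` are hypothetical and are not taken from the cited works.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level T that maximizes the between-class variance.

    Pixels with value <= T form the background class, the rest the foreground.
    """
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for t in range(levels):
        weight_bg += hist[t]
        sum_bg += t * hist[t]
        if weight_bg == 0:
            continue  # no background pixels yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break  # every pixel is already in the background class
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        between_var = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def bilevel_segment(image, threshold):
    """Map the useful region to white (255) and the background to black (0)."""
    return [[255 if v > threshold else 0 for v in row] for row in image]
```

In practice the pixel list would be obtained by flattening the de-noised mammogram, since, as noted in the text, Otsu's method is sensitive to salt-and-pepper noise.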

Boundary-based segmentation
In this method, the boundary, contour, or edge of the AOI is outlined by identifying discontinuities or abrupt changes in a gray-level image. There is no golden rule for determining the edge; it depends solely on the application. High-pass filters and gradient filters such as Roberts, Prewitt, Sobel, and Canny are the basic techniques of edge detection. Nevertheless, edge detection based on first-order derivatives is not robust: it is highly sensitive to noise and requires a threshold. Meanwhile, detection based on second-order derivatives is able to locate the edge at the zero-crossing; it is also more robust, less sensitive to noise, and does not require a threshold in post-processing. In this method, the operator's size and the computational complexity are proportional to each other, and the spatial information of the image is ignored. An algorithm was proposed in [26] to enhance the mammogram before passing it to a Radial Spiculation Filter for detection of spiculated lesions. A Butterworth high-pass filter along with the Sobel edge detection operator was used in [27], and the experimental result was considerably effective. In [28], Sobel edge detection was implemented for initial contour estimation. Non-linear polynomial filtering was employed in [29] to enhance the edges and sharpen the lesions in mammograms so that the dependency on pre-selected thresholds could be minimized.
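The first-order gradient approach mentioned in this subsection can be sketched with the Sobel operator. The following Python example is an illustrative sketch only: it computes the gradient magnitude at interior pixels and leaves the one-pixel border at zero, which is an assumption made here for simplicity; the function name `sobel_magnitude` is hypothetical.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_magnitude(image):
    """Return the gradient magnitude of a gray-level image (list of lists)."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            gx = gy = 0
            for a in range(3):
                for b in range(3):
                    pixel = image[i + a - 1][j + b - 1]
                    gx += SOBEL_X[a][b] * pixel
                    gy += SOBEL_Y[a][b] * pixel
            out[i][j] = math.hypot(gx, gy)
    return out
```

An edge map is then obtained by thresholding the returned magnitudes, which is the threshold dependency of first-order methods noted in the text.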

Region-based segmentation
Region-based segmentation identifies different regions of an image that share similar features, such as gray level, color, or texture. This is known as the region growing or splitting mechanism. In this process, the AOI is selected through a predefined condition based on previously obtained intensity or edge details of the image so that tumor regions can be identified. However, additional operations, such as uniform blocking and merge and split [30], must be performed before this method is applied. In addition, its requirement for the manual selection of an initial point is a disadvantage [19]. A study in [31] used this method to segment out the pectoral tissues from the mammogram, and the result was further used for classification. Mean Based Region Growing Segmentation (MRGS) was implemented in [32]. Researchers applied an automated region growing segmentation technique in [33], where the threshold was obtained from a trained Artificial Neural Network (ANN). In both [34] and [35], automatic seed selection was performed before the region growing method was used.
Region-based segmentation is flexible in allowing a choice between interactive and automatic methods. An identifiable object boundary is generated because the region grows from an inner point to the outer region. The output of this method is better than that of any other segmentation procedure when an appropriate seed is selected. Conversely, a noisy seed may lead to a faulty segmented area. The method is sequential by nature and does not have a significant effect on minute regions. Its main limitations are the need for stopping criteria and its higher computation time and memory requirements.
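Region growing from a single seed can be sketched as a breadth-first search that adds neighbouring pixels while they remain similar to the seed. The Python example below is a minimal sketch under stated assumptions: similarity is measured as the absolute intensity difference from the seed value, 4-connectivity is used, and the name `region_grow` and the `tolerance` parameter are introduced here for illustration only.

```python
from collections import deque


def region_grow(image, seed, tolerance):
    """Grow a region from `seed` (row, col) over 4-connected neighbours.

    A neighbour joins the region when its intensity differs from the seed
    intensity by at most `tolerance`. Returns a set of (row, col) tuples.
    """
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        i, j = queue.popleft()
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if not (0 <= ni < rows and 0 <= nj < cols):
                continue
            if (ni, nj) in region:
                continue
            if abs(image[ni][nj] - seed_value) <= tolerance:
                region.add((ni, nj))
                queue.append((ni, nj))
    return region
```

The stopping criterion here is simply that no similar neighbour remains; as the text notes, a poorly chosen seed or tolerance leads to a faulty segmented area.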

Template matching
Detecting the presence of an object in an image is an important task. This problem can be resolved with a priori knowledge of the object to be detected, or template, which may be used to identify its location in a given scene. Therefore, if there is no prior knowledge of a tumor, the technique is difficult to use, and this is its main drawback. Researchers in [36] used a Sech template to identify suspicious areas and optimized them with thresholding. A template-matching technique was also used in [37], along with a local cost function and dynamic programming, to optimize the contour.
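The idea of locating a known template in a scene can be illustrated with an exhaustive search that minimizes the sum of squared differences (SSD). This Python sketch is illustrative only and assumes SSD as the similarity measure; it does not reproduce the Sech template of [36] or the cost function of [37], and the function name `match_template` is hypothetical.

```python
def match_template(image, template):
    """Slide `template` over `image` and return (best_position, best_ssd).

    best_position is the (row, col) of the top-left corner of the window
    with the smallest sum of squared differences to the template.
    """
    rows, cols = len(image), len(image[0])
    t_rows, t_cols = len(template), len(template[0])
    best_pos, best_ssd = None, None
    for i in range(rows - t_rows + 1):
        for j in range(cols - t_cols + 1):
            ssd = sum(
                (image[i + a][j + b] - template[a][b]) ** 2
                for a in range(t_rows)
                for b in range(t_cols)
            )
            if best_ssd is None or ssd < best_ssd:
                best_pos, best_ssd = (i, j), ssd
    return best_pos, best_ssd
```

Normalized cross-correlation is a common alternative to SSD when the overall brightness of the scene varies.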

Feature extraction and selection
An image feature may include color, shape, and texture. Contour-based and region-based representations are the two types of technique that provide shape features. The first depends on boundary information to provide the shape feature, but despite this limitation, it is more popular among researchers than the second, which provides the shape features based on the complete region [38]. Texture features are geometric or structural, statistical, model-based, and transform-based, and they have been widely used in earlier research. Structural features depend on a set of primitives or patterns, such as blobs and edges, and on their hierarchical spatial arrangement. In most cases, however, this method provided unacceptable results for biological images due to their homogeneous spatial arrangements [39]. Statistical features describe the spatial distribution of the intensity values of the pixels, and they can be of first order (e.g., mean, variance, standard deviation, skewness, kurtosis, and entropy) or second order. While first-order features provide information about a particular pixel and its associated intensity, second-order features such as the Gray Level Co-occurrence Matrix (GLCM) reveal the relation, in terms of contrast, correlation, energy, and homogeneity, between a particular pair of pixels at a specified distance and angle. First-order statistical features are simple and have a low computational cost. Second-order statistical features provide better results, although increasing the statistical order raises the computational cost exponentially. Nonetheless, in the case of GLCM, the efficiency and accuracy of the results depend on the selected distance and angles between pixels; in [40], researchers efficiently calculated the geometric and texture-related measures using GLCM.
Local Binary Pattern (LBP) is a technique that combines structural and statistical texture analysis methods. It reveals the intensity relations between a pixel and its neighbors through a binary pattern. Although it is robust, it is computationally expensive, especially when the number of features considered is high. A fusion method was implemented in [41] combining Completed LBP (CLBP) and curvelet sub-band features, and an accuracy of 96.68% was achieved with a reduced number of false positives in comparison with the experiment based only on CLBP features. Autoregressive models, random fields (e.g., Markov Random Fields), and fractals are the model-based methods for texture analysis, in which an a priori model is considered as a texture descriptor. While random field methods suffer from a huge computational burden, fractal procedures have gained attention due to their ability to capture spatial complexity at different scales, which makes it easier to find architectural distortion. In [42], fractal dimension was used to identify different textural patterns in the breast region, and the classification result obtained with this non-automated procedure was satisfactory. Transform-based texture analysis through spatial domain filters, frequency domain filters, and Gabor and wavelet transform methods divides an image into different spaces to extract the features. Spatial domain filters (e.g., Roberts and Sobel) are extensively used for detecting edges, but their output for irregular textures is poor. The Discrete Cosine Transform (DCT) and Discrete Fourier Transform (DFT) are able to analyze the spatial frequency of an image, but both approaches lack spatial localization. Therefore, the Gabor or wavelet transform is advantageous for its ability to identify the spatial location. Even though the wavelet transform is not translation invariant, this can be overcome with curvelet analysis [3]. Besides these conventional procedures, researchers proposed new feature extraction methods, namely the Square Centroid Lines Gray Level Distribution method (SCLGM) and the Run Difference Method (RDM), in [43].
Discrete Wavelet Transform (DWT) and Spherical Wavelet Transform (SWT) were used to extract texture features from the images in [44]. According to the researchers in [45], the statistical properties of curvelet coefficients can be used in future works to improve the classification accuracy. The use of different feature extraction methods may be better than using the curvelet coefficients. In recent years, researchers have been concentrating on the study of the complete breast parenchyma for extracting texture features, incorporating a lattice-based strategy to identify the heterogeneity of breast tissues, as was done in [46], where the huge pool of features was reduced using a Convolutional Neural Network (CNN).
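The second-order GLCM features discussed in this subsection (contrast, energy, and homogeneity for a pixel pair at a given offset) can be sketched as follows. This is a minimal Python sketch under stated assumptions: the image is already quantized to `levels` integer gray levels, the matrix is non-symmetric and normalized to probabilities, energy is taken as the sum of squared entries, and homogeneity uses the 1 / (1 + |a - b|) weighting. Definitions vary between toolkits, and the function names are hypothetical.

```python
def glcm(image, dx, dy, levels):
    """Normalized co-occurrence matrix for the pixel offset (dx, dy)."""
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for i in range(rows):
        for j in range(cols):
            ni, nj = i + dy, j + dx
            if 0 <= ni < rows and 0 <= nj < cols:
                counts[image[i][j]][image[ni][nj]] += 1
    total = sum(map(sum, counts))
    return [[c / total for c in row] for row in counts]


def glcm_features(p):
    """Contrast, energy, and homogeneity of a normalized GLCM `p`."""
    levels = len(p)
    pairs = [(a, b) for a in range(levels) for b in range(levels)]
    return {
        "contrast": sum(p[a][b] * (a - b) ** 2 for a, b in pairs),
        "energy": sum(p[a][b] ** 2 for a, b in pairs),
        "homogeneity": sum(p[a][b] / (1 + abs(a - b)) for a, b in pairs),
    }
```

The offset `(dx, dy) = (1, 0)` corresponds to a distance of one pixel at an angle of 0 degrees; repeating the computation for other offsets shows how strongly the features depend on the chosen distance and angle, as noted in the text.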
It must be mentioned that the presence of redundant and irrelevant features may significantly degrade the precision. If the features are not properly selected, the learning speed of the chosen algorithm may also be reduced [47]. Therefore, the accuracy of classification depends largely on feature selection from a large set of data, especially in the case of artificial intelligence. Several algorithms have been used in earlier research on CAD systems for mammogram analysis, among which the Genetic Algorithm (GA) appeared promising because it works in a vast solution space with high-dimensional features. This technique can minimize redundancy and achieve better accuracy. It is a population-based metaheuristic search or optimization technique inspired by Darwin's theory of evolution [48], and its performance depends heavily on its control parameters, such as population size, crossover rate, and mutation probability. Therefore, these parameters must be selected properly to avoid unsatisfactory results. The researchers in [49] proposed a computational technique for detection and segregation of the AOI in mammograms using GA and a multi-resolution technique, which offered relatively high accuracy. They proposed transform functions for specific advantages such as phase information, high directionality, and shift insensitivity [49].
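To show how the control parameters named in this paragraph (population size, crossover rate, and mutation probability) enter a GA-based feature selection, a small Python sketch is given here. It is illustrative only and does not reproduce the method of [49]: the binary mask encoding, tournament selection, single-point crossover, and elitism are assumptions, and `toy_fitness` with its `RELEVANT` set is a purely hypothetical stand-in for what would normally be a classifier's validation accuracy.

```python
import random

RELEVANT = {0, 2, 5}  # hypothetical indices of informative features


def toy_fitness(mask):
    """Hypothetical score: reward relevant features, penalize the rest."""
    gain = sum(bit for i, bit in enumerate(mask) if i in RELEVANT)
    penalty = 0.5 * sum(bit for i, bit in enumerate(mask) if i not in RELEVANT)
    return gain - penalty


def genetic_feature_selection(fitness, n_features, pop_size=20, generations=30,
                              crossover_rate=0.8, mutation_prob=0.05, seed=0):
    """Return (best_mask, history) where history holds the best fitness
    of every generation followed by that of the final population."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        elite = max(population, key=fitness)
        history.append(fitness(elite))
        next_gen = [elite[:]]  # elitism: the best mask always survives
        while len(next_gen) < pop_size:
            # Tournament selection of two parents.
            p1 = max(rng.sample(population, 2), key=fitness)
            p2 = max(rng.sample(population, 2), key=fitness)
            if n_features > 1 and rng.random() < crossover_rate:
                cut = rng.randint(1, n_features - 1)  # single-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Bit-flip mutation.
            child = [1 - b if rng.random() < mutation_prob else b for b in child]
            next_gen.append(child)
        population = next_gen
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history
```

Because the best mask is copied unchanged into each new generation, the recorded best fitness never decreases; the quality of the final mask still depends on the chosen control parameters.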

Classification
Classification is the last stage of image analysis. It first distinguishes normal from abnormal tissues and then separates benign from malignant tumors among the abnormal cases. This is viable with pattern recognition [19]. Selected features can be classified by either a supervised or an unsupervised method. In a supervised method, the system must be trained first, and the rest of the data can then be tested with the trained system. An unsupervised method, however, depends on machine learning to describe the hidden structure of unlabeled data.
A feature space is the whole range of a defined function of an image. A classifier is a supervised method that divides a feature space by using labeled data [50] for training, so that a new set of data can be segmented automatically. The functions that are already defined in the feature space are responsible for dividing it further into several regions [19]. Classifiers are computationally fast and can be applied to multichannel images [50]. There are several methods of training a classifier, namely Parzen window, nearest neighbor, k-nearest-neighbor, maximum likelihood/Bayes classifier, and decision tree. Parzen window and k-nearest-neighbor (KNN) classifiers make no underlying assumption about the statistical structure of the data, for which reason they are considered non-parametric classifiers. The maximum-likelihood/Bayes classifier, however, is a parametric classifier that considers pixel intensities as independent samples from a mixture of probability distributions. The computational burden of these methods is quite high, particularly with large data sets.
Clustering is an unsupervised method of classifying an image. It can be described as a classifier that does not use training data but needs initial parameters or a segmentation process [50]. Self-training is done by iteratively dividing an image through segmentation and training on the existing data. K-means, expectation maximization (EM), and fuzzy c-means are considered clustering methods. Since clustering does not require initial spatial modeling, it may be sensitive to intensity inhomogeneities and noise. Clustering is mainly applied in segmenting MRI and in cases where pixel intensity distributions are well separated [19].
ANN is an information processing technique inspired by the way human brains process information. It works through a set of interconnected nodes, usually known as neurons, which deliver the output through a computer model. Each node is associated with a gain or weight that can be adjusted to obtain the required output from the given input. Learning and recall are its two working phases [3]. During the learning phase, the weights of the nodes are adapted to train the ANN on the task through either supervised or unsupervised methods [19,50]. The recall phase is for validation and for resolving an issue. Feed forward and back propagation are two learning procedures. ANN can also select features, for which the weights or gains of the nodes should be adjusted and trained accordingly. A Feed forward Neural Network (FNN) trained through the Jaya algorithm was used as a classifier in [51], and the obtained sensitivity and specificity were 92.26% ± 3.44% and 92.28% ± 3.58%, respectively. The main advantage of ANN is that it has parallel processing capability and can predict the output even with insufficient training data, although its accuracy depends on a large data set. Its computational cost depends heavily on the number of hidden layers and connected neurons.
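The non-parametric k-nearest-neighbor rule mentioned in this paragraph can be sketched in a few lines of Python. The sketch assumes Euclidean distance and majority voting; the function name `knn_predict` is hypothetical, and the two-dimensional feature vectors in the usage example are invented for illustration, not measured data.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training samples."""
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-feature samples (for example, a contrast and an energy value).
train_X = [(1.0, 1.0), (1.5, 0.8), (0.9, 1.2), (5.0, 5.0), (5.2, 4.8), (4.7, 5.3)]
train_y = ["benign"] * 3 + ["malignant"] * 3
print(knn_predict(train_X, train_y, (1.2, 0.9)))
```

Every prediction computes the distance to all training samples, which illustrates why the computational burden grows with the size of the data set.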
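The K-means clustering named in this subsection can be illustrated on one-dimensional pixel intensities. The Python sketch below is a minimal example under stated assumptions: the initial centers are spread evenly between the minimum and maximum intensity (these are the initial parameters that clustering needs), at least two clusters are requested, and the function name `kmeans_1d` is hypothetical.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Cluster scalar intensities into k >= 2 groups.

    Returns (centers, labels), where labels[i] is the index of the
    center closest to values[i].
    """
    lo, hi = min(values), max(values)
    # Deterministic initialization: centers evenly spaced over the range.
    centers = [lo + (hi - lo) * c / (k - 1) for c in range(k)]

    def nearest(v):
        return min(range(k), key=lambda c: abs(v - centers[c]))

    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            clusters[nearest(v)].append(v)
        # Keep the old center when a cluster receives no pixels.
        new_centers = [
            sum(members) / len(members) if members else centers[idx]
            for idx, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # assignments are stable
        centers = new_centers
    return centers, [nearest(v) for v in values]
```

Because only intensities are used and no spatial model is involved, a single noisy pixel is assigned purely by its gray level, which reflects the sensitivity to noise and intensity inhomogeneities noted in the text.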
Several other classification techniques have been tried on mammograms. In [52], breast abnormalities in mammograms were classified with a new pattern classifier, the Particle Swarm Optimized Wavelet Neural Network (PSOWNN), based on extracted Laws Texture Energy Measures. In the experiment reported in [33], the researchers tried two approaches to segmentation: the region growing method combined with ANN, and a cellular neural network (CNN). GA was then applied for feature selection, and in both cases classification was performed with various classifiers such as KNN, support vector machine (SVM), naïve Bayes, random forest, and multi-layer perceptron (MLP). MLP was observed to perform best in both cases. In [53], three unsupervised classifiers, namely Optimum-Path Forest (OPF), Gaussian Mixture Model (GMM) and k-means, were evaluated, and OPF was found to outperform the others.
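As a point of reference for the unsupervised methods compared in [53], the k-means procedure can be sketched on scalar pixel intensities. This is a minimal illustration under stated assumptions, not the implementation evaluated in [53]: the function name kmeans_1d, the evenly spread initial centers and the sample intensities are hypothetical.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Cluster scalar intensities into k groups; returns (centers, labels)."""
    lo, hi = min(values), max(values)
    # Initial parameters (assumed): centers spread evenly over the value range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c]))
                  for v in values]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(values, labels) if label == c]
            if members:
                centers[c] = sum(members) / len(members)
    return centers, labels


# Hypothetical intensities: a dark background group and a bright region.
pixels = [12, 15, 10, 14, 200, 210, 190, 205]
centers, labels = kmeans_1d(pixels)
print(centers, labels)
```

The sketch uses intensity alone and no spatial information, so two pixels with similar values receive the same label wherever they lie in the image; this is the source of the sensitivity to noise and intensity inhomogeneities mentioned earlier.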
A different approach to the automatic evaluation of different breast tissues was tried in [54]. Here the researchers consulted radiologists and clinical practitioners for their expert opinions on previously predicted reports in order to identify the qualitative mammographic features. The optimal decision threshold for benign and malignant cases was calculated by statistical analysis, taking into account the shape and size of the tumors. These features were used as data sets for training different classifier architectures, such as linear classifiers, neural networks (NN) and SVM, and for finding optimal feature sets, with which up to 95% accuracy was obtained. It was concluded that specialized image processing algorithms combined with powerful pattern recognition models of non-linear and highly adaptive architecture may provide better results.
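The sensitivity, specificity and accuracy figures quoted for the classifiers in this section follow the standard definitions based on true and false positives and negatives. The short sketch below only illustrates those definitions; the function name screening_metrics and the counts are hypothetical and are not taken from any of the cited studies.

```python
def screening_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # fraction of cancer cases correctly flagged
    specificity = tn / (tn + fp)   # fraction of normal cases correctly cleared
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy


# Hypothetical counts for 150 screened mammograms.
sens, spec, acc = screening_metrics(tp=45, fp=8, tn=92, fn=5)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

A false positive lowers specificity and corresponds to the over-diagnosis problem, while a false negative lowers sensitivity and corresponds to a missed cancer, so a high accuracy alone does not show which kind of error a classifier makes.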

DISCUSSION
Different medical imaging systems for the breast were studied at the beginning of this paper and are summarized in Table 1 according to several criteria. MRI, ultrasound and CT scan are found to have high sensitivity for finding small tumors, even in dense breasts. However, CT scan cannot be considered for regular screening as it increases the chances of cancer, and the outcome of ultrasound imaging depends on the expertise of the operator. MRI, on the other hand, is costly and its use is restricted because of gadolinium and the strong magnetic field. So, although digital mammography has only moderate sensitivity for detecting tumors in dense breasts, it is widely accepted throughout the world as the regular screening method because of its low cost and short processing time.
An extensive study of recent CAD systems for mammogram screening has been carried out in this paper, and a brief summary is given in Table 2 to highlight the technologies used at each stage. Several studies address only one particular stage of a CAD system, such as segmentation or feature extraction, where either a single technique was evaluated or different methods were compared; these are not included in Table 2. The analysis of Table 2 reveals that the accuracy obtained in most studies is high irrespective of the technologies used at each stage. However, none of the developed CAD systems is fully automatic except the work in [33], mostly because of semi-automatic or manual segmentation techniques. Even the work in [45], which attained the highest classification accuracy of 98.59%, used manual cropping for segmentation. It can also be observed that, at the classification stage, machine learning and neural networks were implemented in all the works, but with different algorithms and classifiers. Therefore, future research can implement unsupervised machine learning methods to segment the AOI automatically, along with supervised algorithms to classify the image, to improve the performance of the CAD system.

CONCLUSION
The main purpose of this study is to review past research on proposed CAD systems for breast cancer diagnosis. It has been noticed that, along with classic image processing methods, increasing importance is given to machine learning and artificial neural network based systems in order to automate the system. The above discussion reveals that, to date, mammography remains widely accepted and used for regular screening despite its limitations, and that all the stages of a CAD system are equally important in identifying factors such as image content, intensity and texture that contribute to higher accuracy during classification. Each stage can be performed with several methods, which are discussed in detail in this study along with their pros and cons. However, no single technique is applicable to all types of images, nor do all techniques perform well on one particular image. Furthermore, none of the segmentation procedures is fully automatic. Machine learning based intelligent systems can therefore help to automate the complete procedure: an unsupervised method can be implemented during segmentation to identify the AOI, and a supervised method can improve classification performance through appropriate training of the system. Although CAD systems are adopted by radiologists to avoid experience-related errors, as well as to reduce the cost of double reading, reports reveal that in practice there is little identifiable difference in sensitivity and specificity between mammography screening with and without CAD systems. Moreover, breast cancer cases missed by a CAD system put lives at risk. Therefore, there is still room for improvement in developing new CAD systems for mammogram screening.