PSO-SVM hybrid system for melanoma detection from histo-pathological images

ABSTRACT


INTRODUCTION
Skin cancer is becoming increasingly common globally. In Australia, it is spreading at higher rates than other cancer types [1,2]. The deadliest form of skin cancer is melanoma. The melanoma rates reported in Australia are the highest globally, at almost four times the rates seen in Canada, the United Kingdom and the United States [3]. Reports have shown that melanoma accounts for 75% of skin cancer deaths in Australia [4].
Early diagnosis of melanoma, which depends on the patient's attention and on accurate assessment by a medical practitioner, is crucial. Studies have shown that melanoma can be cured at a rate of 95% if diagnosed and treated in its early stages [5]. It can be removed by simple surgery if it has not entered the blood stream. Melanoma can be diagnosed visually in a noninvasive fashion, but this can lead to inaccurate judgments, as it is hard for medical professionals to visually differentiate normal from abnormal moles. It has been reported that the accuracy of specialized dermatology centers is only about 60% [6].
A more reliable way of diagnosing skin cancer is based on the study of skin lesion pathology. Pathologists have traditionally used Histo-pathological images of biopsy samples removed from patients and made judgments based on deviations in the cell structures and/or changes in the distribution of the cells across the tissue under examination. However, these judgments can be subjective, as they depend on the experience of the pathologist, and often lead to considerable variability [7,8]. To improve the reliability of skin cancer diagnosis, researchers have sought to develop computational tools for automated cancer diagnosis that operate on quantitative measures. Because of its promise in reducing the number of skin cancer fatalities, automated cancer diagnosis has become an important research topic. Most of the work in this field applies image processing and machine learning techniques to external skin images to decide whether a lesion is benign or malignant [9][10][11].
Other approaches detect skin cancer from Histo-pathological images obtained from skin lesions [12,13] and focus on quantifying biomarkers on a pixel-by-pixel or regional basis [14,15]. These approaches are inspired by the way pathologists diagnose skin cancer. They provide more biologically relevant measures and can be more useful in medical diagnosis, as they can give the pathologist a second opinion before the final decision is taken.
An automated skin cancer detection method based on Histo-pathological images of skin lesions was introduced in [13]. It aimed to discriminate between melanocytic nevi and malignant melanoma. After filtering the image using a spatially adaptive color median filter and applying K-means clustering for segmentation, image features were obtained from the histogram and the co-occurrence matrix. The extracted features were then reduced using sequential feature selection and fed to an SVM for training and testing. The dataset used for the evaluation of the algorithm was obtained from the Southern Pathology Laboratory in Wollongong NSW, Australia. It included 42 Histo-pathological images (28 benign and 14 melanoma). The dataset was divided into a training set (60%), a validation set (20%), and a test set (20%). The proposed system achieved a classification accuracy of 88.9%, a sensitivity of 87.5% and a specificity of 100%. While the achieved results were very good, the method had a high computational time cost. The paper concluded that the method should be further tested on a larger dataset to verify its reliability.
Another approach for automated skin cancer detection from Histo-pathological images of skin lesions was introduced in [16]. Unlike [13], in the pre-processing stage the image was passed through three filtering stages, namely a Wiener filter, an adaptive median filter and a Gabor filter, to improve diagnostic accuracy. Histogram equalization was used to enhance the contrast of the images prior to segmentation. Segmentation was implemented through Edge Detection Operations. Image features were obtained from the histogram and the co-occurrence matrix. The extracted features were then reduced using sequential feature selection and fed to an SVM for training and testing. The algorithm was tested on the same dataset used in [13]. It achieved a classification accuracy of 81%, a sensitivity of 76% and a specificity of 100%; the accuracy and sensitivity are lower than those obtained in [13]. However, the algorithm was 17 times faster than that in [13].
In their efforts to address the same problem, the authors in [17] used additional features extracted from the Wavelet Packet Transformation (WPT) of the pre-processed Histo-pathological image along with the features obtained from the histogram and the co-occurrence matrix used in [16]. The paper introduced a PSO-SVM framework that enabled simultaneous feature selection and SVM parameter optimization. The evaluations were conducted on a real dataset different from that used in [13] and [16]. It included 79 Histo-pathological images. The system achieved a classification accuracy of 87.13%, a sensitivity of 94.1% and a specificity of 80.22%.
In this study, we extend our work in [17] and provide a full explanation of the novel PSO-SVM melanoma detection strategy introduced there. It is based on a hybrid Particle Swarm Optimization-Support Vector Machine (PSO-SVM) framework that performs image feature selection and SVM parameter optimization simultaneously. We include detailed and fair comparisons with the work in [13] using the same dataset. We also benchmark the results achieved by the proposed system against those achieved by GPs and dermatologists. The evaluations are conducted on a real dataset obtained from the Southern Pathology Laboratory in Wollongong NSW, Australia. It includes 79 Histo-pathological images.
The paper is organized as follows: Section 2 explains the proposed system and the stages followed to reach the decision. The evaluations of our proposed system are presented in Section 3, followed in Section 4 by conclusions and directions for future work.

RESEARCH METHOD
The block diagram in Figure 1 describes the proposed skin cancer diagnostic system. It consists of the following stages.

Pre-processing stage
Pre-processing is meant to facilitate image segmentation by filtering the noise from the image and enhancing its important features [18]. In this work, a Wiener filter is used as an optimal way of accounting for the noisy components, giving the best reconstruction of the original image. It essentially acts as a low-pass filter that reduces constant-power additive Gaussian noise. In addition, a Gabor filter, which is well suited for texture segmentation problems [19][20][21], is applied to the unsegmented Histo-pathological images, which require cell and texture property analysis, to improve the segmentation process. Finally, median filtering, which is a nonlinear operation, is used to reduce noise while preserving edges [22,23].
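As an illustration of the last filtering step, the following is a minimal sketch of a median filter on a grayscale image stored as a list of rows. The 3x3 window and the clamped border handling are assumptions made for the example; the text does not state the settings actually used.

```python
from statistics import median


def median_filter(image, size=3):
    """Replace each pixel by the median of its size x size neighbourhood.

    `image` is a list of rows of grey levels. Pixels outside the image are
    taken from the nearest border pixel (clamping), an assumed convention.
    """
    rows, cols = len(image), len(image[0])
    half = size // 2
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [
                image[min(max(r + dr, 0), rows - 1)][min(max(c + dc, 0), cols - 1)]
                for dr in range(-half, half + 1)
                for dc in range(-half, half + 1)
            ]
            out_row.append(median(window))
        out.append(out_row)
    return out
```

On a small test image, an isolated bright pixel is removed while a sharp vertical step is left unchanged, which is the noise-reducing, edge-preserving behaviour referred to above.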

Image enhancement
Image enhancement methods aim to improve the contrast and visibility of image features, which depend on the imaging modality as well as the anatomical regions [19,23,24]. In this work, Histogram Equalization is used for image enhancement. It adjusts the histogram of image intensities to enhance the image contrast. We use it because it allows areas of lower local contrast to gain a higher contrast, which in turn makes some important textural properties in Histo-pathological images more visible and improves the feature extraction process.
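A minimal sketch of global histogram equalization is given below. It assumes an 8-bit grayscale image stored as a list of rows and uses the common mapping through the normalised cumulative histogram; the exact variant used in this work is not specified, so this is illustrative only.

```python
def equalize_histogram(image, levels=256):
    """Spread the grey levels of `image` (rows of ints in [0, levels-1])
    over the full intensity range."""
    pixels = [p for row in image for p in row]
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution function (CDF) of the grey levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = min(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:  # constant image: nothing to stretch
        return [row[:] for row in image]

    # Look-up table: each grey level is mapped through the normalised CDF.
    lut = [max(0, round((c - cdf_min) * (levels - 1) / (total - cdf_min)))
           for c in cdf]
    return [[lut[p] for p in row] for row in image]
```

For a low-contrast patch whose grey levels are 100, 101 and 102, the darkest level is mapped to 0 and the brightest to 255, so neighbouring intensities become clearly separated.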

Image segmentation
In this study, Sobel operator based Edge Detection is used for segmenting the images. The Sobel operator is an example of the gradient edge detection methods. It is a discrete differentiation operator that computes an approximation of the gradient of the image intensity function [25,26].
The gradient magnitude and directional information are found as follows. The image is convolved with the Sobel horizontal and vertical masks (operators) to give the horizontal and vertical derivative (gradient) approximations Gx and Gy, respectively. The gradient magnitude M is taken as the sum of the absolute values of the horizontal and vertical gradient approximations, M = |Gx| + |Gy| [27,28]. A point with a high value of M appears as an edge point in the resulting image [25].
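The computation described above can be sketched as follows. The masks are the standard 3x3 Sobel masks; the threshold that decides what counts as a "high" value of M is a hypothetical parameter, since no value is given in the text.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]    # horizontal derivative mask
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]    # vertical derivative mask


def sobel_magnitude(image):
    """Return M = |Gx| + |Gy| for every interior pixel (borders are left at 0).

    The masks are applied without flipping; flipping a Sobel mask only changes
    the sign of the response, which the absolute value removes.
    """
    rows, cols = len(image), len(image[0])
    mag = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = gy = 0
            for i in range(3):
                for j in range(3):
                    p = image[r + i - 1][c + j - 1]
                    gx += SOBEL_X[i][j] * p
                    gy += SOBEL_Y[i][j] * p
            mag[r][c] = abs(gx) + abs(gy)
    return mag


def edge_map(image, threshold):
    """Mark pixels whose gradient magnitude exceeds `threshold` (assumed value)."""
    return [[1 if m > threshold else 0 for m in row]
            for row in sobel_magnitude(image)]
```

For an image with a vertical intensity step, only the pixels on either side of the step receive a non-zero magnitude, so they are the ones marked as edge points.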

Feature extraction
Similar to [16], for each Histo-pathological image, six features are extracted from the grayscale image histogram (Mean, Variance, Skewness, Kurtosis, Energy, and Entropy) and twenty-one features are extracted from the co-occurrence matrix. In this paper, we use the Wavelet Packet Transform (WPT) of the image to provide a set of additional features. WPT is a generalized version of the Wavelet Transform in which the high-frequency part is also split into low- and high-frequency parts, and so on [29]. This produces a decomposition tree. We work down to 7 decomposition levels of WPT, resulting in 255 components. The features are generated by taking the energy of the wavelet coefficients in each subband [30], as seen in (1). This results in 255 features. Therefore, our full feature vector includes 282 features.
$$E_l = \frac{1}{N_l} \sum_{k=1}^{N_l} \left| W_x(l,k) \right|^2 \qquad (1)$$
where $W_x$ is the WPT of signal $x$, $l$ is the subband frequency index, $k$ indexes the coefficients within a subband, and $N_l$ is the number of wavelet coefficients in the $l$th subband. WPT provides a high-dimensional feature vector and thus more information about the images [30]. The variability in the texture of a Histo-pathological image appears to be what most separates malignant melanoma from benign nevi; therefore, the best approach at the feature extraction level is to retain as much of the data variability as possible [31]. This is achieved by WPT. Wavelet packet analysis looks at these changes over different scales, which should describe whole-lesion properties such as texture and color as well as local changes such as granularity.
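To make the count of 255 subband features concrete, the sketch below builds a full wavelet packet tree on a one-dimensional signal and returns one energy value per node, root included. The Haar wavelet, the one-dimensional input and the mean-square normalisation are all assumptions made for illustration; the text does not name the wavelet used.

```python
import math


def haar_split(signal):
    """One Haar analysis step: return the (approximation, detail) halves."""
    s = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def wpt_energy_features(signal, levels):
    """Mean energy of the coefficients in every node of a full wavelet packet
    tree with `levels` decomposition levels (the root counts as level 0)."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    features = []
    current = [list(signal)]
    for depth in range(levels + 1):
        for node in current:
            features.append(sum(c * c for c in node) / len(node))
        if depth < levels:
            # Unlike the plain wavelet transform, both halves are split again.
            current = [half for node in current for half in haar_split(node)]
    return features
```

With 7 levels the tree has 1 + 2 + 4 + ... + 128 = 255 nodes, which matches the number of WPT features quoted in the text.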

Feature selection
Since the number of features extracted from the image is high, it is important to use an effective feature selection method that chooses the most relevant features and thereby improves the performance of the classifier. Some of the popular feature selection methods are Sequential Forward Selection (SFS) [16], Sequential Backward Selection (SBS) [32], Genetic algorithms [33] and Particle Swarm Optimization (PSO). In addition to feature selection [34,35], PSO has been used for the optimization of SVM parameters [36], radial basis function extreme learning machine parameters [37], and for both feature selection and parameter optimization [38][39][40][41][42].
PSO is a population-based stochastic optimization technique that mimics the movement of swarms and is inspired by the social behavior of birds or fish. It works with a population (called a swarm) of candidate solutions (called particles). Each particle $i$ is moved around in the search space according to (2) and (3), guided by its own best known position $p_{best,i}$, the entire swarm's best known position $g_{best}$, and its own velocity $v_i$ (bounded by a maximum value $v_{max}$). The process is iterative: in each iteration, the velocities, the personal best positions and the global best position are updated. Therefore, when improved positions are discovered, they come to guide the movements of the swarm.
$$v_i^{k+1} = v_i^{k} + c_1 r_1 \left( p_{best,i}^{k} - x_i^{k} \right) + c_2 r_2 \left( g_{best}^{k} - x_i^{k} \right) \qquad (2)$$
$$x_i^{k+1} = x_i^{k} + v_i^{k+1} \qquad (3)$$
where $k$ is the current generation (iteration), $x_i^{k}$ and $v_i^{k}$ are the position and velocity of particle $i$, $c_1$ and $c_2$ are the personal and social learning factors, taken here as positive constants, and $r_1$ and $r_2$ are random numbers from the interval [0,1].
In [16], features were selected using the SFS method. The use of SFS resulted in a considerably improved classification rate. In this study, we use PSO to select the most relevant features and to optimize the parameters of the classifier (SVM) through a PSO-SVM algorithm used as the learning scheme for the classifier [42]. The steps of the proposed system are discussed below. They are also summarized in the flowchart of Figure 2.

PSO-SVM system for optimal selection of features and SVM parameters:

1) Initialization
Each particle is defined as an array with two cells for the SVM parameters (C and γ) and 282 cells, one for each of the 282 features. The feature cells contain weights in the range 0 to 1 which reflect the significance of the corresponding features. In the first population, the first particle is initialized using SFS while the other particles are initialized with random numbers ranging from 0 to 1.

2) Selection of features in a particle
In each iteration, the features whose weights exceed a specified threshold (here 0.5) are chosen. All the particles are then sent to the SVM, which calculates their accuracy based on the chosen features.

3) Fitness function
To compute the fitness (accuracy) corresponding to each particle, an SVM is loaded with the selected features of the particle together with the C and γ parameters. Cross validation is used to generate training and validation sets. The SVM is trained on the training set. Its performance is then verified on the validation set, and the resulting accuracy is taken as the fitness of that particle. This is done for every particle.
After evaluating the accuracy of the particles, the best accuracy in a population is taken as $g_{best}$ (the global best), and the best accuracy in the history of each particle is the $p_{best}$ (personal best) corresponding to that particle. The particles in the next populations are generated according to Equations (2) and (3). Clearly, in the first population, $g_{best}$ coincides with the $p_{best}$ of the best particle. The process is stopped once the maximum number of iterations is reached. The particle corresponding to $g_{best}$ then contains the best features and the best SVM parameters. In this way, the SVM parameters (C and γ) are optimized and the most relevant features are selected simultaneously.
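The three steps above can be sketched as a small PSO loop. This is an illustrative outline under several assumptions, not the implementation used in this work: the update is the standard velocity and position rule with learning factors `c1` and `c2` and no inertia term, all cells (including the two SVM cells) are kept in [0, 1] and would have to be rescaled to real C and γ ranges, every particle is initialized randomly (the SFS-based initialization of the first particle is omitted), and the cross-validated SVM accuracy is replaced by a hypothetical `toy_fitness` so that the example runs without an SVM library. All parameter values are placeholders.

```python
import random

THRESHOLD = 0.5  # feature-weight threshold quoted in the text


def decode(particle):
    """Split a particle into (C cell, gamma cell, indices of selected features)."""
    c_cell, gamma_cell = particle[0], particle[1]
    selected = [i for i, w in enumerate(particle[2:]) if w > THRESHOLD]
    return c_cell, gamma_cell, selected


def toy_fitness(c_cell, gamma_cell, selected):
    """Hypothetical stand-in for cross-validated SVM accuracy:
    rewards features 0-2 and mildly penalises every other selected feature."""
    useful = {0, 1, 2}
    chosen = set(selected)
    return len(useful & chosen) - 0.1 * len(chosen - useful)


def pso(fitness, n_features, n_particles=10, n_iter=30,
        c1=2.0, c2=2.0, v_max=0.2, seed=0):
    rng = random.Random(seed)
    dim = n_features + 2  # two SVM cells plus one weight per feature
    positions = [[rng.random() for _ in range(dim)] for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in positions]
    pbest_fit = [fitness(*decode(p)) for p in positions]
    best = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[best][:], pbest_fit[best]
    history = [gbest_fit]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v = (velocities[i][d]
                     + c1 * r1 * (pbest[i][d] - positions[i][d])
                     + c2 * r2 * (gbest[d] - positions[i][d]))
                velocities[i][d] = max(-v_max, min(v_max, v))  # bound by v_max
                positions[i][d] = min(1.0, max(0.0, positions[i][d] + velocities[i][d]))
            fit = fitness(*decode(positions[i]))
            if fit > pbest_fit[i]:  # update the personal best
                pbest[i], pbest_fit[i] = positions[i][:], fit
                if fit > gbest_fit:  # update the global best
                    gbest, gbest_fit = positions[i][:], fit
        history.append(gbest_fit)
    return gbest, gbest_fit, history
```

Calling `pso(toy_fitness, n_features=6)` returns the best particle, its fitness and the best fitness after each iteration; `decode` applied to the best particle gives the selected feature indices. In a real setting the fitness callable would train and validate an SVM on the selected features.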

Classification
The classifier used in this work is the SVM classifier that is widely used due to its high classification accuracy and ability to deal with high-dimensional data [32,43]. After the optimal SVM parameters have been selected in the previous step, SVM is then used to classify the histopathology images as malignant or benign.
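For reference, the role of the γ parameter can be seen in the Radial Basis Function kernel used by the SVMs in the experiments. The sketch below shows the kernel and the general form of a kernel SVM decision function; the support vectors, coefficients and bias in any call are hypothetical placeholders, since in practice they come from training (where C controls the penalty on training errors).

```python
import math


def rbf_kernel(x, y, gamma):
    """K(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def svm_decision(x, support_vectors, dual_coefs, bias, gamma):
    """f(x) = sum_i coef_i * K(s_i, x) + bias; the sign of f(x) gives the class.

    `dual_coefs` holds the signed coefficients (alpha_i * y_i) found in training.
    """
    return sum(coef * rbf_kernel(s, x, gamma)
               for s, coef in zip(support_vectors, dual_coefs)) + bias
```

A larger γ makes the kernel decay faster with distance, so each support vector influences a smaller neighbourhood of the feature space.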

RESULTS AND DISCUSSION
The ability of the introduced system to decide whether a skin lesion Histo-pathological image is benign or malignant is evaluated in this section using a dataset obtained from the Southern Pathology Laboratory in Wollongong NSW, Australia. It includes 79 Histo-pathological images (29 benign images and 50 melanoma images).
Two experiments were conducted. They differ in the feature selection method employed and in the way the optimal SVM parameters (C and γ) were picked. The classifiers used in all of the experiments were SVMs with a Radial Basis Function kernel [44], implemented using the LIBSVM toolbox for MATLAB [45]. All the algorithms were implemented in MATLAB R2013b. 60% of the images were used for training, 20% for validation and the remaining 20% for testing.
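Following the image class labeling of [46], the images that were confirmed by the pathologist as melanoma images were considered as negative class images, while the nevus ones were treated as positive. The sensitivity, specificity and accuracy are computed as
$$\text{Sensitivity} = \frac{TP}{TP + FN} \qquad (4)$$
$$\text{Specificity} = \frac{TN}{TN + FP} \qquad (5)$$
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (6)$$
where TP is the number of true positives, TN is the number of true negatives, FN is the number of false negatives, and FP is the number of false positives.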
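The three performance measures used in the evaluation follow their standard definitions in terms of TP, TN, FP and FN, and can be computed as in the sketch below. The counts used in any call are hypothetical and are not taken from the experiments.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return (sensitivity, specificity, accuracy) from confusion counts."""
    sensitivity = tp / (tp + fn)                 # fraction of positives found
    specificity = tn / (tn + fp)                 # fraction of negatives found
    accuracy = (tp + tn) / (tp + tn + fp + fn)   # fraction of all cases correct
    return sensitivity, specificity, accuracy
```

For example, with the made-up counts TP = 8, TN = 9, FP = 1 and FN = 2, the function gives a sensitivity of 0.8, a specificity of 0.9 and an accuracy of 0.85.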
In the first experiment, SFS was used to choose the most relevant features that result in the best performance of the SVM. 5-fold cross validation was used to pick the best RBF kernel parameters (C=10 and γ=0.125). The results of experiment 1 are shown in Table 1. The method is denoted in Table 1 by WPT-SFS-SVM, indicating that it uses WPT features, the SFS feature selection method and an SVM classifier.
In the second experiment, the PSO-SVM arrangement was used to choose the most relevant features and the RBF kernel parameters that optimize the performance of the SVM. 5-fold cross validation was used within the PSO-SVM arrangement. The resulting RBF kernel parameters are C=3525.0051 and γ=0.0084732. The results of experiment 2 are shown in Table 2. The method is denoted in Table 2 by WPT-PSO-SVM.
Comparing the results in Table 1 with those in Table 2, it can be seen that the performance of the PSO-SVM arrangement for feature selection and SVM parameter optimization is considerably better than that obtained when just using SFS for feature selection. An improvement of 10% can be noticed in the values of the sensitivity, the specificity and the accuracy. This highlights the success of the WPT-PSO-SVM system.
The presented WPT-PSO-SVM system resulted in a sensitivity of 94.1%, a specificity of 80.2% and an accuracy of 87.1%. The obtained sensitivity and specificity results are comparable to those obtained by dermatologists and considerably better than those obtained by less trained doctors, as seen in Table 3 (quoted from [46]). Consequently, the proposed system can be considered a promising method to be used by pathologists for skin cancer diagnosis.
For the sake of a fair comparison between the WPT-PSO-SVM framework and the method of [13] (denoted here as SFS-SVM), which showed impressive results on a smaller dataset (42 images), we have tested the SFS-SVM method on the same dataset used here (79 images), as reported in Table 4. The SFS-SVM method used in [13] implemented SFS on features obtained from the histogram and the co-occurrence matrix only. Comparing the results of Table 1 with those of Table 4 shows that, although both methods use the same feature selection method (SFS), the use of additional features extracted from the Wavelet Packet Transformation (WPT) in WPT-SFS-SVM has resulted in better recognition accuracy, sensitivity and specificity than SFS-SVM. On the other hand, the results displayed in Table 2, which uses WPT-PSO-SVM, are clearly much better than those in Table 4 (SFS-SVM) in terms of accuracy, sensitivity and specificity, which emphasizes the advantages obtained from using the PSO-SVM framework for feature selection and SVM parameter optimization.
Figure 3 shows the average testing accuracy over 1000 runs of the three methods in the form of a bar graph with the confidence intervals indicated. It can be seen that, in addition to the improved accuracy achieved by WPT-PSO-SVM, it provides more stable accuracy performance, as it has a smaller error range than WPT-SFS-SVM and SFS-SVM.
In general it can be said that, for the same average accuracy, a classification system with narrower confidence interval can be considered as more reliable as its results will be closer to the expected accuracy.
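The point about interval width can be illustrated with a short sketch that computes the mean accuracy over repeated runs together with a confidence interval for that mean. The normal-approximation interval (mean ± 1.96 standard errors for a 95% level) is an assumption; the text does not state how the intervals in Figure 3 were obtained, and the accuracy values used in any call are made up.

```python
import math
from statistics import mean, stdev


def mean_confidence_interval(values, z=1.96):
    """Return (mean, lower, upper) using a normal approximation.

    z = 1.96 corresponds to an approximate 95% confidence level.
    """
    m = mean(values)
    half_width = z * stdev(values) / math.sqrt(len(values))
    return m, m - half_width, m + half_width
```

Two hypothetical methods with accuracies [0.86, 0.87, 0.88, 0.87] and [0.77, 0.97, 0.82, 0.92] have the same mean of 0.87, but the second has a much wider interval, so a single run of it is less likely to land near its average.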

CONCLUSION
This paper has proposed an automated system for skin cancer (melanoma) detection from Histopathological images sampled from microscopic slides of skin biopsy. It is a hybrid system based on Particle Swarm Optimization and Support Vector Machine (PSO-SVM). The features used are extracted from the grayscale image histogram, the co-occurrence matrix and the energy of the wavelet coefficients resulting from the wavelet packet decomposition of the image. The PSO-SVM system selects the best feature set and the best values for the SVM parameters (C and γ) that optimize the performance of the SVM classifier.
Evaluations have been made on a dataset obtained from the Southern Pathology Laboratory in Wollongong NSW, Australia. It includes 79 Histo-pathological images (29 benign images and 50 melanoma images). The recognition accuracy obtained by the PSO-SVM system is 87.1%, whereas the sensitivity and specificity are 94.1% and 80.2%, respectively. The obtained results show that the PSO-SVM system outperforms the other systems evaluated here. The sensitivity and specificity results are comparable to those obtained by dermatologists and experts.
The proposed system, as a result, can be thought of as an encouraging step towards the automation and early detection of skin cancer. After further improvements, it can serve medical practitioners as a second opinion in the skin cancer diagnosis process. However, many more Histopathological skin images should be obtained from hospitals for training and testing in order to produce a more reliable system. In future, we intend to widen our database of carefully labelled Histo-pathological skin images and to explore other feature selection methods. We also intend to test various classification techniques, such as Neuro-Fuzzy algorithms, to improve the classification accuracy.