Automated segmentation and classification technique for brain stroke

Received Sep 2, 2018 Revised Dec 16, 2018 Accepted Jan 26, 2019
Diffusion-weighted imaging (DWI) plays an important role in the diagnosis of brain stroke by providing detailed information on soft tissue contrast in the brain. Conventionally, the differential diagnosis of brain stroke lesions is performed manually by professional neuroradiologists in a highly subjective and time-consuming process. This study proposes a segmentation and classification technique to detect brain stroke lesions based on DWI. The types of stroke lesion considered are acute ischemic, sub-acute ischemic, chronic ischemic and acute hemorrhage. For segmentation, fuzzy c-means (FCM) combined with active contour is proposed to segment the lesion region. FCM is implemented with active contour to separate the cerebrospinal fluid (CSF) from the hypointense lesion. Pre-processing is applied to the DWI for image normalization, background removal and image enhancement. The algorithm performance has been evaluated using the Jaccard index, Dice coefficient (DC), false positive rate (FPR) and false negative rate (FNR). The average results for the Jaccard index, DC, FPR and FNR are 0.55, 0.68, 0.23 and 0.23, respectively. A first order statistical method is applied to the segmentation result to obtain the features for the classifier input. For classification, a bagged tree classifier is proposed to classify the type of stroke. The classification accuracy is 90.8%. Based on the results, the proposed technique has the potential to segment and classify brain stroke lesions from DWI images.


INTRODUCTION
Nowadays, medical imaging plays an important role in viewing the internal tissue of the human brain. In brain stroke diagnosis, medical imaging such as MRI offers a fast imaging tool with high resolution and non-ionizing radiation [1]. The MRI imaging tool is able to detect ischemic stroke in 85% of stroke survivors and hemorrhagic stroke in 15-20% of stroke survivors [2]. DWI is one of the MRI modalities that shows high sensitivity (88-100%) and specificity (95-100%) in detecting early ischemic stroke [3].
To fully utilize medical imaging, new algorithms and methods are needed to make good use of very large numbers of images [4]. Because of the large number of images, the diagnosis process is time-consuming and tiring [5]. Computer-aided diagnosis is needed to help neuroradiologists interpret the images more easily. Many researchers have studied machine learning techniques in brain imaging [6]. Machine learning is believed to perform tasks and solve problems related to poor quality brain imaging samples by providing accurate representation and prior knowledge modeling. Thus, it plays an important role in brain imaging, where it has become one of the most promising research areas in computer-aided detection and diagnosis.

Zhang Y. et al. [7] proposed a kernel support vector machine to detect Alzheimer's disease (AD) and to distinguish AD patients from normal elderly controls (NC). Ramani R.G. et al. [8] analyzed Naive Bayes, support vector machine and random tree techniques for classifying normal and abnormal brain images. Mudali D. et al. [9] proposed a decision tree method to interpret the diagnosis of neurodegenerative brain diseases. Meet Oza et al. [10] proposed a random forest classifier to classify brain tumors into benign or malignant cases. Fratello M. et al. [11] proposed random forest to classify two types of disease using amyotrophic lateral sclerosis (ALS) patients, Parkinson's disease (PD) patients and healthy control (HC) subjects.
In this paper, a segmentation and classification technique for brain stroke using DWI images is proposed. The purpose of this study is to develop automatic stroke segmentation and classification using DWI images. The proposed segmentation analysis is based on the fuzzy c-means (FCM) and active contour techniques. Active contour is integrated in the analysis framework to separate the cerebrospinal fluid (CSF) from the hypointense lesion and to increase the segmentation accuracy. The performance of the segmentation technique is evaluated based on the Jaccard index, Dice coefficient (DC), false positive rate (FPR) and false negative rate (FNR). After the segmentation result is obtained, features are extracted from the result image using a first order statistical method. These features are used as the input of the bagged tree classifier. The performance of the classifier is evaluated based on the accuracy (ACC).
This paper contains five sections. Section 2 discusses the flow of the proposed methodology in detail. Section 3 presents the brain stroke analysis technique used to segment and classify the brain stroke lesion. Section 4 discusses the segmentation results and Section 5 concludes the work presented.

RESEARCH METHOD
Proposed analysis framework
The flow of this study starts with image pre-processing, where image normalization, image enhancement and background removal are applied. After that, FCM is applied in the image segmentation stage to extract the ROI of the stroke lesion. Then, the active contour is used to separate the CSF from the hypointense stroke lesion region. Hyperintense refers to bright lesions such as acute ischemic stroke, acute hemorrhage stroke and sub-acute ischemic stroke lesions, whereas hypointense refers to dark lesions such as chronic ischemic stroke lesions. Features are extracted from the segmentation result using a first order statistical method and used as the classifier input. After the features are obtained, the bagged tree classifier is used to classify the type of stroke lesion. Finally, the performance of the technique is evaluated based on the Jaccard index, DC, FPR, FNR and ACC.

Imaging parameter
This paper focuses on two brain stroke datasets of DWI images. Dataset 1 contains 61 samples from the General Hospital of Kuala Lumpur (HKL). Dataset 2 contains 69 samples from the online database of the Ischemic Stroke Lesion Segmentation (ISLES) challenge. The datasets cover four types of brain stroke lesion: acute ischemic stroke, chronic ischemic stroke, acute hemorrhage stroke and sub-acute ischemic stroke. The acute ischemic stroke, chronic ischemic stroke and acute hemorrhage stroke images were obtained from HKL. The sub-acute ischemic stroke images were obtained from the public online ISLES challenge 2015 [12].

Image pre-processing
In the pre-processing stage, the images are prepared in order to achieve better segmentation [13]-[15]. Three algorithms are applied to the DWI image: image normalization, background removal and image enhancement [16]. The images are converted into the desired form, in which the intensity is adjusted and the noise is removed.
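The three pre-processing steps can be illustrated with a minimal sketch. The text does not state which algorithms were used, so the min-max normalization, the fixed background threshold (`threshold=0.1`) and the gamma correction (`gamma=0.8`) below are all assumptions chosen for illustration, and the function names are hypothetical. Plain Python lists are used to keep the sketch self-contained.

```python
def normalize(image):
    """Min-max scale a 2-D list of intensities to the range [0, 1]."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero on a constant image
    return [[(p - lo) / span for p in row] for row in image]


def remove_background(image, threshold=0.1):
    """Zero out pixels at or below an ASSUMED background threshold."""
    return [[p if p > threshold else 0.0 for p in row] for row in image]


def enhance(image, gamma=0.8):
    """Gamma correction (ASSUMED enhancement); gamma < 1 brightens mid-tones."""
    return [[p ** gamma for p in row] for row in image]


def preprocess(image):
    """Normalization, then background removal, then enhancement."""
    return enhance(remove_background(normalize(image)))
```

The order of the steps here follows the order in which they are listed in the text; a different order may be equally valid.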

Fuzzy c-means
FCM is a clustering method that divides a group of pixels into homogeneous clusters and assigns each pixel according to its category. It allows a pixel to be assigned to multiple clusters, with a degree of membership in each cluster to which it belongs. In this segmentation technique, pixels are assigned based on the centers of three clusters: the lower, middle and higher clusters. For each data point, the degrees of membership over all clusters should sum to one. FCM is based on minimization of the following objective function:
$$J_m = \sum_{i=1}^{N} \sum_{j=1}^{C} u_{ij}^{m} \, \lVert x_i - c_j \rVert^2, \qquad 1 < m < \infty$$
where $N$ is the number of data points, $C$ is the number of clusters, $m$ is any real number greater than 1, $u_{ij}$ is the degree of membership of $x_i$ in the cluster $j$, $x_i$ is the $i$-th of the $d$-dimensional measured data, $c_j$ is the $d$-dimensional center of the cluster, and $\lVert \cdot \rVert$ is any norm expressing the similarity between any measured data and the center. Fuzzy partitioning is carried out through an iterative optimization of the objective function shown above, with the update of the membership $u_{ij}$ and the cluster centers $c_j$ by:
$$u_{ij} = \frac{1}{\sum_{k=1}^{C} \left( \frac{\lVert x_i - c_j \rVert}{\lVert x_i - c_k \rVert} \right)^{\frac{2}{m-1}}}, \qquad c_j = \frac{\sum_{i=1}^{N} u_{ij}^{m} \, x_i}{\sum_{i=1}^{N} u_{ij}^{m}}$$
The iteration stops when
$$\max_{ij} \left| u_{ij}^{(k+1)} - u_{ij}^{(k)} \right| < \varepsilon$$
where $\varepsilon$ is a termination criterion between 0 and 1 and $k$ is the iteration step.
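The FCM iteration described above can be sketched for one-dimensional pixel intensities as follows. This is an illustrative sketch, not the authors' implementation: the initialization of the centers (evenly spaced between the minimum and maximum intensity, to mimic lower, middle and higher clusters), the default `m = 2.0` and the function names are assumptions.

```python
def memberships(values, centers, m):
    """Membership update: u_ij is proportional to d_ij^(-2/(m-1)), rows sum to 1."""
    table = []
    for x in values:
        dists = [abs(x - c) for c in centers]
        if min(dists) == 0.0:
            # The point coincides with a centre: use a crisp membership.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(hits)
            table.append([h / total for h in hits])
        else:
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(inv)
            table.append([v / total for v in inv])
    return table


def fuzzy_c_means(values, n_clusters=3, m=2.0, eps=1e-5, max_iter=100):
    """Cluster 1-D intensities; returns (centers, membership table)."""
    lo, hi = min(values), max(values)
    # ASSUMPTION: centres start evenly spaced (lower, middle, higher cluster).
    centers = [lo + (hi - lo) * j / (n_clusters - 1) for j in range(n_clusters)]
    u = memberships(values, centers, m)
    for _ in range(max_iter):
        # Centre update: mean of the data weighted by u_ij^m.
        for j in range(n_clusters):
            weights = [row[j] ** m for row in u]
            total = sum(weights)
            if total > 0:
                centers[j] = sum(w * x for w, x in zip(weights, values)) / total
        new_u = memberships(values, centers, m)
        # Stop when the largest membership change falls below eps.
        change = max(abs(a - b)
                     for old, new in zip(u, new_u)
                     for a, b in zip(old, new))
        u = new_u
        if change < eps:
            break
    return centers, u
```

A pixel can then be labeled with the cluster in which its membership is largest, which is one common way to turn the fuzzy partition into a segmentation mask.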

Active contour
In DWI images of chronic stroke lesions, the cerebrospinal fluid (CSF) shares a similar intensity level with the stroke lesion. As a result, the FCM method fails to segment the lesion because the clustering cannot differentiate the two regions. To improve the performance, the CSF area is removed using the active contour method.
Active contour is a method that creates boundaries in an image. It uses computer-generated curves to detect and locate objects. This method is often used on medical images to find the boundaries of an organ. The image is divided into two parts, the object region and the background region. In this method, the regions are represented as the inside and outside regions of the zero-level set.
The level set framework uses the positive and negative signs of the level set function to divide the image domain into two disjoint regions, $\Omega_1$ and $\Omega_2$. The local intensity clustering property means that the intensities in the neighborhood $O_y$ can be divided into $N$ clusters, with centers $m_i$, $i = 1, 2, \ldots, N$. It can be written as:
$$F_y = \sum_{i=1}^{N} \int_{O_y} \lvert I(x) - m_i \rvert^2 \, u_i(x) \, dx$$
where $I$ is the input image, $m_i$ is the cluster center of the $i$-th cluster and $u_i$ is the membership function of the region $\Omega_i$, i.e. $u_i(x) = 1$ for $x \in \Omega_i$ and $u_i(x) = 0$ for $x \notin \Omega_i$. The corresponding contour energy can be written as:
$$E_{snake} = \int_0^1 \left[ E_{int}(v(s)) + E_{ext}(v(s)) \right] ds$$
where the position of the active contour is described parametrically by $v(s) = (x(s), y(s))$, $E_{int}$ represents the internal potential energy of the contour and $E_{ext}$ is the energy that models the external constraints imposed on the contour shape. $G(y - x)$ is the Gaussian kernel applied as a window function:
$$G(y - x) = \begin{cases} \frac{1}{a} \exp\left( -\frac{d^2}{2\sigma^2} \right), & d \le \rho \\ 0, & \text{otherwise} \end{cases}$$
where $a$ is a constant, $d$ is the distance between the points $x$ and $y$, $\sigma$ is the standard deviation or scale parameter of the Gaussian function, and $\rho$ is the radius of the neighboring pixels. The radius of the neighborhood should be chosen appropriately based on the degree of intensity inhomogeneity. The next step is contour construction, which defines an initial shape around the object to serve as the initialization. Finally, the greedy method is applied to simplify the minimization of the energy without having to run an optimization algorithm such as gradient descent [17]-[21]. The method moves each point of the contour to the neighboring position with the lowest local energy, on the premise that this converges to the overall minimum of the contour energy. The method uses two equations:
$$E_{int} = \frac{1}{2} \left( \alpha(s) \lvert v'(s) \rvert^2 + \beta(s) \lvert v''(s) \rvert^2 \right)$$
where $\alpha(s)$ adjusts the elasticity of the active contour and $\beta(s)$ adjusts the stiffness of the active contour, and
$$E_{ext} = -\gamma \, \lvert \nabla [G_n * I](v(s)) \rvert^2$$
where $\gamma$ is a real, positive weighting value, $G_n$ is a Gaussian weighted kernel of dimension $n$, $I$ represents the input image, $\nabla$ is the spatial gradient function and $v(s)$ is the contour function.
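One pass of the greedy minimization can be sketched as below. This is a simplified illustration under stated assumptions, not the implementation used in the study: the contour is a closed list of (row, column) pixel positions, the search window is the 3x3 neighborhood of each point, `grad_mag` is assumed to be a precomputed gradient magnitude of the Gaussian-smoothed image, and the derivatives in the internal energy are replaced by finite differences between neighboring contour points. The function name and the weights `alpha`, `beta` and `gamma` are hypothetical.

```python
def greedy_snake_step(contour, grad_mag, alpha=1.0, beta=1.0, gamma=1.0):
    """Move each contour point to the lowest-energy pixel in its 3x3 window."""
    rows, cols = len(grad_mag), len(grad_mag[0])
    pts = list(contour)
    n = len(pts)
    for i in range(n):
        prev_pt, next_pt = pts[i - 1], pts[(i + 1) % n]  # closed contour
        r0, c0 = pts[i]
        best, best_energy = pts[i], None
        # The current position (0, 0) is visited first so that ties keep it.
        for dr in (0, -1, 1):
            for dc in (0, -1, 1):
                r, c = r0 + dr, c0 + dc
                if not (0 <= r < rows and 0 <= c < cols):
                    continue
                # Elasticity: squared distances to both neighbours.
                elastic = ((r - prev_pt[0]) ** 2 + (c - prev_pt[1]) ** 2
                           + (r - next_pt[0]) ** 2 + (c - next_pt[1]) ** 2)
                # Stiffness: squared second difference along the contour.
                bend = ((prev_pt[0] - 2 * r + next_pt[0]) ** 2
                        + (prev_pt[1] - 2 * c + next_pt[1]) ** 2)
                # External term: strong edges lower the energy.
                energy = (alpha * elastic + beta * bend
                          - gamma * grad_mag[r][c] ** 2)
                if best_energy is None or energy < best_energy:
                    best, best_energy = (r, c), energy
        pts[i] = best
    return pts
```

Calling the function repeatedly until no point moves gives the greedy iteration. On a flat image the internal terms alone contract the contour, while a large `gamma` pulls points onto nearby edges.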

First order statistical method
A first order statistical method is applied to the ROI to obtain several features. Since the ROI depends on the signal intensity, it is classified into one of two groups, hyperintense or hypointense lesion. The mean, median and mode are used to separate hyperintense from hypointense lesions. For hyperintense lesions, the standard deviation is used, while for hypointense lesions the mean of the region boundary is used. The standard deviation and the mean of the boundary are used to differentiate the characteristics of each stroke lesion.
The first order statistics are calculated as shown below:
$$\mu = \sum_{i=0}^{N-1} i \, P(i), \qquad \sigma = \sqrt{\sum_{i=0}^{N-1} (i - \mu)^2 \, P(i)}$$
$$\text{Pearson mode skewness} = \frac{\lvert \mu - \text{mode} \rvert}{\sigma}$$
where $\mu$ is the mean, $\sigma$ is the standard deviation, $P(i)$ is the probability of intensity level $i$ in the ROI and $N$ is the total number of gray levels in the entire ROI.
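The first order features can be computed directly from the gray levels inside the ROI, as in the sketch below. It assumes the standard histogram-based definitions of the mean and standard deviation and takes the Pearson mode skewness as the absolute difference between mean and mode divided by the standard deviation; the function name and the tie-breaking rule for the mode (smallest gray level wins) are assumptions.

```python
from collections import Counter
from math import sqrt
from statistics import median


def first_order_features(roi_pixels):
    """First order statistics of an iterable of integer gray levels."""
    pixels = list(roi_pixels)
    n = len(pixels)
    counts = Counter(pixels)
    # P(i): probability of gray level i inside the ROI.
    prob = {level: k / n for level, k in counts.items()}
    mean = sum(level * p for level, p in prob.items())
    std = sqrt(sum((level - mean) ** 2 * p for level, p in prob.items()))
    # ASSUMPTION: on ties the smallest gray level is taken as the mode.
    mode = max(counts, key=lambda level: (counts[level], -level))
    skew = abs(mean - mode) / std if std > 0 else 0.0
    return {
        "mean": mean,
        "median": median(pixels),
        "mode": mode,
        "std": std,
        "pearson_mode_skewness": skew,
    }
```

The resulting dictionary is one possible form for the feature vector passed to the classifier.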

Bagged tree classifier
A bagged tree classifier is an ensemble method that bases its decision on a collection of models rather than on an individual model. The classifier builds and trains a variety of decision trees and outputs the class that is the mode of the classes predicted by the individual trees. It applies the general technique of bootstrap aggregating, in which the training data are resampled for each learner. Assume a training set $X = x_1, \ldots, x_n$ with responses $Y = y_1, \ldots, y_n$. For $b = 1, \ldots, B$, where $B$ is the number of bagging repetitions, the training set is resampled and a tree is trained on the sample. Each tree observes its own resampled data and makes its own decision. The decisions of the individual trees are then compared and combined. For unseen data $x'$, the class is selected by taking the mode of the tree predictions.
In this experiment, the bagged tree classifier is trained with 129 as the number of branch node splits per tree. Figure 1 shows the bagged tree diagram.
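The bagging procedure can be illustrated with a small self-contained sketch. To keep it short, each base learner is a depth-1 decision tree (a stump) rather than a full decision tree, which is a simplification and not the configuration used in the study; the function names, `n_trees=25` and the fixed random seed are assumptions.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Depth-1 tree: the single feature/threshold with the fewest training errors."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = (Counter(right).most_common(1)[0][0]
                           if right else left_label)
            errors = (sum(lab != left_label for lab in left)
                      + sum(lab != right_label for lab in right))
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    _, f, t, left_label, right_label = best
    return lambda row: left_label if row[f] <= t else right_label


def fit_bagged(X, y, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap sample of (X, y)."""
    rng = random.Random(seed)
    n = len(X)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        trees.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return trees


def predict_bagged(trees, row):
    """Class of unseen data x': the mode (majority vote) of the tree outputs."""
    votes = Counter(tree(row) for tree in trees)
    return votes.most_common(1)[0][0]
```

Because every tree sees a different bootstrap sample, individual trees may disagree; the majority vote is what makes the ensemble more stable than any single tree.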

Performance evaluation
The segmentation performance is calculated by comparing the segmentation results with the manual reference provided by the neuroradiologist. The Jaccard index, Dice coefficient (DC), false positive rate (FPR) and false negative rate (FNR) are used as the performance metrics [16]. From this calculation, the accuracy of the DWI segmentation result can be assessed.
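The four segmentation metrics can be computed from a pair of binary masks, as sketched below. The formulas are not given in the text, so the definitions here are assumptions: the Jaccard index and Dice coefficient follow their standard set-based forms, while FPR and FNR are normalized by the union of the two masks, a convention under which Jaccard + FPR + FNR = 1. The function name is hypothetical.

```python
def segmentation_metrics(segmented, reference):
    """Compare two binary masks given as 2-D lists of 0/1 with equal shape."""
    seg = {(r, c) for r, row in enumerate(segmented)
           for c, v in enumerate(row) if v}
    ref = {(r, c) for r, row in enumerate(reference)
           for c, v in enumerate(row) if v}
    union = seg | ref
    if not union:
        raise ValueError("both masks are empty")
    overlap = len(seg & ref)
    return {
        "jaccard": overlap / len(union),
        "dice": 2 * overlap / (len(seg) + len(ref)),
        # ASSUMPTION: error rates are normalized by the union of the masks.
        "fpr": len(seg - ref) / len(union),  # over-segmented pixels
        "fnr": len(ref - seg) / len(union),  # missed lesion pixels
    }
```

Under this convention a high FPR indicates over-segmentation and a high FNR indicates under-segmentation relative to the manual reference.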
The performance verification for classification is carried out by statistical calculation and shown as confusion matrix attributes using MATLAB apps. The number of observations, true positive rate (TPR), false negative rate (FNR), positive predictive value (PPV) and false discovery rate (FDR) are used to identify the performance of the classifier. The statistical calculations for performance verification are given below.
$$TPR = \frac{TP}{P}, \qquad FNR = 1 - TPR$$
$$PPV = \frac{TP}{TP + FP} \tag{17}$$
$$FDR = 1 - PPV$$
where true positive (TP) is the number of samples that are correctly classified within their type of stroke in the positive sample (P). The positive predictive value (PPV), shown in Equation (17), is the probability that a sample assigned to a type of stroke is correctly classified. False positive (FP) is the number of samples misclassified into that type of stroke.
Lastly, the performance of the classifier is verified using the accuracy (ACC), as shown in Equation (18).
$$ACC = \frac{TP + TN}{TP + TN + FP + FN} \tag{18}$$
where false positive (FP) is the number of samples of other types that are misclassified into the positive class, true negative (TN) is the number of samples of other types that are correctly excluded from it, and false negative (FN) is the number of positive samples that are misclassified into another type. In the ROC curve, TPR is the number of outputs greater than or equal to the threshold divided by the number of one-targets, and FPR is the number of outputs less than the threshold divided by the number of zero-targets. The AUC summarizes the overall quality of the classifier. A larger AUC indicates better classifier performance.
$$AUC = P(X_1 > X_0)$$
where $X_1$ is the score of a sample of the type of stroke that is classified into its class and $X_0$ is the score of a sample in the opposite case.
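The per-class rates can be read off a confusion matrix as in the following sketch. The matrix layout (rows are true classes, columns are predicted classes) and the function name are assumptions, and the overall accuracy is computed here as the fraction of samples on the diagonal, which is the usual multi-class counterpart of the binary accuracy formula.

```python
def per_class_metrics(confusion, labels):
    """confusion[i][j]: number of samples of true class i predicted as class j."""
    total = sum(sum(row) for row in confusion)
    results = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                  # class k sent elsewhere
        fp = sum(row[k] for row in confusion) - tp   # other classes sent to k
        tpr = tp / (tp + fn) if tp + fn else 0.0
        ppv = tp / (tp + fp) if tp + fp else 0.0
        results[label] = {"TPR": tpr, "FNR": 1.0 - tpr,
                          "PPV": ppv, "FDR": 1.0 - ppv}
    # ASSUMPTION: overall accuracy = correctly classified samples / all samples.
    accuracy = sum(confusion[k][k] for k in range(len(labels))) / total
    return results, accuracy
```

For example, with a hypothetical two-class matrix `[[8, 2], [1, 9]]`, the first class has TPR = 8/10 and PPV = 8/9, and the overall accuracy is 17/20.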

RESULTS AND ANALYSIS
FCM and active contour segmentation technique
The segmentation results for the original stroke lesion images are shown in Figure 2. The results show that FCM can accurately segment the hyperintense lesions. The hyperintense lesions in the DWI images are segmented with higher accuracy than the hypointense lesions. Even though FCM gives low accuracy for the hypointense lesions, the active contour technique can be used to separate the hypointense lesion from the CSF. Table 2 shows the performance analysis and evaluation of the proposed FCM segmentation method for DWI images from dataset 1 and dataset 2. The performance of the algorithm is measured using the Jaccard index, DC, FPR and FNR.
The results show that the proposed method offers very good segmentation for sub-acute ischemic stroke, with high values of the Jaccard index and DC and low values of FPR and FNR. The hypointense lesions are segmented less accurately than the hyperintense lesions because of the way FCM produces the ROI image. The FPR results also show that the stroke lesions are over-segmented. However, the proposed segmentation can still segment the ROI of the brain stroke. The active contour has successfully separated the CSF area from the hypointense lesion.

Bagged tree classifier
The confusion matrix of the classification method is shown in Table 3, which gives the number of observations for each type of stroke lesion. Only the chronic ischemic stroke samples are all correctly classified into their group. Five samples of acute ischemic stroke are misclassified as sub-acute ischemic lesions. For acute hemorrhage stroke, only three samples are classified correctly; four other samples are misclassified as sub-acute ischemic and the others as acute ischemic. For sub-acute ischemic stroke, 67 samples are classified correctly within their group, one sample is misclassified as acute ischemic and one is misclassified as acute hemorrhage. The hyperintense lesions account for most of the misclassifications because the feature ranges of these classes overlap.
Table 4 shows the TPR, FNR, PPV and FDR values for each type of stroke. Chronic ischemic stroke shows the best performance, with higher values of TPR and PPV and lower values of FNR and FDR. The proposed classification method performs better for acute ischemic, chronic ischemic and sub-acute ischemic stroke than for acute hemorrhage. The TPR for acute hemorrhage stroke is low because a larger number of acute hemorrhage samples are assigned to sub-acute ischemic. However, the PPV of acute hemorrhage is close to 1, since the samples classified as acute hemorrhage are correctly diagnosed.
Figure 3 shows the ROC plot for each type of stroke. In the ROC plot, chronic stroke reaches 100% on the TPR axis, showing that it is correctly classified into its class. The positive rate for acute ischemic, chronic ischemic and acute hemorrhage is 2%, 1% and 15%, respectively. Based on the result, acute ischemic, chronic ischemic and acute hemorrhage show good ROC results since the

CONCLUSION
In this study, the FCM technique has been implemented with active contour to segment stroke lesions in DWI images. Active contour is combined with FCM to separate the CSF area from the hypointense lesion. The segmentation results are compared with the manual reference to verify the accuracy. According to the above results, FCM provides good segmentation of hyperintense lesions, especially sub-acute ischemic lesions, as shown by the high values of the Jaccard index and DC and the low values of FPR and FNR. The overall average values of the Jaccard index, DC, FPR and FNR are 0.58, 0.71, 0.12 and 0.31, respectively. The active contour technique is able to segment the stroke lesion in DWI images and to separate the CSF from the hypointense lesion well. Based on the features of the segmented ROI, each type of stroke is classified according to its class. The bagged tree classifier uses an ensemble method to predict the type of each stroke lesion. All chronic stroke samples are correctly classified within their class. The accuracy of the classification method is 90.8%.