Compression of MRI brain images based on automatic extraction of tumor region

In the compression of medical images, region of interest (ROI) based techniques seem to be promising, as they can result in high compression ratios while maintaining the quality of region of diagnostic importance, the ROI, when image is reconstructed. In this article, we propose a set-up for compression of brain magnetic resonance imaging (MRI) images based on automatic extraction of tumor. Our approach is to first separate the tumor, the ROI in our case, from brain image, using support vector machine (SVM) classification and region extraction step. Then, tumor region (ROI) is compressed using Arithmetic coding, a lossless compression technique. The non-tumorous region, non-region of interest (NROI), is compressed using a lossy compression technique formed by a combination of discrete wavelet transform (DWT), set partitioning in hierarchical trees (SPIHT) and arithmetic coding (AC). The classification performance parameters, like, dice coefficient, sensitivity, positive predictive value and accuracy are tabulated. In the case of compression, we report, performance parameters like mean square error and peak signal to noise ratio for a given set of bits per pixel (bpp) values. We found that the compression scheme considered in our setup gives promising results as compared to other schemes.


INTRODUCTION
Hospitals and diagnostic centers produce enormous medical image data every year using various imaging techniques like, magnetic resonance imaging (MRI), computed tomography (CT) scan, and ultrasonography [1]. This has led to use of compression techniques for efficient storage of images in their picture archives for future use. Also, there is a growth in telemedicine field [2], wherein the doctor can access patient's information from a remote place. In the ongoing covid19 pandemic [3] kind of situation, telemedicine is the best mode of contacting doctors and getting suggestions from them. For transmission of patient's data through a network, the data needs to be compressed in order to reduce transmission time. Therefore, medical image compression plays a key role [4] in storage and transmission of patient's data, since most of data comprises radiological scans of a patient. In the case of medical images, hybrid compression techniques based on ROI would be advantageous [5]. The diagnostically important region, region of interest (ROI) in the image, can be compressed using a lossless technique and the remaining part, referred to as non-region of interest (NROI), can be compressed using lossy technique. This leads to high

Preprocessing and ROI extraction
The input to system is a brain MRI image of size 240x240 pixels, which is first fed to a preprocessing block. BraTS 2018 [24]- [26] dataset has MRI data obtained from various hospitals or scanning centers. Pixel intensity values of various images fall in varied ranges. Therefore, intensity normalization is performed to ensure that the input image has pixel intensity values in a common range, 0 to 255. This is done by setting the maximum pixel intensity value in the input image as 255 and all other low intensity pixel values are scaled by same proportion. Also, there is inherent noise in the image when it is acquired from MR scanners. The noise effect is minimized or nullified using anisotropic diffusion (AD) filter [27], [28]. It is a scale-space filter, characterized by diffusion equation depending on conductance function and image gradient. The performance of the filter depends on factors, like, gradient modulus threshold, conductance function and number of iterations, which are carefully selected [29]. This filter aims to blur out the noisy region, while preserving region boundaries in the image. The last step in preprocessing is the feature extraction, which involves computation of feature vector from neighborhood of each pixel of the input image. A feature vector has five elements representing mean, standard deviation, skewness, kurtosis and entropy [30]. These parameters are calculated from the neighborhood of every pixel in the input image. The neighborhood in this case is the area of 5x5 pixels, with the pixel in question at its center. Boundary extension is done for the image by adding two rows at top and bottom, and by adding two columns at left and right, which ensures proper neighborhood for the boundary pixels of the image. This leads to calculation of features corresponding to boundary pixels in a reasonable way. For an image of size 240x240, there are 57600 pixels and hence there are 57600 neighborhoods. For each pixel, from its neighborhood, five features are calculated, resulting in a five-component feature vector. All these feature vectors are arranged to form individual rows of a matrix which has size 57600x5, thus forming the output of preprocessing. The preprocessed data is applied to support vector machine (SVM) classifier and ROI extractor.
The SVM classifier is essentially a learned model which would have undergone training before it is incorporated in the compression system and using the same, tumor region/ROI in the input image is detected. The training is based on labelled dataset that leads SVM to come up with a decision hyperplane by maintaining maximized margin between feature vectors corresponding to different classes [12], [31]. In training phase, the training image is applied to preprocessing unit which results in a feature matrix (57600x5). The corresponding ground truth image is converted to a column vector, 57600x1. Here, every SVM operates with radial basis function (RBF) kernel with a scaling parameter of two. RBF helps in separation of tumorous and non-tumorous pixels with maximized margin in feature space. Once the training gets over based on minimization of squared magnitude of weight vector (leads to maximized margin between data points of two classes), the SVM model is stored and then used for classification of new images.
In the case of classification, the SVM trained model receives the preprocessed data, which are feature vectors corresponding to each pixel of the input image. Each support vector (found during training) is paired with input feature vector and the corresponding radial basis kernel function is evaluated. Radial basis functions evaluated are multiplied to Lagrange multipliers and assigned a sign depending on output corresponding to support vectors (If the support vector corresponds to a tumorous pixel, the sign is positive). Now, the sum of products related to support vectors is taken which results in a value close to +1 or -1. If it results in a value close to +1, the feature vector corresponds to a tumorous pixel in the input image. Thus, all the pixels in the input image are classified as tumorous or non-tumorous, by applying corresponding feature vectors to SVM classifier. A small number of pixels may be misclassified as tumorous because of presence of inhomogeneity during acquisition of image, and they may appear as a separately connected region away from main tumor. These small regions are removed by setting appropriate threshold with respect to area.
Once the tumorous pixels are determined, their positions in the input image can be easily determined by rearranging the output of SVM classifier, which is 57600x1. At those positions, the pixel intensity values of the input image can be forced to zero, which results in NROI image. Subtracting NROI image from the input image results in image consisting of only tumor region, i.e., the ROI image. Thus, the input image is split into ROI and NROI images.

ROI compression by arithmetic coder
In this case, ROI is the tumor region which undergoes lossless compression. The ROI image is converted to a vector and can be viewed as one message, with pixel intensity values representing symbols. Since intensity variation in tumor portion is minimum and this image has a large background with intensity value zero, we can achieve very high compression ratios when we use Arithmetic coding instead of Huffman coding. We could have considered only tumor related pixels for compression, but then, we need to code the coordinate values of tumor pixels in the case of transmission, which becomes an overhead. Moreover, when ROI image is reconstructed, first-hand information about the position of tumor is easily available.
We consider integer arithmetic coding which uses scaling [32] in the code generation. This technique finds an integer, 'num', which represents the message, whose binary form is transmitted. An interval [0, max_val) is created, where max_val is given by 2 n -1. Here, n is the number of bits in the codeword used to represent 'num', and it is chosen based on the total number of symbols in the message and the range required for the interval. When a first symbol is considered, a sub-interval related to this, is identified in the interval, with the lower-limit value and upper-limit value as defined in (1) and (2) respectively. This sub-interval becomes the new interval to be considered next.
Here, LLold refers to lower-limit of the interval before a symbol is considered, LLnew is lower-limit of the new interval after a symbol is considered, ULold refers to upper-limit of the interval before a symbol is considered, ULnew is upper-limit of the new interval after a symbol is considered, CumCount(si) is the cumulative count of symbols up to symbol si, from s1, TotalCount refers to count of all symbols in the message and s0=0.
When the next symbol is considered, a sub-interval is again identified in the present interval with its lower-limit and upper-limit values calculated as in (1) and (2), and this sub-interval becomes new interval. This process is continued till the end of the message and finally, the lower-limit of last interval due to last symbol transmitted, now referred as 'num' is transmitted in the form of binary codeword of n bits. Now, when the message length is too long, which is the case here, the interval range becomes smaller and smaller, which may result in the convergence of both the limits. To avoid this, rescaling of new interval is done in three circumstances; when both upper-limit and lower-limit of the interval lie in upper half of the previous interval, when both the limits lie in lower half of the previous interval and when both the limits lie in the middle half of previous interval. Whenever it is done, it is indicated by a bit-transmission contributing for the code of the message.

NROI compression using DWT-SPIHT-AC
The NROI is a non-tumorous region of the image, which is compressed lossily but maintaining acceptable quality after reconstruction. For this, we consider DWT-SPIHT-AC combination.

Discrete wavelet transform
We apply 2D-DWT to NROI which at first, involves the convolution of each row of the image with hLP(n) (lowpass filter coefficients) and gHP(n) (highpass filter coefficients), then eliminating odd-columns of the output array in each case, further concatenate the outputs to form a transformed row. When all rows are transformed, we consider the convolution of each column with the same filter coefficients, eliminating oddrows of output array in each case, further concatenate the outputs to form a transformed column. When all the columns are transformed, we have a transformed image which contains four subbands referred as LL (lowpass-lowpass), HL (highpass-lowpass), LH (lowpass-highpass) and HH (highpass-highpass) [33]. Figure 2 gives an idea of this transformation.
NROI image has size 240x240 (same as input except the tumor pixels taking zero values). Each of the subbands mentioned above has a size 120x120. Thus, the transformed image has 4 quadrants, with top left occupied by LL subband, which has average coefficients approximating input image itself, as it is the result of lowpass filtering in both horizontal and vertical pass. The top right quadrant represents HL subband, which indicates vertical edges in the image, as it is the result of vertical averaging of horizontal differences. Interpreting similarly, the bottom left LH subband indicates horizontal edges and the bottom right HH subband indicates diagonal edges in the image. This transformation of NROI is referred as single level decomposition. Only LL subband has significant coefficient values. Most of the coefficients of remaining subbands have insignificant values. We go for three level decomposition wherein we again apply 2D-DWT to LL subband only, to see that most of the coefficients in the transformed image are insignificant so that high compression ratio can be achieved. After three level decomposition, LL subband has size 30x30.  Table 1. These are drawn from Cohen-Daubechies-Feauveau (CDF) Biorthogonal wavelet families [33]. They are symmetric and can offer linear phase, which are essential in image applications.

Set partitioning in hierarchical trees (SPIHT) coder
SPIHT is a hierarchical algorithm [34] that is effectively used to encode the coefficients resulting from wavelet decomposition. The coefficients are part of a tree structure, which originates from LL subband at the coarsest level of decomposition. Arrays of 2x2 Coefficients are formed and each array is a set of offsprings of a parent coefficient residing in the corresponding position in the lower resolution band. However, LL subband coefficients at the coarsest level, though grouped as 2x2 do not form offsprings of any coefficient. Out of every 2x2 coefficients in LL band of coarsest level, the top left coefficient does not have offspring. The parent-offspring relationship is shown in Figure 3 ). Initially, LIP is assigned to set R, LIS holds set of coordinates of those elements of R which have descendants and LSP is kept empty. An MSB (Most Significant Bit) value is calculated which is given by: Here, Cmax is maximum among the magnitudes of DWT coefficients. A threshold, 2 b , is set based on MSB.
The encoding is done in passes and in each pass, the first step is sorting, which involves the following:  Comparison of coefficients whose coordinates are in LIP with threshold is done. If a coefficient magnitude is found to be greater than threshold, 2 b , then it is considered as significant and is indicated by bit '1', followed by one more sign-bit. If insignificant, it is indicated by a bit '0'.  The coordinate of significant coefficient is sent as element of LSP. After scanning all elements of LIP, elements of LIS are considered.  If the descendants of coordinates present in LIS are insignificant, then that descendant set is considered to be insignificant which is simply indicated by bit '0'.  If any of the descendants of coordinates present in LIS is significant then that descendant set is considered to be significant and is indicated by bit '1'.  If the significant set is of type D, then offsprings are compared with threshold, coordinates of significant ones are sent to LSP and those of insignificant ones are sent to LIP, after indication by suitable code bits. Remaining elements of the set is of type L and with a label L, they are moved at the end of LIS. If the significant set is of type L, we add each coordinate present in L at the end of LIS and consider them to be the roots of set D. The next step is refinement step, in which, coefficients of coordinates present in LSP but are not part of the current pass are identified, and their b th MSBs are output one by one. Here, the coefficients of coordinates which are added to LSP in the current pass are ignored as they are already indicated. This marks the end of one pass. After this, MSB value is decreased by one and the next pass in encoding is carried out. SPIHT allows us to control the bit rate, by regulating the number of passes in encoding.

Arithmetic coder
The encoded data coming from SPIHT block is a bit stream, which is compressed using Arithmetic coding. This results in increase of compression ratio and also this gives scope for considering a greater number of passes in SPIHT, thus contributing for increased quality of reconstruction.

Data combiner
This block just places ROI and NROI compressed bitstreams one after the other and make them available for storage or transmission through channel.

Reconstruction blocks
The data splitter receives the data from memory or channel, and splits it into ROI and NROI data. The ROI compressed bitstream is directed towards arithmetic decoder and NROI compressed bit stream is directed towards combination of arithmetic decoder, SPIHT decoder and Inverse DWT. The outcomes of ROI and NROI decompressions are superimposed to get reconstructed image. The arithmetic decoder [32] for ROI stream considers the same initial interval [0, max_val) as seen in encoder. The first n bits of bitstream received is assigned to 'numR' in integer form. Now, calculate a number, 'numS', given by: Decode that symbol, si such that CumCount(si)≤numS<CumCount(si+1). CumCount and TotalCount carry same meaning as discussed in encoder. LLint and ULint refer to lower-and upper-limits of the interval. The lower-and upper-limits are updated based on same (1) and (2), which were applicable in encoder. Rescaling of new interval is done in the same three circumstances and in the same way as in encoder. But in decoder, the value of numR is also scaled in the same way. Next, numS is calculated using Equation (4) and the process repeats till all symbols of the message are received. The output of decoder is given to Superimpose block.
The arithmetic decoder for NROI stream is the same kind of decoder as discussed for ROI stream, whose output is given to SPIHT decoder. In the SPIHT decoder, the initial threshold that was present in encoder is made available. Upon receiving coded bits from each pass, the decoder calculates positions of significant coefficients and their signs. The magnitude of a significant coefficient is calculated by multiplying 1.5 to the threshold of that pass and adding it to its previous value calculated in earlier passes. After receiving bits of all passes, the decoder would have found close approximation of DWT coefficients or transformed NROI image. This set of coefficients are applied to Inverse DWT block.
In the Inverse Discrete Wavelet Transform (IDWT) block, the top left portion of the transformed image, an area of 60 x 60 coefficients (wherein each subband having an area of 30 x 30) is the result of third level decomposition, which is first reconstructed by applying 2D-IDWT, to get the result of second level decomposition. Subsequently, the top left portion of transformed image with an area of 120x120 coefficients is reconstructed to get the result of first level decomposition and finally, the close estimate of NROI image is reconstructed using the whole (240x240) transformed image.
At a particular decomposition level, the transformed portion of image is processed column after column. A column is split into lowpass coefficients and high pass coefficients. Upsampling is done for both the sets and then, they are convolved with lowpass filter gLP(n) and highpass filter hHP(n) respectively and then added. After processing all columns, same procedure is repeated for rows to get 2D-IDWT of the transformed portion. Lowpass and highpass filter coefficients used are given in Table 2, drawn from CDF Biorthogonal wavelet families [33]. The output of IDWT is given to Superimpose block. In this block, the ROI and NROI images from outputs of respective decoders are superimposed to form reconstructed image.

PERFORMANCE PARAMETERS OF CLASSIFICATION AND COMPRESSION 3.1. Classification performance parameters
Accuracy: This indicates overall accuracy of the classification. It is defined as: Sensitivity: This indicates the portion of pixels which are identified to be tumorous by classifier, in those identified as tumorous by ground truth. It is defined as: Tumor center (TC): This indicates the location at which the tumor region is centered about. It is defined as: Here, TCH and TCV represent the horizontal and vertical coordinates of tumor center. NR represents the number of pixels in the tumor region, RT. hj and vj represent horizontal and vertical coordinates of j th pixel in RT. Lesser the difference between the tumor center obtained by classifier and that obtained by ground truth, more accurate the classifier is, in identifying the point where the tumor is concentrated.

Compression performance parameters
Compression ratio: This gives the idea of how much an image is compressed. It is defined as: Bits per pixel: This indicates the average number of bits representing a pixel value after compression. It is defined as: Mean square error (MSE): This indicates average of the squared error that exists between uncompressed image and reconstructed image after compression. For a P x Q image, it is defined as: I (x, y) and IR (x, y) are intensity values at location (x, y) corresponding to uncompressed image and reconstructed image respectively. MSE indicates quality of reconstruction.
Peak signal to noise ratio (PSNR): This parameter also, reflects the quality of reconstruction. It is expressed in terms of decibels (dB); hence we may conveniently specify the quality. It is defined as:

RESULTS AND DISCUSSION
The proposed hybrid compression based on automatic extraction of tumor is implemented on a computer system (with Intel i7 processor, 8 GB RAM) using MATLAB 9.2.0 (2017a). The images are axial MRI brain scans related to high grade gliomas and low grade gliomas, and are drawn from BraTS 2018 dataset [24]- [26]. About 120 MRI slices with tumor, concerning various cases were applied to proposed setup. From each MRI slice, the tumor (ROI) is automatically extracted using SVM classifier. The ROI is compressed using arithmetic coding (AC) and the compression ratio obtained is compared with other lossless techniques, Huffman coding and JPEG-lossless. The NROI is compressed using a DWT-SPIHT-AC combination. The hybrid compression technique, AC and DWT-SPIHT-AC in the set-up is compared with other hybrid techniques, AC and JPEG, and, AC and DWT-SPIHT, based on MSE and PSNR. Figure 4 shows the automatic extraction of ROI data from six chosen cases. Images of each case are available in a row. From the left, we observe the brain MRI slice that is input to proposed set-up, the image of tumor (ROI) extracted, followed by remaining part of image (NROI). The boundary of the tumor region is found and is  Values of performance parameters related to classification are shown in Table 3. For the listed cases, accuracy is around 0.99, dice coefficient is more than 0.87, positive predictive value (PPV) is 0.82 for only one case and is more than 0.95 in all other cases, sensitivity is close to 0.8, and tumor center detected is extremely close to that of ground truth. The same variability in values of performance parameters is seen in rest of the tested cases. The accuracy is high because of the detection of non-tumorous region almost correctly and also, non-tumorous portion occupy larger area in the image. Dice coefficient values signify that most of the tumor region detected is in agreement with the ground truth. PPV has excellent values in most of the cases indicating that those detected as tumor, are really the portion of tumor. Sensitivity has relatively more variation but indicative of false negative number being substantially less. Table 4 provides comparison of lossless compression techniques used for ROI (tumor region). We may observe that AC leads to high compression ratios as compared to Huffman coding (HC) and JPEG lossless compression. In NROI compression using a DWT-SPIHT-AC combination, the DWT is applied up to three levels and the resulting image is shown in Figure 5. The DWT results in a large number of insignificant valued coefficients and this is exploited by SPIHT. The SPIHT encoding allows the control of bitrate and hence, the loss in compression. The AC contributes for increase in compression ratios.   Figure 5. NROI portion of Image_1 after undergoing DWT up to three levels Table 5 indicates that for any given bpp, the MSE and PSNR values obtained using proposed set-up are better than other methods indicated, for all images. From the proposed set-up, PSNR varies from 32 to 37 dB at 0.25 bpp, and from 49 to 60 dB at 1.25 bpp. Also, MSE varies from 12.84 to 37.69 at 0.25 bpp, and from 0.06 to 0.79 at 1.25 bpp. The compression ratio is 32 at 0.25 bpp, and is 6.4 at 1.25 bpp. Figure 6(a) shows input image, followed by images obtained after reconstruction using different compression schemes. Figure 6(b) shows reconstructed image from AC (for ROI) and JPEG (for NROI), Figure 6(c) shows reconstructed image from AC (for ROI) and DWT-SPIHT (for NROI), and Figure 6(d) shows reconstructed image from AC (for ROI) and DWT-SPIHT-AC (for NROI) compression schemes. Reconstruction considered here is for the image compressed at 0.5 bpp. We may observe that the reconstructed image using AC and JPEG scheme has distortions at the ROI boundary while the reconstructed image using AC and DWT-SPIHT has reduced distortion. The reconstructed image from the proposed set-up is almost comparable to input image. The reason for better performance is that a greater number of bits can be allowed in SPIHT to reduce loss, but still maintaining high compression ratios because of the presence of AC after SPIHT. Though [ 21] deals with ROI based compression, input images used are quite different, and so, we compare average MSE and average PSNR values obtained using proposed method with those obtained in [21]. Table 6 indicates that PSNR values are significantly high resulting from the proposed method.

CONCLUSION
In this article, we propose a set-up for hybrid compression scheme based on automatic extraction of tumor. The extraction of tumor (ROI) using SVM classifier happens effectively, which is reflected by performance parameters. Arithmetic coding is used for ROI compression, and for NROI compression, DWT-SPIHT-AC combination is used. The images are effectively reconstructed, with a reasonably good MSE and PSNR values, which are evident from comparative analysis. The main contributions of this work are automatic extraction of tumor, compression at relatively low bpp leading to reasonably good PSNR and MSE values, and a set-up that reduces memory requirements and aids telemedicine. When we replace the SVM classifier by deep neural network based classifier, it may result in significant improvement in tumor segmentation and also, various regions in the tumor such as edema, active cells and necrotic core can be segmented. The extraction of ROI can be followed by description of tumor which gives the information about tumor size and location, which can be preserved or passed on as the side information through the channel aiding the expert to draw conclusions swiftly.