Deep learning for COVID-19 diagnosis based on chest X-ray
Journal homepage: http://ijece.iaescore.com

Article info: 2021

Coronavirus disease 2019 (COVID-19) is a recent global pandemic that has affected many countries around the world, causing serious health problems, especially in the lungs. Although temperature testing is suggested as a first-line test for COVID-19, it is not reliable because many diseases share the same symptoms. We therefore propose a deep learning method based on X-ray images that uses a convolutional neural network (CNN) and transfer learning (TL) for COVID-19 diagnosis, together with the gradient-weighted class activation mapping (Grad-CAM) technique to produce visual explanations of the COVID-19 infection area in the lung. The small number of available coronavirus samples was a challenge, which we addressed using data augmentation techniques. The study found that the proposed CNN and the modified pre-trained networks VGG16 and InceptionV3 achieved promising results for COVID-19 diagnosis from chest X-ray images. The proposed CNN was able to differentiate 284 patients as COVID-19 or normal with 98.2 percent training accuracy, 96.66 percent test accuracy, and 100.0 percent sensitivity. The modified VGG16 achieved the best classification result of the three models, with 100.0 percent training accuracy, 98.33 percent test accuracy, and 100.0 percent sensitivity, but the proposed CNN outperformed the others in reducing computational complexity and training time.


INTRODUCTION
The novel coronavirus disease 2019 (COVID-19) is a pandemic that was first reported in Wuhan, China, in December 2019 [1]. The clinical symptoms of COVID-19 include respiratory symptoms, high temperature, cough, dyspnea, and viral pneumonia [2], [3]. Doctors have also diagnosed pneumonia, lung inflammation, abscesses, and enlarged lymph nodes using X-rays and CT scans. The spread of COVID-19 has increased in several countries around the world, so new automated and effective methods of accurate detection are necessary to limit the outbreak of the pandemic.
Detection of COVID-19 using X-ray images is significant for diagnosis, evaluation, and treatment. Because COVID-19 targets the epithelial cells lining the respiratory tract, the health of a patient's lungs can be examined by X-rays [4]. Hospitals are generally equipped with X-ray imaging devices, so X-rays could be used to test for COVID-19. The obstacle is that X-ray analysis requires a specialist in radiology and consumes considerable time, which is a valuable commodity, especially when there are many patients in the hospital. New findings suggest that CT images are more helpful than X-rays for COVID-19 diagnosis, but CT is costly. Moreover, an X-ray image is easy to acquire but difficult to diagnose from, so expertise is needed. Our proposed method can diagnose a new X-ray image with promising results and high performance compared to the traditional diagnosis procedure.
This study first explores the state-of-the-art solutions to fight COVID-19. We then construct a preprocessed, inclusive dataset of multi-source X-ray images and deliver a precise COVID-19 detection technique using deep learning and transfer learning. Transfer learning is widely used in deep learning when only a small dataset is available: a model trained for one task is reused for another task. Transfer learning can be done in two ways: "off-the-shelf" feature extraction and fine-tuning [5]. The proposed CNN and the modified pre-trained VGG16 and InceptionV3 models are applied to the X-ray images. After extensive tests on the dataset, the proposed model shows high accuracy and low training time for COVID-19 diagnosis. Our deep learning architecture can automatically extract and select important features from X-ray images to distinguish patients with coronavirus from those without the disease. Moreover, our proposed model can locate the COVID-19 infection area in the lung.
The paper is organized as follows: Section 2 describes the chest X-ray dataset and the proposed methods for COVID-19 diagnosis. Section 3 presents the performance evaluation metrics and experimental results. Section 4 discusses the obtained results and compares them with those of other researchers. The conclusion and future work are presented in Section 5.

X-ray image dataset
COVID-19 is a newly emerged viral infection, and no suitable dataset with enough data for this study was available. Hence, we had to create a dataset by collecting chest X-ray images from two different image repositories. COVID-19 X-ray images are accessible in the GitHub repository [6]. The repository contains open-source datasets of COVID-19 patients with chest X-ray images and is regularly updated. All of these images are extracted from 43 different publications. A reference for each image is provided in the metadata file in the same repository. Normal images were collected from another chest X-ray images (pneumonia) database [7]. Figure 1 shows X-ray images for COVID-19 and normal cases. For COVID-19 positive cases, we used a metadata Excel file that contains the source of all X-ray images, then filtered the column "finding" to pick out the COVID-19 cases. Furthermore, under the column "view", we selected the posteroanterior view, which is denoted as "PA". The positive COVID-19 cases were diagnosed by experts. We used tools such as Grad-CAM and saliency maps to verify and differentiate between positive and negative cases by answering the following questions: Why is it positive? Where is the infected region located? What is the confidence that it is positive? Researchers have observed that the lungs of patients with COVID-19 symptoms show visual signs, such as ground-glass opacities or darkened spots, that can distinguish COVID-19 infected patients from non-COVID-19 patients [8]-[10]. Subsequently, we created a folder named CovidDataset which contains two sub-folders named Train and Val. Each sub-folder contains two folders: one for COVID-19 images and the other for normal images. In the Train folder, we have 224 images, where half of them are COVID-19 cases and the others are classified as normal. All the images use the portable network graphics (PNG) format and have different resolutions.
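The metadata filtering step described above can be sketched as follows. This is a minimal illustration, not the authors' script: it assumes the metadata has been exported to CSV, and the sample rows and the `filename` column are hypothetical. Only the column names "finding" and "view" and the values "COVID-19" and "PA" come from the text.

```python
import csv
import io

# Hypothetical excerpt of the metadata file. Only the "finding" and "view"
# column names and the "COVID-19" / "PA" values are taken from the text.
SAMPLE_METADATA = """filename,finding,view
img_001.png,COVID-19,PA
img_002.png,COVID-19,AP
img_003.png,SARS,PA
img_004.png,COVID-19,PA
"""


def select_covid_pa(rows):
    """Keep rows whose finding is COVID-19 and whose view is posteroanterior."""
    return [
        row["filename"]
        for row in rows
        if row["finding"].strip() == "COVID-19" and row["view"].strip() == "PA"
    ]


selected = select_covid_pa(csv.DictReader(io.StringIO(SAMPLE_METADATA)))
print(selected)  # ['img_001.png', 'img_004.png']
```

The selected file names would then be copied into the COVID-19 sub-folders of Train and Val.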
The images were resized to 224x224 pixels so that they could be fed directly to the convolutional neural networks (CNNs). To increase the number of images, we used augmentation techniques, which help prevent the CNN from overfitting and can therefore improve the performance and accuracy of our model. The augmentation techniques used here include flipping, rotating, translating, and scaling the image. The enlarged number of images makes the dataset appropriate for a CNN.
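To make the augmentation operations concrete, the sketch below applies a flip, a 90-degree rotation, and a translation to a tiny grayscale image stored as a nested list. It is an illustration only: the helper names are hypothetical, scaling is omitted for brevity, and the authors most likely used a library generator rather than hand-written functions.

```python
def flip_horizontal(img):
    """Mirror each row left to right."""
    return [row[::-1] for row in img]


def rotate_90(img):
    """Rotate clockwise by 90 degrees: reverse the row order, then transpose."""
    return [list(col) for col in zip(*img[::-1])]


def translate_right(img, shift, fill=0):
    """Shift pixels right by `shift` columns, padding the left edge with `fill`."""
    return [
        [fill] * min(shift, len(row)) + row[: max(len(row) - shift, 0)]
        for row in img
    ]


def augment(img):
    """Return the original image plus three augmented variants."""
    return [img, flip_horizontal(img), rotate_90(img), translate_right(img, 1)]


image = [[0, 64],
         [128, 255]]
variants = augment(image)
print(len(variants))  # 4
```

Each variant keeps the label of its source image, so one labeled X-ray yields several training samples.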

Models architecture and development
The main purpose of this study is to build a novel COVID-19 detection model based on X-ray images. The general form of our proposed network architecture is depicted in Figure 2. This section discusses three networks designed for COVID-19 detection. The first is a proposed CNN architecture built from scratch, as illustrated in Figure 3. The second is based on a modified VGG16 architecture and uses transfer learning; in other words, the knowledge (weights) of the pre-trained VGG16 network is transferred to our adapted design by freezing the first layers, which extract general features from the image, and modifying the last layers to extract and select the task-specific features. This procedure achieves a robust result with a small dataset and overcomes the overfitting problem. Besides VGG16, we used InceptionV3 as a pre-trained model so that the performance of the three network architectures could be compared. We used a global dataset as a benchmark that contains 284 X-ray images (142 from positive COVID-19 patients and 142 from normal patients). We then applied data augmentation so that the results generalize better to new populations. We used a traditional convolutional neural network (CNN) with some adapted hyper-parameters and transfer learning for COVID-19 X-ray image classification and distinguished them from negative cases to be compatible with the clinical understanding of

The proposed convolutional neural network (CNN) model architecture and development
The CNN takes input tensors of shape (height, width, channels) for the images. The configuration is designed to process inputs of size (224, 224, 3), which is set through the input shape argument of the input layer. Our CNN architecture has four convolutional layers. We used the same kernel size (3×3) throughout the network and an increasing number of learned filters (32, 64, and 128). Moreover, we used three max-pooling layers to reduce the spatial dimensions of the output volume, which reduces the number of parameters and the training time. Dropout layers were also used to counter overfitting by disabling neurons at random during the learning process. In addition, ReLU was used as the activation function in the hidden layers; ReLU improves the neural network by speeding up the training process.
At the end of the network, we used a flatten layer to reduce the feature maps to a 1D vector and added two dense, fully connected layers. The dense layers process the 1D feature vector to produce the output. The last activation function is a sigmoid, which is suitable for binary classification. In our study, the final decision is positive COVID-19 or normal. Furthermore, we configured the following parameters: the batch size is set to 32, the number of epochs is set to 30, the learning rate is initialized to 3e-4, the pool size in MaxPooling2D is a tuple of two integers, both set to 2, the dropout rate is 0.25 in each layer and 0.5 in the last layer, and the validation step is set to 2.
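The description above leaves some details open, such as which filter count is repeated and how wide the first dense layer is. The sketch below traces output shapes and parameter counts under one assumed configuration: unpadded 3×3 convolutions with 32, 64, 64, and 128 filters, pooling after the second, third, and fourth convolutions, and dense layers of 64 and 1 units. These choices are assumptions, not confirmed by the text, but they reproduce the total of 5,668,097 parameters reported for this model.

```python
def conv2d(shape, filters, kernel=3):
    """Unpadded stride-1 convolution: return (output shape, parameter count)."""
    h, w, c = shape
    params = (kernel * kernel * c + 1) * filters  # weights plus one bias per filter
    return (h - kernel + 1, w - kernel + 1, filters), params


def max_pool(shape, pool=2):
    """Non-overlapping max pooling; it has no trainable parameters."""
    h, w, c = shape
    return (h // pool, w // pool, c)


def dense(n_in, n_out):
    """Fully connected layer: return (output width, parameter count)."""
    return n_out, (n_in + 1) * n_out


# ASSUMED layer order and widths (not stated explicitly in the paper).
LAYERS = [("conv", 32), ("conv", 64), ("pool", 2),
          ("conv", 64), ("pool", 2),
          ("conv", 128), ("pool", 2)]

shape, total = (224, 224, 3), 0
for kind, value in LAYERS:
    if kind == "conv":
        shape, n_params = conv2d(shape, value)
        total += n_params
    else:
        shape = max_pool(shape, value)

flat = shape[0] * shape[1] * shape[2]   # flatten to a 1D vector
width, n_params = dense(flat, 64)       # ASSUMED hidden width of 64
total += n_params
width, n_params = dense(width, 1)       # single sigmoid output unit
total += n_params

print(shape, flat, total)  # (26, 26, 128) 86528 5668097
```

Under these assumptions, almost all parameters (5,537,856) sit in the first dense layer, which is why the pooling layers matter so much for model size. Dropout layers are omitted because they add no parameters.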
In the compilation step, we used binary_crossentropy as the loss function and Adam as the optimizer for this model. The optimizer modifies the weights of the neurons using gradients obtained through backpropagation: the derivative of the loss function with respect to each weight is computed, and the weight is moved a small step in the direction that reduces the loss. That is how a neural network learns. The number of parameters in the model is 5,668,097, and all of them are trainable because we did not use a pre-trained model. The small sample size in the first stage of the coronavirus outbreak was a challenge. We were able to address this issue by using data augmentation techniques such as transformation, batching, flips, or crops. Moreover, early and rapid detection of positive cases of coronavirus can improve the handling of patients and decrease the spread of the virus.
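As a rough illustration of the compilation step, the sketch below trains a single weight with a sigmoid output, the binary cross-entropy loss, and the Adam update rule. The learning rate of 3e-4 comes from the text. The decay rates (0.9, 0.999), the epsilon value, and the one-sample toy problem are assumptions chosen for illustration, not the authors' training code.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def binary_crossentropy(y, p, eps=1e-12):
    """Loss for one sample with label y in {0, 1} and predicted probability p."""
    p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def adam_step(w, grad, m, v, t, lr=3e-4, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update for a scalar weight; t is the 1-based step index."""
    m = b1 * m + (1 - b1) * grad          # running mean of gradients
    v = b2 * v + (1 - b2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1 - b1 ** t)             # bias correction
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (math.sqrt(v_hat) + eps), m, v


# Toy problem: one feature x with positive label y, weight initialised to 0.
x, y = 1.0, 1
w, m, v = 0.0, 0.0, 0.0
loss_before = binary_crossentropy(y, sigmoid(w * x))
for t in range(1, 101):
    p = sigmoid(w * x)
    grad = (p - y) * x  # derivative of the loss with respect to w
    w, m, v = adam_step(w, grad, m, v, t)
loss_after = binary_crossentropy(y, sigmoid(w * x))
print(loss_before > loss_after)  # True
```

Unlike plain gradient descent, Adam rescales each step by a running estimate of the gradient magnitude, so early steps have a size close to the learning rate regardless of the raw gradient scale.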

The modified VGG16 and InceptionV3 networks
VGG16 is a deep convolutional neural network proposed in 2014 by Simonyan and Zisserman [11]. The network has 16 weight layers: 13 convolutional layers with a small filter size (3x3) and 3 fully connected layers, along with 5 max-pooling layers (2x2 size), a softmax activation function in the final layer, and about 138 million parameters. The model was pre-trained on the ImageNet dataset, and its weights were then transferred to the new network, whose fully connected layers were retrained. In our model, we decreased the number of convolutions and changed the structure of the fully connected part by using flatten, dropout, and two dense layers, with a sigmoid activation function for binary classification to distinguish between COVID-19 and normal cases. The number of parameters was decreased to 16,320,449, and by using transfer learning, the number of trainable parameters declined dramatically to 1,605,761 by freezing the weights in the first part of the network and transferring the knowledge. Hence, the challenge of the small dataset was overcome and the overfitting problem was mitigated. At the same time, the training time declined sharply. Finally, we used binary_crossentropy as the loss function and Adam as the optimizer.
The small sample size, caused by the expensive process of acquiring data, especially in the medical field, encouraged us to use data augmentation in our study. Available augmentation techniques include shifting, zooming, flipping, rotating, sharpening (lightness value), Gaussian blur (sigma value), edge detection (alpha value), embossing (strength value), skewing (tilt), and shearing (axis and value), all of which increase the data volume to improve performance [12]. In our study, we used some of them. Deep learning requires a large volume of data, and increasing the amount of data helps to overcome overfitting and enhance model performance. Furthermore, transfer learning was used in the second model by fine-tuning VGG16. The early layers in a CNN extract generic features, whereas the last layers extract specific features [13]. Therefore, some early layers were fixed in our models and excluded from training. In both pre-trained architectures (VGG16 and InceptionV3), the early layers responsible for feature extraction and selection are frozen, and only the final block, which consists of flatten, dropout, and two dense layers, is trained. Hence, the number of trainable parameters decreased, and the computational complexity and training time also declined.
The third architecture used in this study is InceptionV3, which was developed from the GoogLeNet model by Google researchers in 2015 [14]. The goal of this network is to act as a multi-level feature extractor by applying filters of several sizes in parallel within each block. Furthermore, the InceptionV3 weights are smaller in size than those of VGG16.

EXPERIMENTAL RESULTS
In our experiments, the neural networks were trained using Python (Keras with TensorFlow as a backend); training was done using the GPU available in Google Colab. All the experiments were performed on an Intel i7 2.7 GHz CPU with 16 GB of RAM. The performance measures used in this study are accuracy, sensitivity, specificity, precision, and F1-score.
Accuracy is the ability of the classifier to distinguish between the classes correctly, while sensitivity denotes its capability to correctly identify the actual positives. Specificity measures the proportion of actual negatives that are correctly identified by the classifier. Here, true positive (TP) is the number of correctly classified COVID-19 images, false positive (FP) is the number of normal X-ray images wrongly classified as COVID-19, false negative (FN) is the number of COVID-19 X-ray images wrongly labeled as non-COVID-19, and true negative (TN) is the number of correctly identified non-COVID-19 (normal) cases.
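The five measures can be computed from these four counts using their standard definitions, as sketched below. The example counts are illustrative assumptions, not values read from the paper's confusion matrices: they assume a validation set of 60 images (30 per class) and are chosen to be consistent with the 98.33% accuracy and 100% sensitivity reported for VGG16.

```python
def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def classification_metrics(tp, fp, fn, tn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = safe_div(tp, tp + fn)   # recall on the COVID-19 class
    specificity = safe_div(tn, tn + fp)   # recall on the normal class
    precision = safe_div(tp, tp + fp)
    return {
        "accuracy": safe_div(tp + tn, tp + fp + fn + tn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1_score": safe_div(2 * precision * sensitivity, precision + sensitivity),
    }


# ASSUMED counts for illustration: 30 COVID-19 and 30 normal validation images,
# with one normal image misclassified as COVID-19.
metrics = classification_metrics(tp=30, fp=1, fn=0, tn=29)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With zero false negatives the sensitivity is exactly 1.0, which is the property the study prioritizes: no COVID-19 case in the validation set is missed.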
To evaluate the efficiency of the proposed method, we performed both qualitative and quantitative analyses to better understand its detection performance and decision-making behavior. First, all images were resized to 224x224 to suit VGG16, InceptionV3, and the proposed CNN. Moreover, all pixel values were normalized to the range [0,1]. The same settings were used for each network: 30 epochs, a batch size of 32, a learning rate of 0.0003, and binary cross-entropy as the loss function. We also tried different optimizers, namely Adam and RMSprop, and chose between them according to accuracy and model performance. We used the Adam optimizer in all the models because it gave better results than RMSprop on this dataset. A GPU was used to reduce the training time. Table 1 shows the training time for each model. The proposed CNN clearly has the shortest training time and VGG16 the longest. The pre-trained models InceptionV3 and VGG16 have a very large number of parameters, but with transfer learning the number of parameters to train declined sharply because some layers at the beginning of the network were frozen. Consequently, the number of parameters that had to be trained was reduced from 25,079,713 to 3,276,929 for InceptionV3 and from 16,320,449 to 1,605,761 for VGG16. Table 3 shows the validation results of the three neural networks in terms of accuracy, specificity, sensitivity, precision, and F1-score. VGG16 had the best result of the three models when computational complexity is not considered. Although our proposed CNN architecture ranked second in performance, it has a significantly lower training time.
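The reported trainable-parameter counts can be checked with a short calculation. The sketch below assumes that the trainable head is a flatten layer followed by dense layers of 64 and 1 units, and that a 224x224x3 input yields a final feature map of 7x7x512 for VGG16 and 5x5x2048 for InceptionV3. These shapes and widths are assumptions, not stated in the text, but they reproduce both reported figures exactly.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return (n_in + 1) * n_out


def head_params(feature_shape, hidden_units=64):
    """Trainable parameters of an assumed flatten -> dense -> dense(1) head."""
    h, w, c = feature_shape
    return dense_params(h * w * c, hidden_units) + dense_params(hidden_units, 1)


# (total, trainable) parameter counts as reported in the text.
REPORTED = {
    "VGG16": (16_320_449, 1_605_761),
    "InceptionV3": (25_079_713, 3_276_929),
}
# ASSUMED final feature-map shapes of the frozen base for 224x224x3 inputs.
FEATURES = {
    "VGG16": (7, 7, 512),
    "InceptionV3": (5, 5, 2048),
}

for name, (total, trainable) in REPORTED.items():
    computed = head_params(FEATURES[name])
    frozen = total - trainable
    share = 100.0 * trainable / total
    print(f"{name}: head={computed}, frozen={frozen}, trainable share={share:.1f}%")
```

Under these assumptions, only about 10% of the VGG16 weights and 13% of the InceptionV3 weights are updated during training. The frozen remainder for VGG16, 14,714,688, equals the size commonly reported for the standard VGG16 convolutional base.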
The confusion matrices for the three neural networks (modified InceptionV3, modified VGG16, and the proposed CNN) are presented in Figure 4. VGG16 and the proposed CNN clearly detected 100% of the COVID-19 cases: both models classified all positive cases in the validation dataset correctly and show promising results in classifying the negative cases. Furthermore, Figure 5 shows the training and validation accuracy and loss on the COVID-19 X-ray image dataset for the pre-trained models InceptionV3 and VGG16 and for the proposed CNN. The curves of the first model fluctuated despite the low learning rate, which indicates that it could not reach the global optimum and had difficulty escaping local optima. In contrast, the other two models converged smoothly.
At the beginning of this section, we mentioned a qualitative analysis to evaluate the efficiency of the model and to better understand how it classifies positive COVID-19 cases. For this purpose, we used a technique called Grad-CAM (gradient-weighted class activation mapping). To measure how important each image region is for a particular class, we take the gradient of the class score with respect to the final convolutional layer and use it to weight the output of that layer. Figure 6 shows that Grad-CAM can accurately locate the infected region inside the chest.
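The weighting step described above can be sketched in a few lines. The example assumes that the activations of the final convolutional layer and the gradients of the class score with respect to them have already been obtained from the trained network. The tiny 2x2 maps are made-up values for illustration, and the function name is hypothetical.

```python
def grad_cam(feature_maps, gradients):
    """Combine K feature maps (each H x W) into one normalised heat map.

    Each channel is weighted by the spatial average of its gradient, the
    weighted maps are summed, and negative values are clipped (ReLU).
    """
    k = len(feature_maps)
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    weights = [sum(sum(row) for row in g) / (h * w) for g in gradients]
    cam = [
        [max(0.0, sum(weights[c] * feature_maps[c][i][j] for c in range(k)))
         for j in range(w)]
        for i in range(h)
    ]
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[value / peak for value in row] for row in cam]  # scale to [0, 1]
    return cam


# Made-up activations: channel 0 fires top-left, channel 1 fires bottom-right.
activations = [[[1.0, 0.0], [0.0, 0.0]],
               [[0.0, 0.0], [0.0, 2.0]]]
# Made-up gradients: channel 0 supports the class, channel 1 opposes it.
grads = [[[1.0, 1.0], [1.0, 1.0]],
         [[-1.0, -1.0], [-1.0, -1.0]]]

heatmap = grad_cam(activations, grads)
print(heatmap)  # [[1.0, 0.0], [0.0, 0.0]]
```

In practice the heat map is upsampled to the input resolution (224x224 here) and overlaid on the X-ray, so that regions with high values mark the areas that most influenced the COVID-19 prediction.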

DISCUSSION
In this study, we proposed a new convolutional neural network architecture built from scratch and two deep models based on the VGG16 and InceptionV3 architectures to detect COVID-19 from chest X-ray images. Our models were tested on the validation set and achieved accuracies of 81.67%, 98.33%, and 96.66% for InceptionV3, VGG16, and the proposed CNN, respectively. Furthermore, both VGG16 and the proposed CNN achieved 100% sensitivity, which meets our goal of minimizing false negative COVID-19 predictions and maximizing true positive COVID-19 predictions.
Table 4 compares our results with other studies in the literature. Our study used X-ray images, whereas some studies used chest CT images; the table includes both. Some of the studies that achieved high accuracy reported results on the training set. In contrast, our study reports results on a validation set for greater reliability. Gozes et al. [15] proposed an automated diagnosis and patient monitoring system using deep learning and CT scan images. The authors used the ResNet-50 architecture to automatically classify cases as COVID-19 positive or negative. The proposed method outperformed professionals in this field, with an accuracy of 99.6% and a sensitivity of 98.2%. Chen et al. [16] built a deep learning model for COVID-19 pneumonia detection based on high-resolution CT images. The model was designed and validated on 46,096 anonymized images of 106 patients admitted to the clinic, including 51 laboratory-confirmed COVID-19 patients and 55 patients with other diseases. They achieved 95.24% accuracy.
Narin et al. [17] experimented with three separate CNN models (InceptionV3, ResNet50, and InceptionResNetV2). All of them were pre-trained on the ImageNet dataset, and for 2-class classification ResNet50 achieved the best accuracy of 98%. The experiment did not include pneumonia cases. Hence, the model may not distinguish between COVID-19 and other pneumonia diseases. L. Wang et al. [18] proposed a new neural network architecture named COVID-Net for COVID-19 detection. The model achieved 92.4% accuracy.
Apostolopoulos et al. [5] demonstrated the superiority of MobileNet over a CNN built from scratch in decreasing the false negatives for COVID-19 detection. The comparison between the two models shows that building the model from scratch takes a lot of time and requires complex computations. On the other hand, the authors used repeated 10-fold cross-validation, which is time consuming but more accurate. Moreover, there are many types of pulmonary disease; the authors included six of them in the dataset and added COVID-19 as the seventh. They achieved 99.18% accuracy, 99.42% specificity, and 97.36% sensitivity in the detection of COVID-19, but these results are for the training procedure, and results on a validation set would be more reliable. Apostolopoulos and Mpesiana [19] published another paper on automatic detection from X-ray images using transfer learning. They used VGG19 and MobileNetV2 and achieved 98.75% and 97.4% accuracy, respectively, for the 2-class problem. VGG19 and MobileNetV2 are very deep and have many parameters, which require a long time for training. Sethy et al. [20] used the ResNet50 architecture with an SVM for COVID-19 detection from chest X-ray images: the ResNet50 model works as a feature extractor, and the SVM acts as the classifier in place of the fully connected layer. The model was applied to a 2-class dataset and achieved 95.38% accuracy. Hemdan [21] used different pre-trained models to detect COVID-19 cases from chest X-ray images and introduced the COVIDX-Net framework involving seven CNN models. Ozturk et al. [22] proposed a new CNN architecture named DarkNet for COVID-19 detection from X-ray images. Their model achieved 98.08% accuracy for a 2-class problem. CheXNet [10] is a convolutional neural network that was proposed for pneumonia detection from chest X-rays. CheXNet has obtained outstanding results that surpassed the overall performance of radiologists. ChestNet is another similar solution.
ChestNet is an advanced type of CNN intended to detect chest diseases on chest radiography images [27]. Ghoshal and Tucker [28] proposed uncertainty estimation using Bayesian convolutional neural networks (BCNN), with ResNet50V2 as the pre-trained model for transferring knowledge, to enhance diagnostic performance on COVID-19 chest X-ray images. The authors used the predictive uncertainty (PH) concept to avoid false predictions and augmented datasets from different sources to increase the amount of data.
CoroNet is a CNN architecture proposed by Khan et al. [23]. The authors compared CoroNet with COVID-Net and achieved better results. Furthermore, they combined two public datasets from different sources, giving 1,300 images, which they enlarged using augmentation techniques. The achieved accuracy was 89.5% for the 4-class case, 95% for the 3-class case, and 99% for the binary case. The performance can be further enhanced once additional training data are available. CoroNet also needs to undergo clinical trials. Our proposed method is trained to classify coronavirus versus non-coronavirus cases. Given the robustness of convolutional neural networks in image classification, our proposed method has the potential to outperform expert diagnosis, especially when a large amount of data is available. CNNs commonly achieve more than 90% accuracy, sensitivity, and specificity in image classification. Our proposed method can be used in hospitals as a first-line tool for coronavirus diagnosis. The main contribution of our research is to enhance COVID-19 diagnostics by reducing the computational complexity, which decreases training time, and by improving sensitivity, that is, increasing the number of true positives and reducing the number of false negatives. In addition, the learning process was improved by using different augmentation techniques.

CONCLUSION
In this study, we proposed a new CNN architecture and compared its performance with pre-trained neural networks (VGG16 and InceptionV3) for COVID-19 detection based on X-ray images. COVID-19 is still a new pandemic, and the lack of data, or small sample size, is a challenge, especially for deep learning, which needs large amounts of data to perform well. Hence, data augmentation with different techniques was used. Moreover, transfer learning with fine-tuning was used to overcome the small sample size. The results on the validation set showed that VGG16 and our proposed CNN outperform InceptionV3. Although VGG16 is slightly better in both training and testing accuracy, our proposed model outperformed the other networks in computational complexity by reducing the number of parameters, which reduces training time significantly. For future work, a technique is needed that can analyze the image to measure the percentage of lung volume that is infected by the disease. In addition, the performance of the proposed model was not compared with that of radiologists, so future studies could make this comparison.