Classification of COVID-19 from CT chest images using convolutional wavelet neural network

Analyzing X-ray and computed tomography (CT) scan images with convolutional neural networks (CNNs) has attracted considerable interest, especially since the coronavirus disease 2019 (COVID-19) pandemic. In this paper, CT scan images of 423 patients from Al-Kadhimiya (Madenat Al Emammain Al Kadhmain) hospital in Baghdad, Iraq, are studied to diagnose whether each patient has COVID-19 using a CNN. The dataset contains 15,000 CT scan images, selected so that only images that support a correct diagnosis are included. The activation function used in this research is a wavelet function, which differs from the activation functions conventionally used in CNNs. The proposed convolutional wavelet neural network (CWNN) model is compared with a regular CNN that uses other activation functions (exponential linear unit (ELU), rectified linear unit (ReLU), Swish, Leaky ReLU, and Sigmoid), and the CWNN gives better results for all performance metrics (accuracy, sensitivity, specificity, precision, and F1-score). The prediction accuracies of the CWNN were 99.97%, 99.9%, 99.97%, and 99.04% when using, as the activation function, the wavelet functions rational function with quadratic poles (RASP1), rational function with quadratic poles (RASP2), polynomials windowed (POLYWOG1), and superposed logistic function (SLOG1), respectively. This algorithm can reduce the time a radiologist needs to determine whether a patient has COVID-19 while maintaining very high accuracy.


INTRODUCTION
Globally, as of April 5, 2022, there had been 489,060,735 confirmed cases of coronavirus disease 2019 (COVID-19), including 6,150,333 deaths, reported to the World Health Organization [1]. COVID-19 represents the greatest global public health emergency since the Spanish influenza pandemic of 1918 [2]. Early COVID-19 diagnosis plays a crucial role not only in patient management but also in preventing further spread of the disease [3], [4]. COVID-19 spreads rapidly and needs early prediction and diagnosis to decrease the deaths it causes [5], [6]. Several recent studies mention that artificial intelligence (AI) and machine learning can be used to deal with COVID-19 [7]-[12], while others suggest using a convolutional neural network (CNN) to detect and classify COVID-19 based on computerized tomography (CT) scan images [13]-[18].
Chest CT imaging may be utilized to classify COVID-19 infected individuals early [19]. Diagnosing COVID-19 using a CT scan has been reported to be more accurate than other methods [20], [21]. CNN is widely used in [...] [24] used a chest X-ray image for the diagnosis of COVID-19 using deep CNN architecture. Szandała [25] states, "The primary neural networks decision-making units are activation functions. Moreover, they evaluate the output of networks neural node; thus, they are essential for the performance of the whole network. Hence, it is critical to choose the most appropriate activation function in neural networks calculation." In [26], the CT scan images were sorted to exclude unnecessary images so that the desired solution could be reached more efficiently; the same method is followed in this paper. In this paper, a simple CNN with a wavelet function as the activation function is proposed to classify COVID-19 cases based on CT scan images. This algorithm can help the radiologist examine and report a large number of daily cases in much less time than manual reading requires (examining and reporting a single case manually typically takes ten minutes).

THE PROPOSED CONVOLUTIONAL WAVELET NEURAL NETWORK
The architecture of the CWNN used in this work is simple and is composed of only 12 layers, as shown in Figure 1. Increasing the complexity and depth of a CNN (beyond 12 layers) makes it time-consuming and may lead to overfitting. However, a simple CNN may not perform as accurately as complex CNNs on complicated problems such as COVID-19 classification.
In addition, the wavelet function has features that make it possible to detect important details at a small scale. In fact, wavelets can assess small and large scales together. So, in this work a convolutional wavelet neural network (CWNN) structure is proposed that combines a simple CNN with a wavelet function as the activation function, to obtain a simple and accurate system that can deal with complicated problems. As shown in Figure 1, the proposed CWNN includes twelve layers. The first layer convolves the input image with 32 filters of size 3×3. The second layer is an activation layer, which uses one of the wavelet functions. The third layer is a max pooling layer with filters of size 2×2. The fourth layer is a convolution with 64 filters of size 3×3. The fifth layer is an activation layer, which uses one of the wavelet functions. The sixth layer is a max pooling layer with filters of size 2×2. The seventh layer is a convolution with 128 filters of size 3×3. The eighth layer is an activation layer, which uses one of the wavelet functions. The ninth layer is a max pooling layer with filters of size 2×2. It is followed by a fully connected layer with 128 units (the tenth layer) and an activation layer that uses one of the wavelet functions (the eleventh layer). The last layer is a SoftMax output layer with 2 possible classes. Several wavelet functions were used as the activation function: rational function with quadratic poles (RASP1 and RASP2), polynomials windowed (POLYWOG1), and superposed logistic function (SLOG1). The model was compiled with the stochastic gradient descent (SGD) optimizer with a learning rate of 0.001 and a momentum of 0.9. Categorical cross entropy was used as the loss function.
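The four wavelet activations named above can be sketched in plain Python. The closed forms below are the definitions commonly given for RASP1, RASP2, POLYWOG1, and SLOG1 in the wavelet-network literature; they are an assumption here, since this text does not reproduce the exact expressions used in the model, and the function names are illustrative only.

```python
import math


def rasp1(x: float) -> float:
    """RASP1 (assumed form): x / (x^2 + 1)^2."""
    return x / (x * x + 1.0) ** 2


def rasp2(x: float) -> float:
    """RASP2 (assumed form): x * cos(x) / (x^2 + 1)."""
    return x * math.cos(x) / (x * x + 1.0)


def polywog1(x: float) -> float:
    """POLYWOG1 (assumed form): sqrt(e) * x * exp(-x^2 / 2)."""
    return math.exp(0.5) * x * math.exp(-0.5 * x * x)


def _logistic(x: float) -> float:
    """Numerically stable logistic function 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def slog1(x: float) -> float:
    """SLOG1 (assumed form): a superposition of four shifted logistics."""
    return (_logistic(x - 1.0) - _logistic(x - 3.0)
            - _logistic(x + 3.0) + _logistic(x + 1.0))


# Evaluate each activation on a few sample inputs.
samples = [-2.0, -0.5, 0.0, 0.5, 2.0]
table = {f.__name__: [f(x) for x in samples]
         for f in (rasp1, rasp2, polywog1, slog1)}
```

Under these assumed definitions, all four functions are odd, pass through zero at the origin, and decay toward zero for large inputs, unlike ReLU-type activations, which grow without bound for positive inputs.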

Testing the proposed system
To test the proposed CWNN algorithm, a dataset is required. The dataset used is composed of CT scan images of COVID and non-COVID cases. These CT scan images were collected from an Iraqi hospital with the help of a specialist. The images were taken from 423 patients, and all private information was excluded.
Each patient's CT scan comprises at least 50 images that, like the frames of a movie, show different positions of the patient's lungs, as shown in Figure 2. For each patient, the images were sorted carefully to exclude those that do not give good insight into the patient's lungs; only images in which the lungs are open and clear were chosen (lung dependent areas). The chosen images are suitable for examination and lead the CWNN to accurate results without processing non-beneficial images. A total of 23,472 images were collected originally, and 15,000 remained after sorting. Details of the dataset are shown in Figures 4 and 5, which illustrate the genders and age groups of the patients, respectively. Figure 4 shows that the data samples include both males and females, which indicates that the algorithm can work on both genders. Figure 5 shows that the data samples cover different age groups of the population, indicating that the algorithm can perform on the whole population and not just on a specific age group such as adults. Table 1 gives a detailed description of the dataset. There are 284 non-COVID patients (138 females and 146 males) with 15,680 CT scan images, of which 10,000 were chosen. There are 139 COVID patients (58 females and 81 males) with 7,792 CT scan images, of which 5,000 were chosen. So, the total number of patients is 423 and the total number of CT scan images for these patients is 23,472, of which 15,000 were chosen.

Preprocessing
In the preprocessing stage, all the chosen images in the dataset are resized to 224×244 pixels and normalized. The normalized data is then divided: 70% is used for training, 10% for validation, and 20% for testing. Because the dataset contains more than one image per patient, all images of a given patient are placed in exactly one of the training, validation, or testing sets, so that no patient appears in more than one set.
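A patient-level split of this kind can be sketched as follows. This is a minimal illustration, not the procedure actually used: the function name `split_by_patient`, the file-naming scheme, and the choice to apply the 70/10/20 proportions to patients (so that image proportions are only approximate) are all assumptions.

```python
import random
from collections import defaultdict


def split_by_patient(image_to_patient, fractions=(0.7, 0.1, 0.2), seed=0):
    """Split images so that each patient falls in exactly one subset.

    image_to_patient maps an image name to a patient identifier.
    fractions are applied to the number of patients, not images.
    """
    # Group image names under their patient.
    by_patient = defaultdict(list)
    for image, patient in image_to_patient.items():
        by_patient[patient].append(image)

    # Shuffle patients reproducibly, then cut the list into three parts.
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)
    n_train = round(fractions[0] * len(patients))
    n_val = round(fractions[1] * len(patients))
    groups = {
        "train": patients[:n_train],
        "val": patients[n_train:n_train + n_val],
        "test": patients[n_train + n_val:],
    }

    # Expand each group of patients back into its images.
    return {name: [img for p in ids for img in by_patient[p]]
            for name, ids in groups.items()}


# Toy data: 10 hypothetical patients with 3 images each.
toy = {f"p{p}_img{i}.png": f"p{p}" for p in range(10) for i in range(3)}
splits = split_by_patient(toy)
```

Splitting on patient identifiers rather than on individual images prevents near-identical slices of the same scan from appearing in both the training and testing sets, which would inflate the measured accuracy.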

Evaluation metrics
Four counts are used to evaluate the performance of the proposed model on the test dataset: i) true positive (Tp), the number of COVID images classified (predicted) correctly as COVID; ii) false negative (Fn), the number of COVID images classified incorrectly as non-COVID; iii) true negative (Tn), the number of non-COVID images classified correctly as non-COVID; iv) false positive (Fp), the number of non-COVID images classified incorrectly as COVID. These counts are described by the confusion matrix [22] shown in Figure 6. Accuracy, precision, sensitivity, specificity, and F1-score [26], [27] are calculated from them using (1)-(5).
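The five scores can be computed from the four counts as in the sketch below. It uses the standard definitions of these metrics, which are assumed to correspond to (1)-(5); the function name and the example counts are hypothetical and are not taken from the experiments.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Compute the five scores from confusion-matrix counts.

    Assumes every denominator is non-zero (each class occurs and at
    least one positive prediction is made).
    """
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)   # also called recall
    specificity = tn / (tn + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts chosen only to illustrate the arithmetic.
example = classification_metrics(tp=90, fn=10, tn=80, fp=20)
```

For a screening task such as this one, sensitivity deserves particular attention because a false negative means a COVID case is reported as non-COVID.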

Experimental setup
Python 3.9.7 and the TensorFlow 2.7.0 framework are used in this work. The experiments were run on an Intel Core i5 Lenovo computer running Windows 10 Pro (64-bit operating system).

RESULTS AND DISCUSSION
The efficiency of the proposed CWNN is tested by calculating the four counts (Tp, Fn, Tn, and Fp). For comparison, the same dataset is used to train and test the same model architecture with other activation functions (exponential linear unit (ELU), rectified linear unit (ReLU), Swish, LeakyReLU, and Sigmoid). The four counts are shown as confusion matrices in Figure 7. Figure 7(a) shows that when RASP1 is used as the activation function, only 1 of the 995 COVID-19 images is predicted as non-COVID. As shown in Figure 7(b), with POLYWOG1 as the activation function, all 995 COVID-19 images are predicted as COVID. Figure 7(c) shows that when RASP2 is used as the activation function, only 1 of the 995 COVID-19 images is predicted as non-COVID. Figure 7(d) shows that when SLOG1 is used as the activation function, only 7 of the 995 COVID-19 images are predicted as non-COVID. On the other hand, when ELU is used as the activation function, 41 of the 995 COVID-19 images are predicted as non-COVID, as illustrated in Figure 7(e). With ReLU as the activation function, 65 of the 995 COVID-19 images are predicted as non-COVID, as shown in Figure 7(f). With Swish as the activation function, 53 of the 995 COVID-19 images are predicted as non-COVID, as illustrated in Figure 7(g). With LeakyReLU as the activation function, 79 of the 995 COVID-19 images are predicted as non-COVID, as illustrated in Figure 7(h). Figure 7(i) shows that when Sigmoid is used as the activation function, 248 of the 995 COVID-19 images are predicted as non-COVID. The efficiency metrics are also calculated for these models to compare them with the proposed CWNN, as shown in Table 2. Overall, the CWNN gave better accuracy, precision, sensitivity, specificity, and F1-score than the others.
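As a worked illustration, the sensitivity implied by each false-negative count reported above can be derived directly, since sensitivity is the fraction of the 995 COVID-19 test images that are predicted as COVID. The sketch uses only the counts stated in the text; attributing the count of 79 to LeakyReLU is an assumption based on the order of the panels in Figure 7, and the values are not a substitute for those in Table 2.

```python
COVID_TEST_IMAGES = 995

# False negatives reported for each activation function (Figure 7).
# The LeakyReLU entry is inferred from the panel order (assumption).
false_negatives = {
    "RASP1": 1,
    "POLYWOG1": 0,
    "RASP2": 1,
    "SLOG1": 7,
    "ELU": 41,
    "ReLU": 65,
    "Swish": 53,
    "LeakyReLU": 79,
    "Sigmoid": 248,
}


def sensitivity(fn: int, positives: int = COVID_TEST_IMAGES) -> float:
    """Fraction of COVID-19 images correctly predicted as COVID."""
    return (positives - fn) / positives


sensitivities = {name: sensitivity(fn) for name, fn in false_negatives.items()}

# Print the activations from highest to lowest sensitivity.
for name, value in sorted(sensitivities.items(), key=lambda kv: -kv[1]):
    print(f"{name:10s} {value:.4f}")
```

With these counts, every wavelet activation yields a higher sensitivity than any of the conventional activations.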
Table 2 shows that the accuracy of the proposed CWNN is considerably higher than that of the other approaches. For example, the accuracy of the proposed CWNN with the RASP1 activation function is 0.9997, whereas the accuracy of the CNN with the Swish activation function is 0.9252. This example illustrates that the CWNN performs better than the other CNNs.
The confusion matrices in Figure 7 and the values in Table 2 show that the CWNN with a wavelet activation function achieves better prediction than the CNN with other activation functions. The findings obtained with the proposed CWNN were as expected: its performance is higher than that of a regular CNN because of the wavelet activation function. These outcomes are attributed to the features of the wavelet function, which give the system a higher overall performance.

CONCLUSION
In this paper, a simple CNN with a wavelet activation function (RASP1, POLYWOG1, RASP2, or SLOG1) is used to predict COVID-19, achieving prediction accuracies of 99.97%, 99.97%, 99.9%, and 99.04%, respectively. The same CNN structure is also used with other commonly used activation functions (ELU, ReLU, Swish, LeakyReLU, and Sigmoid), and the CWNN achieves better prediction and accuracy results. The algorithm detects COVID cases quickly and precisely, and therefore reduces the time the radiologist needs to send a report, especially in hospitals with a large number of patients. In the future, it is planned to use the proposed model in automated prediction systems and to predict the severity of cases and their prognosis (future outcome). It is also planned to test the CWNN on other prediction problems.