Enhancing COVID-19 forecasting through deep learning techniques and fine-tuning

ABSTRACT


INTRODUCTION
Time series forecasting (TSF) [1] is the process of analysing sequences of regularly collected observations to uncover hidden relationships and predict future data points. However, time series data are often non-stationary, making it challenging to identify patterns due to temporal dependence between observations. To address this issue, mathematical transformations can be applied to make the data stationary. There is a wide range of tools and techniques available for TSF, making it difficult to select the most suitable model. Linear regression [2]-[4] is a widely used method for modelling the relationship between a response variable and one or more explanatory variables. Simple linear regression is employed when there is only one explanatory variable, while multiple linear regression is used for more complex scenarios.
Several studies have specifically focused on forecasting coronavirus disease of 2019 (COVID-19) [5]-[7] cases and evaluating the risk of death in highly affected countries. The autoregressive integrated moving average and wavelet-based forecasting (ARIMA-WBF) [8] approach combines ARIMA with wavelet forecasting to generate accurate short-term real-time forecasts, outperforming traditional ARIMA models. To identify critical variables that significantly impact COVID-19 death rates, such as total cases, demographics, and healthcare resources, regression tree algorithms [9] have been employed. In healthcare applications, dealing with incomplete datasets is a common challenge. Many studies opt to discard incomplete samples or missing features, which reduces the size of the training data. However, alternative deep-learning training approaches have been proposed to effectively handle imperfect datasets. Artificial neural networks (ANN) [10], which fall under the category of deep neural networks, have gained attention in recent years. These ANN models, such as the multi-layer perceptron (MLP), are capable of learning relationships between linear and non-linear data. The MLP consists of input, hidden, and output layers and is trained using the backpropagation algorithm to minimise the difference between predicted and desired output vectors. However, it should be noted that the MLP has certain limitations, including a fixed number of inputs and outputs. Convolutional neural networks (CNNs) [11]-[13] are another type of ANN commonly used for image classification and object detection. They have demonstrated exceptional performance in computer vision applications and can greatly enhance hybrid models for solving various problems. In the context of COVID-19, time series forecasting techniques have been applied to predict the progression of the disease, estimate mortality rates, and improve overall patient care. Various models have been developed, including recurrent neural networks (RNN), stacking ensemble techniques, and graph neural networks, which incorporate mobility data [14]. These models have shown comparable or even better performance when compared to more complex approaches. Additionally, machine learning techniques have been extensively employed to project COVID-19 infections and deaths, aiding healthcare providers in planning and preparation. Models such as RNN have been utilised, and evaluation metrics including root mean square error (RMSE) and Bayesian information criterion (BIC) have been used to assess their performance.
The objective of this study is to analyse different deep-learning approaches for forecasting COVID-19 data and select the most appropriate one for further fine-tuning. The aim is to develop an effective model for forecasting COVID-19 data that can also be applied to other emerging diseases posing potential international public health threats and risking another global crisis. The main scientific-technical objectives of this work are as follows: i) to investigate the feasibility of applying traditional linear regression methods to TSF, ii) to analyse different deep learning approaches for forecasting COVID-19 data, and iii) to select the most suitable deep learning approach to be trained and applied to the COVID-19 domain.
This paper is divided into several sections, each serving a distinct purpose. Section 1, the introduction, provides an overview of the topic and outlines the motivation for the study. Section 2 describes the methodology used in conducting the study. The next section presents the results and discussion, accompanied by a comprehensive analysis and interpretation of the results; to support the discussion, it incorporates tables, graphs, and figures that facilitate comparisons with previous studies and theoretical frameworks. Lastly, the conclusion in section 4 summarises the key results and discusses their implications.

METHOD

Dataset and data exploration
COVID-19 data was collected from an open online repository containing official records since the start of the pandemic. Specifically, the data was obtained in JavaScript object notation (JSON) format through hypertext transfer protocol (HTTP) requests using the "narrative COVID-19 tracking" project application programming interface (API). These data served as input for three distinct neural network models to predict various outcomes, including death or recovery rates, and to support clinicians in addressing both the current pandemic and potential future pandemics or diseases. The collected data were processed and transformed into a dataset using Python.
To compare data from Spain with other major European countries such as Germany, France, and Italy, the analysis incorporated global data from each of the countries. Instead of focusing solely on provincial data within Spain, the dataset encompassed COVID-19 data at a broader country level. In other words, each row of the dataset represented global COVID-19 data for a particular day, including metrics such as active cases, deaths, and recoveries for all countries considered.
The dataset used for analysis consisted of 21,959 records of COVID-19 health data collected between May 1, 2020, and October 31, 2021. Initially, the dataset contained 18,544 rows, but additional records were added to address overfitting issues during the model tuning phase. The dataset included various attributes for each record, such as the specific date (date), a standardised country or area code (M49 code), the total number of people infected with the COVID-19 virus (today new confirmed), the number of newly reported COVID-19 cases (today new confirmed), the total number of deaths attributed to COVID-19 (

Time series methods
In the field of TSF, the autoregressive integrated moving average (ARIMA) model [15], [16] was commonly used. This model combined autoregressive (AR), differencing (I), and moving average (MA) components. The AR component modelled the relationship between lagged observations (p), the differencing component removed trends (d), and the MA component modelled short-term fluctuations (q). The notation for ARIMA was ARIMA(p, d, q), where p, d, and q represented the order parameters. ARIMA assumed linearity, stationarity, and the absence of seasonality in the data. Meanwhile, although linear regression models were commonly employed to remove trends and stabilise time series by considering consecutive observations, they had limitations in handling uncertain data, analysing complex patterns, and supporting multiple input variables or multi-step forecasts [14], [17]. Recently, machine learning techniques, particularly those based on ANN, have gained popularity in TSF. ANN simulated the neural networks of the brain, and the selection of activation functions played a crucial role in their operation. Common activation functions included the rectified linear unit (ReLU), the sigmoid function, and the hyperbolic tangent (Tanh). These functions transformed inputs into outputs and contributed to the decision-making process within neural networks [18].
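The AR and differencing components described above can be illustrated concretely. The following sketch (an illustration assuming NumPy, not the implementation used in this study) differences a series to remove trends and fits an AR(p) model by ordinary least squares:

```python
import numpy as np

def difference(series, d=1):
    """Apply d-th order differencing to remove trends (the 'I' in ARIMA)."""
    for _ in range(d):
        series = np.diff(series)
    return series

def fit_ar(series, p):
    """Fit an AR(p) model by least squares: y_t = c + sum_i phi_i * y_{t-i}."""
    # Lagged design matrix: column i holds y_{t-i-1} for each target y_t.
    X = np.column_stack([series[p - i - 1 : len(series) - i - 1] for i in range(p)])
    X = np.column_stack([np.ones(len(X)), X])  # intercept column
    y = series[p:]
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs  # [c, phi_1, ..., phi_p]

def predict_next(series, coeffs):
    """One-step-ahead forecast from the last p observations."""
    p = len(coeffs) - 1
    lags = series[-1 : -p - 1 : -1]  # most recent observation first
    return coeffs[0] + np.dot(coeffs[1:], lags)
```

The MA component, omitted here for brevity, would additionally regress on past forecast errors; full ARIMA fitting is normally delegated to a statistics library.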
Among the models used in time series forecasting, the multi-layer perceptron (MLP) [19]-[21] stood out as an exceptional choice. It was a fully connected feed-forward network with the capability to learn relationships between linear and non-linear data. However, MLPs had limitations concerning fixed input and output quantities [22]. CNNs [23], which drew inspiration from biological neural networks, demonstrated remarkable performance in image classification and could also be employed in time series forecasting. CNNs consisted of convolutional, pooling, ReLU activation, and fully connected layers, enabling them to effectively capture features in time series data [24]. RNNs [25] were another type of ANN that incorporated information from previous time steps, allowing them to learn time dependencies. RNNs consisted of an input layer, hidden layer, and output layer, sharing parameters throughout the network. They were particularly well-suited for processing sequential data. Long short-term memory networks (LSTMs) [26] were a specialised form of RNNs that excelled in capturing long-term dependencies. LSTMs could retain important information over extended periods, making them invaluable for tasks involving sequence prediction.
Each of these ANN models offered distinct advantages and capabilities for TSF. The selection of the appropriate model depended on the specific characteristics and requirements of the problem at hand. The MLP was widely used and had the ability to learn relationships between linear and non-linear data. CNNs were particularly effective at capturing features in time series data, while RNNs were well-suited for processing sequential data and learning time dependencies. LSTMs were specialised RNNs that excelled at capturing long-term dependencies. The choice of model should be based on the unique characteristics and needs of the forecasting problem being addressed.

Data preparation
The objective of this step was to enhance the quality of the input data. Firstly, the rows containing only zeros were removed to prevent underfitting; then the date input was transformed into a time signal to represent its periodic nature. As dates in their string form were not useful, they were converted into weeks. Subsequently, a 70%, 20%, and 10% split was used to divide the data into training, validation, and test sets. It is worth noting that the dataset instances were not randomly shuffled before the split; this approach facilitated the creation of time windows comprising consecutive instances. After splitting the data, a z-score standardisation was applied, which involved subtracting the mean and dividing by the standard deviation of each feature. Since the neural network models made predictions based on a window of consecutive samples from the data, without prior shuffling, it was crucial to define the features of the input window, which included: i) the width (number of time steps) of the input and label windows, ii) the time offset between the input and label windows, and iii) the features used as inputs, labels, or both.
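The preparation steps above (chronological split without shuffling, z-score standardisation, periodic date encoding, and consecutive-instance windows) can be sketched as follows. This is an illustration assuming NumPy; it standardises with training-set statistics, a common convention, since the text does not state which statistics were used:

```python
import numpy as np

def encode_day_of_week(day_index):
    """Represent the day of week as a periodic (sin, cos) time signal."""
    angle = 2 * np.pi * (day_index % 7) / 7
    return np.sin(angle), np.cos(angle)

def split_and_standardise(data, train_frac=0.7, val_frac=0.2):
    """Chronological 70/20/10 split (no shuffling), then z-score each set."""
    n = len(data)
    n_train, n_val = int(n * train_frac), int(n * val_frac)
    parts = (data[:n_train], data[n_train:n_train + n_val], data[n_train + n_val:])
    mean, std = parts[0].mean(axis=0), parts[0].std(axis=0)
    return [(p - mean) / std for p in parts]

def make_windows(series, input_width, label_width, offset):
    """Build (input, label) windows from consecutive, unshuffled samples."""
    total = input_width + offset
    inputs, labels = [], []
    for start in range(len(series) - total + 1):
        inputs.append(series[start:start + input_width])
        labels.append(series[start + total - label_width:start + total])
    return np.array(inputs), np.array(labels)
```

Because the split is chronological, each window contains genuinely consecutive days, which is what the forecasting models require.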

Persistence model
With the persistence baseline, a mean absolute error of 1.4120 and an error loss of 7.7954 were obtained, indicating significant variance in the data. However, it should be noted that these predictions do not accurately reflect the performance of the final model, because the neural network had not yet been built. The subsequent step in this project involved exploring various neural network models to evaluate their behaviour. Once the best model was identified, fine-tuning was performed to achieve optimal results.
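For context, a persistence model simply repeats the last observed value as its forecast. A minimal illustrative sketch (assuming NumPy) of such a baseline and its mean absolute error:

```python
import numpy as np

def persistence_forecast(series):
    """Naive baseline: predict each value as the previous observation."""
    return series[:-1]  # forecasts for series[1:]

def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences between labels and predictions."""
    return float(np.mean(np.abs(y_true - y_pred)))

series = np.array([2.0, 3.0, 5.0, 4.0, 6.0])
preds = persistence_forecast(series)
mae = mean_absolute_error(series[1:], preds)  # 1.5 for this toy series
```

Any trained model must beat this baseline's error to justify its added complexity.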

Simple linear regression model
By employing a simple linear regression model, a mean absolute error of 0.9260 and a loss of 4.0728 were attained. This error value was considerably higher than the typical mean absolute error in machine learning applications, which is around 0.05. The high mean absolute error suggested the potential presence of overfitting, where the model was excessively tailored to the training data and failed to generalise properly to unseen data.
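A simple linear regression forecaster of this kind can be sketched as fitting a straight-line trend over the time index (an illustration assuming NumPy; the actual feature set used in the study may differ):

```python
import numpy as np

def fit_linear_trend(series):
    """Least-squares fit of y = slope * t + intercept over the time index t."""
    t = np.arange(len(series), dtype=float)
    slope, intercept = np.polyfit(t, series, 1)
    return slope, intercept

def forecast_trend(series, steps, slope, intercept):
    """Extrapolate the fitted trend for the next `steps` time indices."""
    t_future = np.arange(len(series), len(series) + steps)
    return slope * t_future + intercept
```

Such a model captures only a global linear trend, which is one reason it struggles with the non-linear dynamics of epidemic data.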

Single-step neural network model
In the single-step neural network model with two layers of (64, 64) neurons, a mean absolute error of 1.0605 and an error loss of 4.8579 were obtained. These values indicated an improvement compared to the previous model, highlighting the superior performance of the current model with its more intricate architecture. When a neural network with two layers of (60, 30) neurons using the sigmoid activation function was considered, the mean absolute error decreased to 0.9572, accompanied by an error loss of 3.9366. Considering the uncleaned nature of the dataset and the absence of optimal model selection, these results could be considered satisfactory. Finally, the implementation of an 8×8 neural network using the softmax activation function led to further error reduction, yielding a mean absolute error of 0.9124 and an error loss of 4.0662.

Multi-step neural network model
To capture the temporal dynamics and account for the evolving nature of the data, a multi-step dense model was developed. This model could process multiple inputs across different time steps, enabling predictions that incorporated the temporal evolution of the data. Upon evaluation, the multi-step dense model produced a mean absolute error of 1.0655 and an error loss of 3.8845. These results suggested that employing a densely connected network may have been the optimal approach for analysing the COVID-19 dataset, as indicated by the observed error values. However, a comparison between the validation and training errors revealed that the model continued to exhibit signs of overfitting. The error values closely resembled those obtained from the previous model, indicating that the modifications made were not successful in effectively addressing the overfitting issue.

CNN model
A CNN, typically designed for two-dimensional image data, could also be adapted to forecast univariate time series problems, making it a suitable choice for the current situation. This network type employed convolution as its initial layer, generating the kernel that controlled the network operations. In the implemented model, a convolution layer with 32 neurons and a kernel width of 3 was incorporated to accommodate three inputs. This removed the need to reshape the output, because the layer inherently preserved the time axis. As the analysis concerned a univariate series, the number of outputs remained one. After the evaluation, the model exhibited a mean absolute error of 0.9414 and an error loss of 3.1165. Although the predictions closely aligned with the corresponding labels, the high error value showed that the network still suffered a degree of overfitting, based on the significant differences in the results.
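The way a width-3 kernel consumes three consecutive time steps while preserving the time axis can be sketched with a single-filter one-dimensional convolution (an illustration in plain NumPy; deep-learning frameworks implement this operation as cross-correlation):

```python
import numpy as np

def conv1d_valid(series, kernel, bias=0.0):
    """Single-filter 'valid' 1D cross-correlation over a univariate series.

    Each output position sees len(kernel) consecutive time steps, so an
    n-step input yields n - len(kernel) + 1 outputs along the time axis.
    """
    k = len(kernel)
    return np.array([np.dot(series[i:i + k], kernel) + bias
                     for i in range(len(series) - k + 1)])
```

Because the outputs retain the ordering of the input series, the time axis survives the layer, which is why no explicit reshaping of the output was needed.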

LSTM modelling
The most recent neural network model evaluated was the LSTM recurrent neural network. LSTM networks proved to be particularly effective in TSF tasks due to their ability to handle and retain temporal dependencies. In this specific model, two LSTM layers were integrated, each comprising 128 neurons. Among the various activation functions available, Tanh was selected because it alleviated the vanishing gradient problem, achieving faster convergence compared to alternatives such as ReLU and sigmoid. Furthermore, the Tanh function produced both positive and negative outputs, facilitating state adjustments and less computationally expensive gradient computations.
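The gating mechanism that allows an LSTM to retain information over long horizons, with sigmoid recurrent activations for the gates and Tanh for the candidate and output transformations as described above, follows the standard formulation:

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{forget gate} \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{input gate} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{output gate} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{candidate state} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{cell state} \\
h_t &= o_t \odot \tanh(c_t) && \text{hidden state}
\end{aligned}
```

The additive update of the cell state $c_t$ is what lets gradients flow across many time steps without vanishing, in contrast to a plain RNN.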
Upon evaluation, this model produced a mean absolute error of 0.8282 and an error loss of 2.8247, representing the best error achieved so far. To ascertain the best-performing model, a crucial step involved comparing the validation errors. Table 1 showed the validation errors obtained for each model, clearly demonstrating that the LSTM outperformed the others by a significant margin, as it consistently yielded the lowest value. In the subsequent section, adjustments were made to the number of training features, and further fine-tuning of the LSTM model was carried out to optimise the performance of the neural network.

Model selection and tuning
To determine the superior ANN model, a comprehensive comparison of validation errors was conducted. Among the options, the LSTM neural network emerged as the most favourable choice for the dataset, exhibiting exceptional performance with the lowest validation error. Extensive efforts were devoted to tuning: the process involved multiple iterations of training and evaluation, aiming to enhance the predictive capabilities of the LSTM model. Strategies were employed to mitigate overfitting, including dataset reduction and the implementation of an early stopping callback function. These measures ensured that the model did not overly specialise in the training data and maintained its capacity for generalisation.
The LSTM model consistently surpassed alternative approaches, demonstrating proficiency in capturing intricate patterns and relationships within the dataset. The inherent ability of this model to handle temporal dependencies made it an ideal choice for the specified data. The diligent optimisation efforts applied to the LSTM further bolstered its performance and made it the most reliable and effective model for this task. The first step taken to reduce overfitting involved eliminating all rows except those obtained from the Spain dataset, leading to a smaller dataset. Additionally, six features were selected for the network training, namely today_deaths, today_confirmed, today_open_cases, today_recovered, day_of_week_sin, and day_of_week_cos.
After an extensive trial-and-error process lasting several weeks, the optimal configuration for the network was determined to be an ANN model with the following specifications: i) 14 inputs and 7 outputs; ii) 2 LSTM layers, each with 128 neurons; iii) 1 dense layer; iv) a Tanh activation function; v) a sigmoid recurrent activation function for the input, forget, and output gates; and vi) 2 dropout layers with a rate of 0.4 to randomly set input units to 0 at each training step to prevent overfitting.
By using the refined LSTM model, exceptional results were achieved, as evidenced by a mean absolute error of 0.0296 and an error loss of 0.0013, as shown in Table 1. The accuracy of the predictions, as seen in Figures 1 and 2, closely matched the corresponding labels. These results, along with the minimal error values, indicated that the implemented neural network performed exceptionally well and appeared to be free from both overfitting and underfitting. The close alignment between predictions and labels suggested that the model effectively captured the underlying patterns and relationships within the dataset. The negligible error values further validated the competence and reliability of the model in accurately forecasting the desired outputs. Overall, the optimised LSTM proved to be a robust and suitable choice, showcasing its ability to deliver precise and reliable results while mitigating the risks of overfitting and underfitting.

Model validation
This section focused on analysing the predictions of the "today_deaths" attribute for randomly selected EU Member States. The objective was to assess the performance of the tuned ANN model when dealing with datasets of varying sizes due to differences in data availability among countries. The analysis aimed to determine whether any further adjustments were necessary to achieve satisfactory results. The correct distribution of the data further reinforced the reliability of the ANN model, which effectively captured and replicated patterns observed during the peak period of deaths, suggesting that the model was successfully trained on the available data. As a result, no immediate adjustments or modifications were deemed necessary to ensure optimal performance. The analysis and visualisation of data from different EU Member States inspired confidence in the capability of the model to effectively predict the "today_deaths" attribute, even when the dataset size varies across countries.

ANN model applied to the data from France
When the ANN model was applied to France's data, it yielded a mean absolute error of 0.1038 and an error loss of 0.0157. Although the result obtained was somewhat high, a closer examination of Figures 3 and 4 showed a close alignment between several predictions and the corresponding labels. The validation error slightly exceeded the training error, primarily at the beginning of the dataset. These observations indicated that the ANN model initially exhibited some overfitting but gradually improved its performance, leading to low validation and training errors. While the validation error being slightly higher than the training error suggested a degree of overfitting, the overall low errors indicated a well-fitted model. It is worth noting that the proximity between the validation and training errors demonstrated the ability of the model to generalise and perform well on unseen data. Despite the moderately high mean absolute error, the combination of close predictions to labels, low validation error, and the similarity between training and validation errors indicated the effectiveness of the ANN in capturing underlying patterns and relationships in France's data. This implied that the model learned and adapted to the dataset effectively, achieving a satisfactory level of fit.

ANN model applied to the data from Germany
When the ANN was applied to the data from Germany, it demonstrated excellent performance. Specifically, the model achieved low training and validation errors, with the validation value slightly higher than the training value. The obtained mean absolute error of 0.0241 and error loss of 6.2180e-04 indicated superior accuracy and precision compared to the results obtained from the data of France. This strong performance was evident in Figures 5 and 6, where the predictions closely aligned with the labels, showcasing a robust fit between the model and the data.
The low errors in both training and validation, coupled with the proximity between the predicted and actual values, confirmed that the model effectively captured and replicated the underlying patterns and trends in the data from Germany. These results showed the successful learning and adaptation of the ANN model to the dataset, leading to highly accurate predictions. In summary, the performance of the ANN on the data from Germany surpassed that of France, reaffirming its capacity to accurately forecast the "today_deaths" attribute.

ANN model applied to the data from Sweden
Training the ANN model with the data from Sweden produced excellent results, comparable to those achieved with the data from France and Germany. The model achieved low training and validation errors, with the validation error slightly higher than the training error. Moreover, the obtained mean absolute error of 0.0139 and error loss of 2.5539e-04 indicated even better accuracy and precision compared to the previous results. The visual evidence in Figures 7 and 8 further supported the exceptional performance of the model on the data from Sweden, as all predictions were closely aligned with the corresponding labels. This reinforced the ability of the model to accurately capture the patterns and trends in the data, producing highly precise predictions.
The combination of low training and validation errors, along with the close alignment between predictions and labels, highlighted the remarkable performance of the ANN on these data. The results obtained outperformed those from France and Germany, indicating the superior fit and predictive capability of the model. In conclusion, the ANN model performed exceptionally well on the data from Sweden, exhibiting low errors and producing predictions that were remarkably close to the actual values. These results underscored the efficacy and reliability of the model in accurately forecasting the attribute "today_deaths" for Sweden.

ANN model applied to the data from Italy
When the ANN model was applied to the data from Italy, the obtained mean absolute error of 0.0150 and the nearly negligible error loss of 2.8903e-04 showed the high precision and accuracy of the model. These results exceeded the performance achieved in the previous three Member States, highlighting the effectiveness of the ANN model in accurately forecasting the "today_deaths" attribute for Italy. In summary, the ANN model demonstrated outstanding performance on the data from Italy, with predictions closely aligned with the labels, low errors, and minimal variance. These results showed the exceptional capability of the model to accurately forecast COVID-19-related deaths in Italy.

CONCLUSION
In conclusion, a comprehensive comparative analysis was conducted in this study to identify the most suitable deep-learning model for predicting COVID-19 pandemic data. The primary objective was to develop powerful models that could assist healthcare professionals and authorities in effectively managing the current pandemic and future outbreaks. Consequently, the work yielded several notable outcomes and contributions.
Firstly, the concepts and techniques of deep learning were successfully applied to health data prediction, showcasing the effectiveness of a range of deep-learning models specifically designed for TSF in addressing the COVID-19 data prediction problem. The study provided valuable insights into selecting the most appropriate model for accurate predictions. Secondly, the importance of incremental adjustments in neural networks for forecasting tasks was highlighted. In this context, the study outlined the process of fine-tuning a neural network to enhance its performance in predicting the data. The factors influencing the performance of neural networks were elucidated, and strategies were provided to mitigate issues such as overfitting and underfitting, which could hinder accurate predictions. Lastly, the refinement of the datasets used in training the neural network models was addressed. Effective techniques for cleaning and preprocessing datasets to eliminate incorrect, duplicate, or incomplete data were showcased. By ensuring the quality of the dataset, the study ensured that the neural network models achieved optimal performance in predicting COVID-19 data.
Overall, this work contributed significantly to the application of deep-learning models in health data prediction, particularly for COVID-19 forecasting. The results provided valuable knowledge and insights for healthcare professionals and policymakers involved in pandemic management and future outbreak preparedness. In future work, the dataset could be expanded with additional attributes such as vaccination rates, immune population percentages, and other relevant data to further enhance model performance. The inclusion of significant attributes could mitigate overfitting and underfitting issues. Additionally, there is potential to focus on predicting specific attributes for a future time frame, such as deaths, recoveries, or active cases. Machine learning techniques, including the LSTM, could also be applied to assist in managing emergency departments and intensive care units, aiding in predicting patient length of stay and resource allocation. Moreover, coupling the model with a business intelligence data warehouse has the potential to assist hospital managers in optimising costs and defining key performance indicators for effective hospital management.

Table 1. The performance of time series methods