Methodology for detection of paroxysmal atrial fibrillation based on P-Wave, HRV and QR electrical alternans features

Received Oct 24, 2019; Revised Feb 19, 2020; Accepted Feb 25, 2020

The detection of paroxysmal atrial fibrillation (PAF) is a fairly complex process performed manually by cardiologists or electrophysiologists by reading an electrocardiogram (ECG). Currently, computational techniques for automatic detection based on the fast Fourier transform (FFT), the Bayes optimal classifier (BOC), K-nearest neighbors (K-NN), and artificial neural networks (ANN) have been proposed. In this study, six features were obtained based on the morphology of the P-Wave, the QRS complex and the heart rate variability (HRV) of the ECG. The performance of this methodology was validated using clinical ECG signals from the Physionet arrhythmia database MIT-BIH. A feedforward neural network was used to detect the presence of PAF, reaching an overall accuracy of 97.4%. The results show that including the information of the P-Wave, HRV and QR electrical alternans increases the accuracy of PAF detection compared to other works that use the information of only one or at most two of them.


INTRODUCTION
Atrial fibrillation (AF) is the most frequently diagnosed cardiac arrhythmia in clinical practice, both in outpatients and hospitalized patients. Its prevalence and incidence increase with age, reaching epidemic proportions in senior citizens. The indicators of the progression of paroxysmal atrial fibrillation (PAF) to a persistent or permanent form have not been fully identified; therefore, detecting AF in its early form is important to avoid the risks of stroke, heart failure and/or mortality [1].
The detection of AF is performed manually by a cardiologist or electrophysiologist who interprets the electrocardiogram (ECG) records. This process is highly demanding due to both the number of records to be analyzed and the fact that it is sometimes necessary to examine each beat individually to ensure the correct identification of the cardiac pathology. Thus, an automated method for classification and detection would improve the diagnosis and prevention of AF [2][3][4][5].
To date, different authors have proposed methods that automate the detection of PAF. Some authors have reached a detection accuracy between 70% and 92% [6][7][8][9] using the characteristics of the P-Wave [10]; others propose the use of heart rate variability, obtaining an accuracy between 81.2% and 94.7% [7][11][12][13][14][15][16][17][18]; finally, [9] proposes the use of QR electrical alternation, reaching an accuracy of 70%. Accordingly, the problem of appropriately detecting a PAF is not fully solved yet, because the results achieved by these methods are not definitive and can still be improved. Therefore, this paper proposes a new methodology to address this problem, based [19] on the characteristics of the P-Wave, heart rate variability and QR electrical alternation. The accuracy obtained using this new method was 97.4%.
A PAF is characterized by irregular movement of the left atrium that prevents proper blood flow into the circulatory system, and also by a reduction of the time that the ventricular valves have to receive and send blood to the lungs. In an ECG signal, these two characteristics affect the morphology of the P-Wave and the distance between the P-Wave and the R-Wave (see Figure 1). Therefore, it is important to locate the characteristic points P-Onset and P-Offset, the P width and P height, as well as the PR segment, heart rate variability (HRV) and QR electrical alternation to fully describe a PAF. Different authors have relied on one or two of these characteristics for the detection of PAF; Table 1 lists some works that address the same theme and the characteristics they use. In this paper, unlike other works reported in the literature, the information of all three characteristics is used to cover all the symptoms of the PAF, and six relevant features are extracted to improve the detection rates. Sensitivity, specificity and accuracy were used as performance metrics for the evaluation of the proposed methodology.

RESEARCH METHOD
In the proposed methodology, a previously digitized ECG signal is received as input. The signal is processed in four main stages (preprocessing, characteristic point extraction, feature extraction and detection) to determine whether or not a PAF exists, as illustrated in Figure 2.

Preprocessing
The first part of the algorithm takes as input lead II of a standard 12-lead ECG signal. Because the sampling frequency of ECG signals varies, the signal is resampled to 1170 Hz to ensure a standard frequency for the subsequent application of a low-pass finite impulse response (FIR) digital filter and to allow each of the characteristic points of the signal to be established more precisely. This stage comprises three steps: resampling, moving average and filtering.

Resampling
The proposed methodology uses six features for the recognition of atrial fibrillation that are based on the morphology of the signal. Therefore, it is very important to preserve the frequency content as well as the shape of the signal during processing. For this reason, each beat must be represented by a sufficient number of points to ensure good detection and good feature extraction.
The height and width of the P-Wave are features that need a good morphological representation; thus, in this paper the resampling frequency was chosen to ensure that the P-Wave has at least 50 samples. Considering that there are documented cases of patients with PAF at the age of 22 [20], the maximum heart rate considered in this methodology was calculated using (1). The result obtained was 168 beats per minute (bpm).
Subsequently, the duration of the P-Wave was found to be 43 ms using (2).
Finally, a resampling frequency of approximately 1170 Hz was obtained through (3) by relating the 50 samples that represent the P-Wave to its duration.
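As a quick check of this last step, the relation between the sample count and the duration can be written out. The symbols $N_P$ (samples per P-Wave), $T_P$ (P-Wave duration) and $f_s$ (resampling frequency) are labels assumed here for illustration and are not necessarily the notation of (3).

```latex
f_s = \frac{N_P}{T_P} = \frac{50}{0.043\ \text{s}} \approx 1163\ \text{Hz}
```

This agrees with the value of approximately 1170 Hz adopted in the methodology.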

Moving average
ECG signals normally have a baseline wander that must be corrected to reference the voltage levels of the signal to a zero DC level. The moving average given in (4) is commonly used for this purpose and requires specifying a window size (M). This paper proposes obtaining M from the most common heart rate value present in the signal. To find this value, the frequency with the highest energy in the power spectral density of the signal, bounded between 60 bpm and 200 bpm, is obtained. M is then defined as the inverse of this frequency, rounded to the nearest even value, as shown in (5). The frequency was obtained by applying the fast Fourier transform (FFT) to the autocorrelation of the signal, given in (6).
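The window selection and baseline correction described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, M is assumed to be expressed in samples (one beat period, fs divided by the dominant frequency), and the moving average is assumed to be centered and subtracted from the signal.

```python
import numpy as np

def dominant_beat_frequency(ecg, fs, lo_bpm=60.0, hi_bpm=200.0):
    """Frequency (Hz) with the largest spectral value between lo_bpm and hi_bpm."""
    x = np.asarray(ecg, dtype=float)
    x = x - x.mean()
    # Autocorrelation over all lags; this is O(n^2), so keep the segment short.
    acf = np.correlate(x, x, mode="full")
    # Magnitude of the FFT of the autocorrelation, used as a PSD estimate.
    psd = np.abs(np.fft.rfft(acf))
    freqs = np.fft.rfftfreq(len(acf), d=1.0 / fs)
    band = (freqs >= lo_bpm / 60.0) & (freqs <= hi_bpm / 60.0)
    return float(freqs[band][np.argmax(psd[band])])

def window_size(f0, fs):
    """One beat period in samples, rounded to the nearest even integer (assumed units)."""
    return int(2 * round(fs / f0 / 2.0))

def remove_baseline(ecg, m):
    """Subtract a centered moving average of m samples (edges are zero-padded)."""
    x = np.asarray(ecg, dtype=float)
    baseline = np.convolve(x, np.ones(m) / m, mode="same")
    return x - baseline
```

Because the window spans roughly one beat, the beat itself averages out and mostly the slow baseline remains in the moving average. The first and last M/2 samples are biased by the zero padding and may need to be discarded.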

Filtering
An ECG signal is represented by (7), which comprises three components: the signal generated by cardiac activity, with a frequency range of 2.5 Hz to 45 Hz [21]; r(n), the electrical noise and white noise with frequencies greater than 45 Hz; and b(n), the baseline noise with frequencies less than 2.5 Hz [22].
The noise b(n) was already removed using the moving average in the previous step. In this step, a low-pass filter with a cutoff frequency of 45 Hz was designed to remove the noise r(n).
The preprocessing stage is summarized in Algorithm 1.
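The low-pass step can be sketched with a windowed-sinc FIR design. The text does not state the filter order or the window, so the 201 taps and the Hamming window below are assumptions, as are the function names.

```python
import numpy as np

def lowpass_fir(cutoff_hz, fs, numtaps=201):
    """Hamming-windowed sinc low-pass FIR with unit gain at DC (numtaps is assumed)."""
    n = np.arange(numtaps) - (numtaps - 1) / 2.0
    h = np.sinc(2.0 * cutoff_hz / fs * n)  # ideal low-pass impulse response
    h *= np.hamming(numtaps)               # taper to limit stopband ripple
    return h / h.sum()                     # normalize so the DC gain is 1

def apply_fir(ecg, taps):
    """Filter the signal; 'same' mode with symmetric odd-length taps adds no delay."""
    return np.convolve(np.asarray(ecg, dtype=float), taps, mode="same")

# Example: 45 Hz cutoff at the 1170 Hz resampling frequency.
taps = lowpass_fir(45.0, 1170.0)
```

A longer filter narrows the transition band around 45 Hz at the cost of more edge samples affected by zero padding.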

Characteristic point detection
In the second stage of the methodology, the P, Q, R and S peaks and the P-Onset, P-Offset and Q-Onset points were found on each beat of the ECG signal. These points, shown in Figure 3, are later used to extract the features of the beat.

R-Wave peak
In this step, a moving window four times the size M found in section 2.1.2 was used to find the R-Wave peak. The window moves along the ECG signal, finding peaks that exceed 0.6 times the maximum amplitude in the window and are separated by at least 353 ms, so that the detected heart rate does not exceed 170 bpm, the maximum value adopted in this methodology.
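A minimal sketch of this search is given below. It assumes non-overlapping windows of 4M samples and resolves two candidates closer than 353 ms by keeping the taller one; both choices, and the function name, are assumptions that the text does not spell out.

```python
import numpy as np

def detect_r_peaks(ecg, fs, m, rel_threshold=0.6, min_rr_s=0.353):
    """Local maxima above rel_threshold * window maximum, at least min_rr_s apart."""
    x = np.asarray(ecg, dtype=float)
    win = 4 * m
    min_gap = int(round(min_rr_s * fs))
    peaks = []
    for start in range(0, len(x), win):  # assumed: windows do not overlap
        stop = min(start + win, len(x))
        threshold = rel_threshold * x[start:stop].max()
        for i in range(max(start, 1), min(stop, len(x) - 1)):
            if x[i] <= threshold or x[i] <= x[i - 1] or x[i] < x[i + 1]:
                continue  # not a local maximum above the threshold
            if peaks and i - peaks[-1] < min_gap:
                if x[i] > x[peaks[-1]]:
                    peaks[-1] = i  # assumed: keep the taller of two close peaks
            else:
                peaks.append(i)
    return np.array(peaks, dtype=int)
```

The signal is expected to be baseline-corrected first, so that the window maximum reflects the R-Wave amplitude rather than a DC offset.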

P-Wave peak
The P-Wave peak was found based on the location of two consecutive R-Wave peaks. As seen in Figure 4, the P-Wave peak is the maximum value found within a search area defined between 70% and 90% of the distance between two consecutive R-Wave peaks.

Q-Wave peak
The Q-Wave peak is characterized by a negative peak located just before the R-Wave; for this reason, a derivative was used to search for this peak. According to (8), the value of the derivative is calculated at each sample, one at a time, moving backward from the R-Wave. This process continues until a derivative with a negative value is found, as seen in Figure 5. In this paper, a distance of eight samples is proposed in order to ignore small variations along the path that could produce a negative derivative.

S-Wave peak
The identification of the S-Wave peak was carried out following a procedure similar to that used for the P-Wave peak. This time, the minimum value was sought within an area defined between 0% and 10% of the distance between two consecutive R-Wave peaks, as shown in Figure 6.
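The two window searches (P-Wave peak and S-Wave peak) share the same structure and can be sketched with one helper. The function names and the rounding of the window limits to whole samples are assumptions.

```python
import numpy as np

def extremum_between_beats(ecg, r_prev, r_next, lo, hi, kind):
    """Index of the max or min within the [lo, hi] fraction of the R-R distance."""
    rr = r_next - r_prev
    a = r_prev + int(round(lo * rr))
    b = r_prev + int(round(hi * rr))
    segment = np.asarray(ecg, dtype=float)[a:b + 1]
    offset = np.argmax(segment) if kind == "max" else np.argmin(segment)
    return a + int(offset)

def p_peak(ecg, r_prev, r_next):
    """Maximum between 70% and 90% of the R-R distance."""
    return extremum_between_beats(ecg, r_prev, r_next, 0.70, 0.90, "max")

def s_peak(ecg, r_prev, r_next):
    """Minimum between 0% and 10% of the R-R distance."""
    return extremum_between_beats(ecg, r_prev, r_next, 0.00, 0.10, "min")
```

Note that the P-Wave found between two R-Wave peaks belongs to the second beat, while the S-Wave belongs to the first.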

P-Onset
The P-Onset point is defined as the sample where the P-Wave starts and ideally has a value of 0 mV. This point was found by evaluating the samples prior to the P-Wave peak one by one until the condition set in (9) was met. This equation accounts for the fact that, in practice, the P-Onset has a positive value higher than the baseline; therefore, a value of 0.15 times the amplitude of the P-Wave peak was used to find it.

P-Offset
This characteristic point is defined as the sample where the P-Wave ends. To find this point, we proceeded in a similar way to the method used to find the P-Onset with the difference that the samples evaluated are located after the P-Wave peak.
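The P-Onset and P-Offset searches can be sketched with a single threshold walk. Since the exact form of (9) is not reproduced here, the stopping condition below (signal at or below 0.15 times the P-Wave peak amplitude) is an assumed reading of the text, and the function name is hypothetical.

```python
import numpy as np

def p_wave_boundary(ecg, p_peak, direction, fraction=0.15):
    """Walk from the P peak until the signal drops to fraction * P amplitude.

    direction = -1 searches backward (P-Onset); +1 searches forward (P-Offset).
    """
    x = np.asarray(ecg, dtype=float)
    level = fraction * x[p_peak]
    n = p_peak
    # Stop at the record limits if the level is never reached.
    while 0 < n < len(x) - 1 and x[n] > level:
        n += direction
    return n
```

For example, `p_wave_boundary(ecg, p, -1)` gives the P-Onset and `p_wave_boundary(ecg, p, +1)` the P-Offset for a P-Wave peak at index `p`.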

Q-Onset
Q-Onset is the sample where the Q-Wave begins. To find this characteristic point, a method similar to that of section 2.2.3 was used. The derivative described by (10) is evaluated at each of the samples before the Q-Wave peak, one by one, until a positive value is found. In this case, a greater sensitivity than that needed to find the Q-Wave peak is required; thus, the distance was reduced from eight to four samples.
The characteristic point detection stage is summarized in Algorithm 2.
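The two backward derivative searches (Q-Wave peak and Q-Onset) can be sketched as follows. Equations (8) and (10) are not reproduced here, so the finite difference x[n] - x[n - distance] is an assumed form of the derivative. The final refinement of the Q-Wave peak to the local minimum is also an addition of this sketch, not part of the described method.

```python
import numpy as np

def _walk_back(x, start, distance, want_positive):
    """First index, moving backward from start, where the difference has the wanted sign."""
    for n in range(start, distance - 1, -1):
        d = x[n] - x[n - distance]  # assumed finite-difference derivative
        if (want_positive and d > 0) or (not want_positive and d < 0):
            return n
    return None  # no sign change before the start of the record

def q_peak(ecg, r_peak, distance=8):
    """Walk back from the R peak until the derivative turns negative."""
    x = np.asarray(ecg, dtype=float)
    n = _walk_back(x, r_peak, distance, want_positive=False)
    if n is None:
        return None
    lo = n - distance
    # Assumed refinement: take the minimum inside the span that changed sign.
    return lo + int(np.argmin(x[lo:n + 1]))

def q_onset(ecg, q_peak_idx, distance=4):
    """Walk back from the Q peak until the derivative turns positive."""
    x = np.asarray(ecg, dtype=float)
    return _walk_back(x, q_peak_idx, distance, want_positive=True)
```

On a perfectly flat baseline the derivative never becomes strictly positive, so `q_onset` returns `None`; real recordings normally contain enough variation for the search to stop.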

Features extraction
Once the characteristic points have been identified, the six features presented in the third stage of the methodology described in Figure 2 are extracted for each beat of the ECG signal. The first three features, P-Wave height, P-Wave width and PR segment, are the magnitude of the P-Wave peak, the difference between P-Offset and P-Onset, and the difference between Q-Onset and P-Offset, respectively. The fourth feature, P-Wave area, is defined as the area under the curve between P-Onset and P-Offset. Since the ECG signal is discrete, trapezoidal numerical integration is used to approximate the integral of the signal between these two points, as described by (11).
The fifth feature, heart rate variability (HRV), is the number of beats per minute (bpm) that would be generated according to the distance between two consecutive R-Wave peaks, as described by (12).
The terms of (12) are the beat number, the location of the R-Wave peak for that beat, and the resampling frequency, 1170 Hz in this case. The sixth and last feature, QR electrical alternans, is defined as the difference between the amplitude of the R-Wave peak and that of the Q-Wave peak. The features extraction stage is summarized in Algorithm 3.
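The six features of one beat can be sketched as below. The function and key names are hypothetical, and expressing widths in seconds and the area in amplitude units times seconds is an assumption, since the text does not state the units.

```python
import numpy as np

def beat_features(ecg, fs, p_on, p_peak, p_off, q_on, q_peak, r_peak, r_next):
    """Six per-beat features from characteristic point indices (in samples)."""
    x = np.asarray(ecg, dtype=float)
    segment = x[p_on:p_off + 1]
    # Trapezoidal rule with uniform spacing 1/fs between samples.
    area = (segment.sum() - 0.5 * (segment[0] + segment[-1])) / fs
    return {
        "p_height": x[p_peak],
        "p_width": (p_off - p_on) / fs,             # seconds (assumed unit)
        "pr_segment": (q_on - p_off) / fs,          # seconds (assumed unit)
        "p_area": area,
        "hrv_bpm": 60.0 * fs / (r_next - r_peak),   # beats per minute
        "qr_alternans": x[r_peak] - x[q_peak],
    }
```

For instance, two R-Wave peaks 585 samples apart at 1170 Hz are 0.5 s apart, which corresponds to 120 bpm.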

Detection
Detection is the final stage of the proposed methodology. To determine the presence of a PAF in the ECG, a feedforward neural network with two hidden layers, each with 10 neurons, was used as a classifier [23]. This neural network was trained using 60% of the information in the database shown in Table 2 to identify whether or not a PAF is present in each beat of the ECG.
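The shape of such a classifier can be sketched as a forward pass through a 6-10-10-1 network. Only the layer sizes and the 60% training share come from the text; the tanh hidden activation, the sigmoid output, the random initial weights, the random split and the function names are all assumptions, and no training procedure is shown.

```python
import numpy as np

def init_network(rng, sizes=(6, 10, 10, 1)):
    """Random weights and zero biases for a fully connected network."""
    return [(rng.normal(0.0, 0.5, size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(params, features):
    """Score in (0, 1) per beat; features has shape (n_beats, 6)."""
    a = np.asarray(features, dtype=float)
    for w, b in params[:-1]:
        a = np.tanh(a @ w + b)                   # hidden layers (assumed activation)
    w, b = params[-1]
    return 1.0 / (1.0 + np.exp(-(a @ w + b)))    # sigmoid output (assumed)

def split_60_40(n_beats, rng):
    """Random indices: 60% for training, the rest held out (assumed use)."""
    idx = rng.permutation(n_beats)
    cut = int(round(0.6 * n_beats))
    return idx[:cut], idx[cut:]
```

In practice the features would usually be normalized before being fed to the network, since they have very different scales.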

Performance metrics
Sensitivity (SN), specificity (SP) and accuracy (ACC), shown in (13), (14) and (15) respectively, were calculated since these are the most widely used performance metrics to assess the probability of success of a classifier [24]. Table 3 shows the results of these metrics in different works reported in the literature.
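These three metrics follow their standard definitions in terms of the confusion matrix counts, as in the short sketch below. The function name and the example counts are illustrative only and are not results from this work.

```python
def classification_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity and accuracy from confusion matrix counts."""
    sn = tp / (tp + fn)                    # true positive rate
    sp = tn / (tn + fp)                    # true negative rate
    acc = (tp + tn) / (tp + tn + fp + fn)  # overall fraction correct
    return sn, sp, acc

# Illustrative counts only: 90 TP, 95 TN, 5 FP, 10 FN.
sn, sp, acc = classification_metrics(90, 95, 5, 10)
```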

RESULTS AND ANALYSIS
To evaluate the proposed methodology, the Atrial Fibrillation (afdb) and Normal Sinus Rhythm (nsrdb) databases from Physionet [25] were used. These provide ECG signal samples from both sick and healthy patients. Each signal is processed using the methodology described above. Figure 7(a) shows an original ECG signal from the database, while Figure 7(b) shows the signal after preprocessing. Finally, Figure 7(c) shows the signal with its characteristic points.
The feature extraction was applied to each of the records in both databases. Six features were obtained for each of a total of 99,002 beats, as illustrated in Table 2. To check the linear independence of the features, the degree of correlation between them was determined through the correlation matrix. As shown in Table 4, the correlation between the six features is low in all cases except among P-Wave area, P-Wave height and P-Wave width, where it is moderate. These results indicate that the features obtained through the proposed methodology are suitable for the training of a neural network.
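A correlation matrix of this kind can be computed as in the sketch below, assuming the Pearson correlation coefficient and a feature array with one row per beat and one column per feature. The function name is hypothetical.

```python
import numpy as np

def feature_correlation(features):
    """Pearson correlation matrix; features has shape (n_beats, n_features)."""
    return np.corrcoef(np.asarray(features, dtype=float), rowvar=False)
```

Entries close to 0 indicate weakly related features, while entries close to 1 or -1 indicate features that carry largely redundant information.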
The PAF was detected through a feedforward neural network whose training data corresponded to 60% of the information provided by the 99,002 beats obtained before. The network was trained on 10 different occasions, and the SN, SP and ACC were obtained for each training run. The maximum, minimum, average and standard deviation of each performance metric are shown in Table 5.
A comparative analysis of the performance metrics was carried out between the classifiers used in similar works and the proposed methodology. The results are shown in Table 3. The proposed methodology obtained a minimum SN of 96.4%, which is higher than that of the other works. The SP reached a maximum value of 98.1%, surpassed only by [8]; however, the ACC exceeds that of all the reported works, even at its minimum value of 96.3%.

CONCLUSION
To date, different authors have proposed methods that automate the detection of PAF using the characteristics of the P-Wave, heart rate variability or QR electrical alternation. The accuracy reached by these methods varies between 70% and 94.7%. Thus, the problem of appropriately detecting a PAF is not fully solved yet, because the results achieved by these methods are not definitive and can still be improved.
This paper proposes a methodology to identify the presence of a PAF in patients by analyzing their ECG. The methodology includes both the identification of the characteristic points of the ECG signal and the methods to extract six features that allow a PAF to be detected through a classifier. The results show that including the information of the P-Wave, HRV and QR electrical alternans in the feature extraction increased the accuracy of PAF detection to 97.4% on average. The SN obtained was higher than that obtained in other works, reaching at least 96.4%. The SP was similar to the results obtained by the works consulted. The results obtained serve as the basis for the future implementation of a methodology that allows predicting the occurrence of a PAF in a given period of time.