A real-time fault diagnosis system for high-speed power system protection based on machine learning algorithms

Received Mar 31, 2020. Revised May 27, 2020. Accepted Jun 17, 2020.

This paper puts forward a real-time smart fault diagnosis system (SFDS) intended for high-speed protection of power system transmission lines. The system is based on advanced signal processing techniques, results from traveling wave theory, and machine learning algorithms. Simulation results show that the SFDS provides accurate internal/external fault discrimination, fault inception time estimation, fault type identification, and fault location. The paper also presents the hardware requirements and software implementation of the SFDS.


INTRODUCTION
To prevent equipment damage as well as personal injury and to ensure reliable, stable, and affordable electric energy supply, extra-high voltage networks should be fitted with efficient high-speed fault protection relays and schemes. Indeed, inefficient protection relay systems disturb the regular operation of electric power systems and increase the risk of cascading blackouts [1]. Consequently, a smart fault diagnosis system should be developed to monitor and control the operation of a classical relay, or to be itself used as a smart relay. This system can also be used for the ex-post analysis of oscillographic fault records to make a report on what might have happened during the fault condition. Thus, the purpose of a smart fault diagnosis system is to provide precise information on the external/internal fault discrimination, fault inception time, fault type, and fault location.
The fault diagnosis systems suggested in the literature are mainly based on classical approaches and artificial neural networks: systems based on classic circuit theory [2, 3], on symmetrical component theory and the digital wavelet transform (DWT) [4-6], and on artificial neural networks (ANN) [8, 9]. The approaches based on symmetrical component theory and the DWT [4-6] require phasor estimation, whose correctness depends heavily on the efficiency of non-power-frequency filtering [7]. The systems based on ANNs [8, 9] give satisfying results on classification tasks (fault type identification) but poor results on regression tasks (fault location). Moreover, ANNs are challenging to interpret (it is hard to understand the reasons behind the results they give), to structure (there is no systematic procedure for determining their architecture), and to train (computational complexity grows with the number of neurons and layers).

As depicted in Figure 1, the online processing starts with the acquisition of digital time-domain voltage and current measurements from instrument transformers through an A/D converter. The superimposed quantities are then extracted from these measurements and transformed into Clarke modal components; these modal components are filtered by a differentiator-smoother filter and processed by the peak finding algorithm when a fault event is detected. Once the peak of the filtered superimposed quantities is detected, the impedance angle is estimated from the non-filtered superimposed quantities and used as a discriminator for internal faults. When the detected fault is internal, the RMS values of the non-filtered superimposed quantities are calculated and used as inputs of the fault classifier and the fault locator.
The fault inception time is estimated from the information obtained about the index of the filtered superimposed quantities peak and the equations derived from the traveling wave theory.

THEORETICAL BACKGROUND
This section gives more insight into the different techniques cited during the description of the SFDS in the previous section.

Traveling waves and superimposed quantities
As shown in Figure 2, when a fault occurs on a transmission line, it produces transients that propagate at a pace approaching the speed of light in the form of traveling waves. As soon as a wave arrives at the measurement point, the transient voltage and current are superimposed on the steady-state (pre-fault) voltage and current, respectively [10]. Since this theory holds only for single-phase transmission lines, its extension to three-phase transmission lines requires the transformation of the measured voltages and currents from the phase domain (p ∈ {a, b, c}) into the modal domain (m ∈ {α, β, 0}). In the modal domain, each mode is independent of the others and can therefore be treated separately as a single-phase transmission line. For this purpose, we used the Clarke transform, which is the most suitable for real-time applications since it involves only real numbers, unlike the Fortescue transform, which involves complex numbers [11, 12]. Consequently, we can derive the wave-arrival expressions at buses (6) and (9), where b ∈ {6, 9} stands for bus (6) or bus (9), m denotes the α-mode, β-mode, or 0-mode of the Clarke transform, v_m is the propagation speed of the mode-m traveling wave, F_s is the sampling frequency, k = 1, 2, … is the k-th sample, ℓ is the length of the line of concern, and d is the distance of the fault from bus (6).
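As a concrete illustration, the following is a minimal Python sketch of the Clarke transform applied to three-phase quantities; the function name is illustrative, not the authors' code. It uses only real arithmetic, which is the property that makes it attractive for real-time use compared with the complex-valued Fortescue transform.

```python
import numpy as np

# Classic Clarke transformation matrix mapping phases (a, b, c)
# onto the (alpha, beta, 0) modal components.
CLARKE = (1.0 / 3.0) * np.array([
    [2.0, -1.0, -1.0],                    # alpha mode
    [0.0, np.sqrt(3.0), -np.sqrt(3.0)],   # beta mode
    [1.0, 1.0, 1.0],                      # zero mode
])

def clarke_modes(phase_abc):
    """phase_abc: array of shape (3, n_samples) holding phases a, b, c.
    Returns an array of the same shape holding the alpha, beta, 0 modes."""
    return CLARKE @ np.asarray(phase_abc, dtype=float)
```

For a balanced three-phase set the zero mode vanishes, so any nonzero 0-mode component is itself a useful indicator of ground involvement.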

The Clarke modal components of the superimposed quantities are obtained from the phase components of the measured signals and the pre-fault signals by the expression Δx_m[k] = T (x_p[k] − x_p,pre[k]), where T is the Clarke transformation matrix. Supposing that both line ends are synchronized, the wave arrival times t6 and t9 indicated in the lattice diagram are measured with respect to the same mode and a common time reference. Thus, we can compute the fault inception time using the relation t0 = (t6 + t9)/2 − ℓ/(2 v_m).
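The two-end relations above can be made concrete in a short sketch. Assuming the standard traveling-wave equations t6 = t0 + d/v and t9 = t0 + (ℓ − d)/v for a fault at distance d from bus (6), both the inception time and the fault distance follow by elimination; the function name is illustrative.

```python
def inception_time_and_distance(t6, t9, line_len, v):
    """Two-end traveling-wave relations (sketch, symbols as in the text):
        t6 = t0 + d / v               # first wave arrival at bus (6)
        t9 = t0 + (line_len - d) / v  # first wave arrival at bus (9)
    Solving for the inception time t0 and the fault distance d from bus (6)."""
    t0 = 0.5 * (t6 + t9) - line_len / (2.0 * v)
    d = 0.5 * (v * (t6 - t9) + line_len)
    return t0, d
```

Note that t0 depends only on the sum of the arrival times, while d depends only on their difference, so GPS synchronization errors affect the two estimates differently.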

Differentiator-smoother filter
For the filtering process, we use a simple differentiator-smoother filter with a square data window, as depicted in Figure 3(a) [13]. Considered over a window of 20 µs, this filter can capture the abrupt change in the superimposed quantities when the traveling wave reaches the measurement point; in other words, it captures the transition of the superimposed quantities from zero to their maximum nonzero value. Indeed, as shown in Figure 3(b), this differentiator-smoother filter has a triangular step response that allows the traveling wave arrival time to be detected by time-stamping the triangle peak. As presented in Figure 3(c), we chose the amplitude of the square data window so that the amplitude of the triangular output signal matches the magnitude of the step change in the input signal. The relationship between these elements, under causality, linearity, and time-invariance conditions, is expressed by the convolution product.
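A minimal sketch of such a filter follows; the window length and names are illustrative. The taps are normalized so that the triangular output peak matches the magnitude of the input step, as required above.

```python
import numpy as np

def diff_smoother(x, window=20):
    """Differentiator-smoother: a square data window of `window` positive taps
    followed by `window` negative taps, scaled by 1/window. Its response to a
    unit step is a triangle whose peak marks the step (wave arrival) instant."""
    x = np.asarray(x, dtype=float)
    h = np.concatenate([np.ones(window), -np.ones(window)]) / window
    return np.convolve(x, h, mode="full")[: x.size]

def arrival_index(y, threshold):
    """Time-stamp the triangle peak if it exceeds the detection threshold;
    return None when no fault event is detected."""
    k = int(np.argmax(np.abs(y)))
    return k if abs(y[k]) > threshold else None
```

With this normalization, a step of amplitude A at sample s produces an output triangle of peak amplitude A at sample s + window − 1, so the arrival instant is recovered by subtracting the known filter delay.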

Internal/external fault discrimination criterion
According to [14, 15], when a fault occurs in the forward direction of a relay, the modal superimposed voltages Δv_m[•] and currents Δi_m[•] have different polarities. Hence, if an internal fault is assumed to occur on our line of interest, these opposite polarities should be observed at both buses (6) and (9), and we deduce that for a fault to be internal, the impedance angle at both ends of the line should be close to π. In real-time applications, we propose to estimate the impedance angle using the following formula, derived from the dot product and the active-power definition:

θ̂_b = arccos( Σ_{k=1}^{N} Δv_m^b[k] Δi_m^b[k] / ( √(Σ_{k=1}^{N} (Δv_m^b[k])²) √(Σ_{k=1}^{N} (Δi_m^b[k])²) ) ),

where b ∈ {6, 9}, m stands for the α-mode or the β-mode, N is the number of samples considered, and θ̂_b is the estimated impedance angle at bus b. For internal/external fault discrimination, we propose the criterion that θ̂_6 and θ̂_9 both lie within an angular threshold ε of π; the threshold ε is determined empirically.
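A minimal sketch of this dot-product angle estimate and the discrimination test is given below, assuming the criterion checks that both end angles are close to π (opposite polarities); function names and the threshold handling are illustrative.

```python
import numpy as np

def impedance_angle(dv, di):
    """Estimate the angle between superimposed modal voltage and current over
    N samples via the normalized dot product (a discrete analogue of the
    active-power definition). Returns an angle in [0, pi]; values near pi
    indicate opposite polarities, i.e. a fault in the forward direction."""
    dv = np.asarray(dv, dtype=float)
    di = np.asarray(di, dtype=float)
    cos_theta = np.dot(dv, di) / (np.linalg.norm(dv) * np.linalg.norm(di))
    return float(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

def is_internal(theta6, theta9, eps):
    """Discrimination criterion (sketch): both end angles within eps of pi."""
    return abs(np.pi - theta6) <= eps and abs(np.pi - theta9) <= eps
```

The `np.clip` guards against floating-point round-off pushing the cosine slightly outside [−1, 1] before `arccos`.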

K-nearest neighbors (KNN) algorithm and fault classifier
Under the assumption that similar inputs share the same properties, for a D-dimensional query input, the KNN algorithm assigns the most common label among its K most similar training inputs [16, 17].
Let D denote the training dataset, dist(x, z) the distance between two D-dimensional points x = [x_1 … x_D]^T and z = [z_1 … z_D]^T (T denoting matrix/vector transposition), and S_{x*} the set of the K nearest neighbors of a query point x*. Formally, S_{x*} is the subset of D of size K such that every point belonging to D but not to S_{x*} is at least as far from the query point x* as the furthest point in S_{x*}. The classifier h(•) can then be defined as the hypothesis that returns the label of highest occurrence in S_{x*}:

h(x*) = mode({y^(i) : (x^(i), y^(i)) ∈ S_{x*}}),

where mode(•) selects the most frequent label. As for the distance function on which KNN relies, the Minkowski metric is the most commonly used to reflect the closeness between a query point and the inputs of the training set, and by extension the similarity between their labels [18]:

dist(x, z) = ( Σ_{d=1}^{D} |x_d − z_d|^p )^{1/p}.

As shown in Figure 4, the fault classifier is composed of two stages. The first stage contains three functions based on the KNN algorithm, each indicating whether or not its corresponding phase is involved in the fault condition. The second stage indicates, based on comparison with an empirically determined threshold, whether or not the ground is involved in the fault condition.

Gaussian process (GP) algorithm and fault locator
As part of the Gaussian process (GP) framework, we assume that the training outputs y and the test outputs y* are drawn from the following joint Gaussian distribution [19, 20]:

[y; y*] ~ N( [0; 0*], [K, K*; K*^T, K**] ),

where y is the column vector of training outputs, 0 is a column vector of N zeros, and 0* is a column vector of N* zeros. K, K*, and K** are kernel matrices defined by a kernel function k(•, •) as follows: (K)_{ij} = k(x^(i), x^(j)) + σ_n² δ_{ij}, where δ_{ij} is the Kronecker delta, (K*)_{ij} = k(x^(i), x*^(j)), and (K**)_{ij} = k(x*^(i), x*^(j)). Supervised learning using GPs is based on the idea that points with similar inputs x^(i) naturally have close output values y^(i). This similarity is expressed by the covariance function, which can be defined by different kernel functions.
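The KNN classification scheme described above can be sketched in a few lines; the function names are illustrative, and K and p are free parameters to be selected empirically, as discussed in the simulation section.

```python
import numpy as np
from collections import Counter

def minkowski(x, z, p=2):
    """Minkowski distance between two D-dimensional points (p=2: Euclidean)."""
    return float(np.sum(np.abs(np.asarray(x, float) - np.asarray(z, float)) ** p) ** (1.0 / p))

def knn_classify(x_query, X_train, labels, k=3, p=2):
    """Return the most frequent label among the k nearest training inputs."""
    d = [minkowski(x_query, x, p) for x in X_train]
    nearest = np.argsort(d)[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]
```

In the fault classifier, three such functions (one per phase) would each output a binary phase-involvement decision, and the second stage would compare the zero-mode superimposed current RMS against a threshold to decide ground involvement.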
Among the various kernels that exist, we can mention the rational quadratic (RQ) kernel and the squared exponential (SE) kernel, defined by equations (16) and (17), respectively:

k_RQ(x, x') = σ_f² (1 + ‖x − x'‖² / (2αℓ²))^{−α},     (16)
k_SE(x, x') = σ_f² exp(−‖x − x'‖² / (2ℓ²)),           (17)

where ℓ > 0 is the characteristic length scale, σ_f is the signal standard deviation, α is a positive-valued scale-mixture parameter, and the hyperparameters are handled in log form (θ_1 = ln ℓ, θ_2 = ln σ_f, …). The estimated test outputs are the components of the posterior mean vector, i.e., ŷ*^(1) = (μ*)_1, …, ŷ*^(N*) = (μ*)_{N*}. As shown in Figure 5, the inputs of the fault locator, Δv and Δi, are the root mean square values of the time-domain discrete superimposed voltages and currents Δv[•] and Δi[•] measured in the phase domain. The fault locator gives the estimated fault location d̂ with respect to bus (6), and it is composed of ten GPs, one per fault type (AG, BG, CG, AB, BC, CA, ABG, BCG, CAG, and ABC). Each of these GPs predicts the fault location and is activated when the fault type to which it is related is involved.
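A from-scratch sketch of GP regression with the RQ kernel of equation (16) is given below; hyperparameter values and function names are illustrative, and only the posterior mean (the quantity used by the fault locator) is computed.

```python
import numpy as np

def rq_kernel(A, B, ell=1.0, sigma_f=1.0, alpha=1.0):
    """Rational quadratic kernel, eq. (16):
    k(x, x') = sigma_f^2 * (1 + ||x - x'||^2 / (2 * alpha * ell^2))^(-alpha)."""
    A = np.asarray(A, float); B = np.asarray(B, float)
    r2 = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return sigma_f**2 * (1.0 + r2 / (2.0 * alpha * ell**2)) ** (-alpha)

def gp_posterior_mean(X, y, X_star, kernel, noise=1e-6):
    """Posterior mean of a zero-mean GP: mu* = K*^T (K + sigma_n^2 I)^{-1} y."""
    K = kernel(X, X) + noise * np.eye(len(X))
    K_star = kernel(X, X_star)
    return K_star.T @ np.linalg.solve(K, y)
```

For the fault locator, X would hold the RMS superimposed quantities and y the known fault distances of the training set; one such GP would be trained per fault type.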

HARDWARE REQUIREMENTS AND SOFTWARE IMPLEMENTATION OF THE SFDS
It should be noted first that all SFDSs have a similar structure; for illustration purposes, we will consider only the SFDS of bus (6) (SFDS6), shown in Figure 6, and its relation to that of bus (9) (SFDS9). That being said, the SFDS should meet the following main hardware requirements:
- A CPU on which to implement the SFDS, a GPS receiver for data sampling synchronization, and a high-speed communication framework as described in [21].
- Digital fault recorder (DFR) buffer: records the 60 Hz steady-state signals over 3 cycles. The length of the DFR buffer should be on the order of L0(6) = r × L1(6), where r = F_DFR / F_s ≥ 1 and F_DFR is the sampling rate of the DFR buffer. It is advised to choose F_DFR so that r is an integer.
- Data signal processing buffer: conceptually split into five buffers. The filtering process buffer has the same structure as the SU buffer, except that its length equals the filter window length; since our differentiator-smoother filter covers a time period of 20 µs and the SU sampling rate is F_s = 1 MHz, the length of the filtering process buffer is L2(6) = 20. The data recording and saving process buffer also has the same structure as the SU buffer, except that its length is L3(6) = 640; this length has been determined from the time needed for a refracted wave to travel back and forth between bus (6) and bus (4) at the speed of light. The feature extraction process (FEP) buffer contains the RMS values of the phase-domain superimposed currents and voltages of bus (6), the RMS value of the zero-mode superimposed current of bus (6), and the RMS values of the phase-domain superimposed currents and voltages of bus (9). The remaining buffers are used for the fault detection process.
- ROM6: during the offline conception of the SFDS, the considered power system should be simulated in normal conditions, and the phase-domain steady-state voltage and current signals should be recorded and saved in the steady-state signals buffer, which has the same structure as the SU buffer.
These signals will be used during the online process to calculate the superimposed quantities. The second ROM buffer contains the coefficients of the differentiator-smoother filter. The software implementation of the SFDS is depicted in Figure 7 and commented as follows:
- Step (1)


- Step (3.4): checks whether or not the index of buffer L0(6) has reached the end of L0(6).
- Step (32): launches the feature extraction process (FEP) demonstrated in Figure 11:
- Step (32.1): indicates the beginning of the FEP.
- Step (32.2): checks whether or not the bus-(6) flag is set to one.
- Step (32.6): ends the FEP and goes to step (33), where the fault classifier (see Figure 4) identifies the type of the internal fault (step (34)).
- Step (35): requests the RMS values of the phase-domain superimposed quantities from the FEP buffer L4(9).
- Step (36): once received, these RMS values are stored in their corresponding positions in L4(6) (see Figure 6).
- Steps (37) and (38): the fault locator (see Figure 5) estimates, from the values saved in L4(6), the fault location d̂.
- Step (39): ends the online process.
It should be noted that the flags, indices, and loaded data are handled by the CPU internal memory.
Figure 11. The feature extraction process (FEP) flowchart

SIMULATION RESULTS
In order to assess its performance, we implemented the SFDS in a simulation study of the WSCC 3-machine 9-bus test system (see Figure 14) [22], which we modeled and evaluated using the Matlab/Simulink software package. Since we were concerned with the 180 km transmission line joining bus (6) and bus (9), the training and test data essentially contained the phase-domain voltages and currents measured by CT6, VT6, CT9, and VT9. The training and test data were randomly and uniformly generated under various fault conditions. Indeed, for all fault types (AG, …, ABC), the fault resistance (in ohms) was drawn from U(0.001; 40), the fault location (in kilometers) was drawn from U(1; 179), and the fault inception angle (in degrees) was drawn from U(0; 360), where U(a; b) stands for the uniform distribution over the real interval [a, b].
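The data-generation scheme above can be sketched as follows; the function name and the dictionary layout are illustrative, and the ten standard shunt fault types are assumed.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility

def sample_fault_conditions(n):
    """Draw n random fault scenarios as described above:
    resistance ~ U(0.001, 40) ohm, location ~ U(1, 179) km,
    inception angle ~ U(0, 360) degrees, and a random fault type."""
    fault_types = ["AG", "BG", "CG", "AB", "BC", "CA",
                   "ABG", "BCG", "CAG", "ABC"]
    return [{
        "type": str(rng.choice(fault_types)),
        "R_f": rng.uniform(0.001, 40.0),
        "d_km": rng.uniform(1.0, 179.0),
        "angle_deg": rng.uniform(0.0, 360.0),
    } for _ in range(n)]
```

Each scenario would then be fed to the Simulink model to produce one labeled training or test instance.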
To make sure that in the case of an internal fault the discriminator triggers a true alarm, while considering ε = 90° and respecting the criterion given by equation (11), we evaluated the discriminator using a dataset containing about 1000 estimations of the impedance angles at buses (6) and (9).
By simulating internal faults under the conditions described in the paragraph above, we assessed the performance of the discriminator using the sensitivity indicator S = 100 × N_a / N, where N_a is the number of samples that triggered a true alarm and N (= 1000) is the total number of samples generated under internal fault conditions. Figure 12 shows the peak detection time at bus (6), t6. The development of the first stage of the fault classifier of SFDS6 went through a model selection process to determine the suitable input features, the suitable output features (labels), the number of nearest neighbors to consider, and the adequate distance metric. For this process, we used a training set of about 5000 instances, and the best setting we found is listed in Table 1. As explained in section 3.4, the training set is also used by the classification functions h_A(•), h_B(•), and h_C(•) to determine the closest points among the training inputs to a new test input; we evaluated the performance of these classifiers in the context of the setting described in Table 1. Concerning the second stage of the fault classifier of SFDS6, we found relation (27), which is depicted in Figure 13, where the red samples represent ground faults and the blue ones represent faults where the ground is not involved.
Figure 13. Samples with the ground involved vs. samples with the ground not involved
In like manner, the conception of the fault locator went through a 10-fold cross-validation selection process that used about 500 examples per GP to determine the most suitable input features and kernel functions for the different GPs shown in Figure 5. At the end of this process, we determined the most suitable input features, including Δv6, for each GP (see Figure 5, and processes (32), (35), and (36)).
Moreover, this model selection process showed that the RQ kernel (see equation (16)) was the most suitable kernel function for all GPs, except for one GP, for which the SE kernel (see equation (17)) was more appropriate. After the selection process, the training of the GPs took place.
Training the GPs consisted of using the training data to tune the kernel function parameters by maximizing the marginal log likelihood ln p(y | X, θ) using the quasi-Newton optimization method [23], where p(•) is the density function of the multivariate Gaussian distribution, y = [y^(1) … y^(500)]^T is the column vector of the outputs (distances) of the training set, X = [x^(1) … x^(500)]^T is the design matrix, and each input x = [Δ(6) Δ(9)]^T ∈ ℝ^12 groups the RMS superimposed quantities of buses (6) and (9). In order to assess the performance of the trained GPs, we generated for each of them a test set containing 100 instances; an ID identifies the fault type (and by extension the considered GP) as shown in Table 2. The performance of these GPs is given by the average relative error (ARE) in fault location, where d is the actual and d̂ the estimated output (distance). The obtained results are listed in Table 2.
The results in (23), (24), (26), and Table 2 clearly show that the proposed smart fault diagnosis system is accurate and reliable. In fact, the value of the discriminator sensitivity obtained in equation (23) shows that the SFDS is highly reliable, since the percentage of internal faults that are correctly detected is 95%. Moreover, the value obtained in equation (24) indicates that the difference between the fault inception time predicted by the SFDS and the observable fault inception time is, on average, of the order of 0.0164. This small average difference proves that the proposed SFDS estimates the fault inception time with high accuracy. Furthermore, the SFDS correctly classifies all types of faults with an accuracy of about 97.7% according to relation (26). This result concurs with the classification accuracies stated in [24]. It should also be mentioned that the very small fault-location average relative errors listed in Table 2 show that the SFDS can localize faults with high accuracy regardless of the type of fault involved.
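The two evaluation indicators used above can be sketched as follows. Since the exact ARE normalization did not survive in the text, the version below normalizes by the actual fault distance, which is one common convention (normalizing by the line length is another); function names are illustrative.

```python
import numpy as np

def sensitivity_pct(n_true_alarms, n_total):
    """Discriminator sensitivity: share of internal-fault cases correctly flagged."""
    return 100.0 * n_true_alarms / n_total

def average_relative_error_pct(d_actual, d_estimated):
    """Average relative error in fault location (here taken relative to the
    actual distance; normalizing by the line length is an alternative)."""
    d = np.asarray(d_actual, dtype=float)
    dh = np.asarray(d_estimated, dtype=float)
    return 100.0 * float(np.mean(np.abs(d - dh) / d))
```

For example, 950 correctly flagged internal faults out of 1000 samples yields the 95% sensitivity reported in equation (23).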
In addition, the errors listed in Table 2 are in good agreement with the fault location errors established in [24]. It should be noted that the proposed SFDS is fast, since it achieves the fault classification task within 30–40, and the tasks of fault detection, internal/external fault discrimination, and fault inception time estimation within 3–4 [25, 26]. Furthermore, the proposed SFDS achieves the fault location task within 330–400. When the SFDS is used in real time, the data transmission delay between the master terminal and the slave terminal should be added to these times and should remain below 10 [27].
Figure 14. WSCC 3-machine 9-bus test system

CONCLUSION
This paper presents a smart fault diagnosis system (SFDS), which is a mixture of various and complementary time-domain techniques. Indeed, the peak detection process uses the differentiator-smoother filter. The internal/external fault discrimination process relies on the impedance angle estimation. The fault inception time estimator uses results from the traveling wave theory. The fault classifier and the fault locator are based on the KNN algorithm and the GP, respectively.
The proposed system can be used for online as well as offline applications. The extremely low test errors obtained in the simulation results, as well as the very short time windows used to analyze the measured signals and extract the various input features, confirm that the SFDS can provide an accurate, reliable, and fast fault diagnosis.