Data quality processing for photovoltaic system measurements

Operation and maintenance activities in photovoltaic systems rely on meteorological and electrical measurements, which must be reliable to verify system performance. The International Electrotechnical Commission (IEC) standards establish general criteria for filtering erroneous information; however, there is no standardized process for evaluating measurements. In the present work, we developed 3 procedures to detect and correct measurements of a photovoltaic system based on the single diode model. The performance of each procedure was tested with 6 groups of experimental measurements from a 3 kWp installation. Based on the error of the 3 procedures, the most unfavorable case was prioritized. This reduced the error between the estimated and measured values and also reduced the number of measurements to be corrected. The coefficient of determination is 0.9975 for the clear sky category and 0.9961 for the high irradiance profile. Although an increase of 2.5% in the coefficient of determination was achieved, the overcast sky category should be analyzed in more detail. Finally, the different causes of measurement error, associated with calibration errors and sensor quality, should be analyzed. This is an open access article under the CC BY-SA license.


INTRODUCTION
The use of photovoltaic systems (PVS) has increased in recent years, representing an installed capacity of 848 GW worldwide [1]. In general, the operation and maintenance of this technology incorporates meteorological and electrical measurements [2], [3]. The operation and maintenance activities, as well as performance studies, are based on high quality measurements [2]. The monitoring of meteorological and electrical variables is essential in the modeling of electrical systems, allowing the predictive evaluation, optimization, forecasting and other alternatives for the correct operation and maintenance of the PVS [3]-[11]. The scientific literature shows several forecasting studies for photovoltaic energy production. These use large volumes of data, but in general the quality of the historical data used in the training model is not analyzed [7], [12]-[14].
For forecasting, post-processing improves the results through autocorrelation [12]; however, it fails to identify and correct erroneous measurements. The standards published by the International Electrotechnical Commission (IEC) related to PVS indicate a generic measurement quality control procedure [15], [16]. In general, the data quality process has been applied by comparing simulation results with measured values; inexpensive sensors can also be used to test measurements [17]-[19]. Other authors have considered the identification of data outside a specific range, duplicate values, unmeasured values and erroneous information [13], [20].

Int J Elec & Comp Eng ISSN: 2088-8708
The data quality process should consider various criteria to filter information adequately, based on the maximum and minimum limits of each variable of interest [6], [15], [16]. For irradiance, acceptable values were established between 50 and 1,300 W·m⁻², and for PVS power between 10% and 130% of the value at standard conditions [13]. The irradiance limits can be based on extraterrestrial irradiance values and the clear sky index [20]. The PVS power level depends on the irradiance level; the solar position can therefore be obtained from meteorological information and used to estimate the available power, and commercial software also processes data for power production [21], [22]. Similarly, the linear relationship between power and irradiance allows the identification of outlier measurements [2], [23]. On the other hand, comparing different measurements of the same variable allows deviations to be identified, although in some cases satellite irradiance measurements carry high uncertainty [2]. As part of data quality, autocompletion, interpolation, regression and autoregression criteria have been established [2], [24], [25]. Autocompletion can also be developed through filters and inferential techniques [6].
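The range limits quoted above for irradiance and power can be illustrated with a short filter. This is a minimal sketch and not the procedure of any cited standard: the function name `in_range`, the 3,000 W rating used as the standard-condition power, and the sample values are assumptions made only for this example.

```python
import math

# Limits quoted in the text; P_STC_W is an assumed standard-condition rating.
G_MIN, G_MAX = 50.0, 1300.0              # irradiance limits, W/m^2
P_STC_W = 3000.0                         # hypothetical array rating, W
P_MIN_FRAC, P_MAX_FRAC = 0.10, 1.30      # power limits as a fraction of P_STC_W

def in_range(g, p):
    """Return True when irradiance and DC power both fall inside the limits."""
    if g is None or p is None or math.isnan(g) or math.isnan(p):
        return False                     # unmeasured values are rejected
    power_ok = P_MIN_FRAC * P_STC_W <= p <= P_MAX_FRAC * P_STC_W
    return (G_MIN <= g <= G_MAX) and power_ok

# Made-up (irradiance, power) pairs used only to exercise the filter.
samples = [(820.0, 2300.0), (30.0, 90.0), (1000.0, 4500.0), (float("nan"), 1000.0)]
flags = [in_range(g, p) for g, p in samples]
```

Samples that fail the check would be passed on to a correction step rather than silently dropped.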
Given the above background, there are several ways to perform a data quality process. High quality information is essential for operation and maintenance, as well as for PVS performance evaluation [2]. In this context, all measurements related to the PVS must be validated using the basic criteria of IEC-61724 [15], [16]. The present research focuses on developing 3 criteria that enable a data quality process in accordance with the IEC standards. These procedures identify and correct meteorological and electrical measurements based on the single diode model (SDM).

EXPERIMENTAL SYSTEM

2.1. The photovoltaic system
The unit of analysis in this research is a 3 kWp photovoltaic installation. This installation is located on the rooftop of the Renewable Energy Laboratory of the Faculty of Electrical and Electronic Engineering, at the Huancayo campus of the National University of Central Peru. Table 1 shows the electrical parameters of the photovoltaic panel (MAXPOWER CS6U-325) according to the manufacturer Canadian Solar; the installation has 10 panels connected in series. Figure 1 presents the configuration of the photovoltaic system, including the sensors and equipment used in this research. The analysis focuses on the direct current (DC) side of the PVS: the voltage and current measurements on the input side of the inverter, as well as the meteorological variables.

Measuring equipment
The PVS is tilted 12° relative to the horizontal plane, a value that corresponds to the latitude of the city of Huancayo. The pyranometer has the same tilt. These installation conditions comply with the IEC-60904 and IEC-61724 standards [16], [26]. Table 2 lists the equipment that collected the measurements, specifying the manufacturer, the purpose of each device in terms of the variables measured, the units of measurement, and the acronym used to identify the measurements in this work. The DC voltage and current measurements are supplied by the inverter. The ambient temperature was recorded by a PT1000 sensor installed in a ventilated and shaded environment [2]. The data logger was configured to take measurements at 5-minute intervals over 24 hours.

Data quality processing for photovoltaic system measurements (Jose Galarza)
Figure 1. Components of the PV system, including sensors and inverter

Photovoltaic system model and correlation between variables
The scientific literature reports different PVS models used to estimate the photovoltaic current (I_pv), all based on the available irradiance and temperature values. The SDM has been widely used due to its low complexity, acceptable accuracy and short computation time [10], [27]-[29]. Figure 2 illustrates the SDM used in this work and indicates the electrical variables associated with the model. The SDM represents a linear relationship between I_pv,m and G_m according to (1). Based on the resulting values of V_pv,m and I_pv,m, the PVS power presents a characteristic relationship with G_m according to (2). The PVS temperature has been estimated from nominal operating conditions and ambient temperature conditions [2], [29], [30]. The present investigation does not consider temperature changes under partial or total shading of the photovoltaic modules [31]-[33].

In the present investigation, 6 case studies were evaluated for different irradiance values. These case studies were classified by irradiance variability, according to the clearness index (CI) and variability index (VI) estimation procedures. The classification is defined as: clear, overcast, mild, moderate, and high [34], [35]. The VI was determined by comparing the measurements with the ideal cloud-free irradiance level provided by Sandia National Laboratories [36].
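The two indices used for the classification can be sketched as follows. The text does not reproduce their formulas, so this example assumes the commonly used definitions: VI as the ratio between the length of the measured irradiance curve and the length of the clear-sky curve, and CI as the ratio of measured to clear-sky daily insolation. The half-sine clear-sky profile and the alternating cloud pattern are synthetic and serve only to exercise the functions.

```python
import math

def variability_index(ghi, clear, dt_min):
    """Length of the measured irradiance curve divided by the clear-sky curve length."""
    def length(series):
        return sum(math.hypot(b - a, dt_min) for a, b in zip(series, series[1:]))
    return length(ghi) / length(clear)

def clearness_index(ghi, clear):
    """Measured over clear-sky daily insolation (equal time steps assumed)."""
    return sum(ghi) / sum(clear)

n = 133                                   # samples per day at 5-minute steps
# Synthetic clear-sky profile (half sine), not a real clear-sky model.
clear = [1000.0 * math.sin(math.pi * k / (n - 1)) for k in range(n)]
# Synthetic cloudy day: every other sample attenuated to 40 %.
cloudy = [g * (0.4 if k % 2 else 1.0) for k, g in enumerate(clear)]

vi_clear = variability_index(clear, clear, 5.0)
vi_cloudy = variability_index(cloudy, clear, 5.0)
ci_cloudy = clearness_index(cloudy, clear)
```

A perfectly clear day gives VI equal to 1 under this definition, while rapid fluctuations lengthen the measured curve and push VI well above that value.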

3. PROPOSED DATA QUALITY APPROACH
In the present work, a process for detecting and correcting faulty measurements was carried out. The measurements were first filtered based on the maximum and minimum values of the variables of interest according to IEC-61724 [16]. All measurements were then verified under 3 hypothetical situations, each comparing the SDM results with the experimental measurements of voltage, current, temperature and irradiance. Each step of the proposed procedure is explained below.

Single diode model parameter
The simplest model of the PV cell has been used to simplify the parameter estimation procedure. Based on the SDM, the diode ideality factor (A_n) has been estimated as a function of the PVS parameters described in Table 1. Equation (4) gives the current supplied by the PVS; the nonlinear equation is solved with the Newton-Raphson iterative process, using a tolerance of 1e-6 and an initial value of A_n equal to 1.
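A Newton-Raphson estimate of the ideality factor could look like the sketch below. Equation (4) is not reproduced in the text, so the example assumes an ideal single-diode model without series or shunt resistance and fits A_n so that the model current at the maximum power voltage matches the datasheet current. The module values are illustrative placeholders, not copied from Table 1, and the derivative is approximated numerically.

```python
import math

K_B, Q = 1.380649e-23, 1.602176634e-19    # Boltzmann constant, electron charge

# Illustrative module values at standard conditions (placeholders, not Table 1).
I_SC, V_OC, I_MP, V_MP, N_S = 9.34, 45.5, 8.78, 37.0, 72
V_T = K_B * 298.15 / Q                    # thermal voltage at 25 degC

def residual(a_n):
    """Model current at V_MP minus the datasheet I_MP for a given ideality factor."""
    i_0 = I_SC / math.expm1(V_OC / (a_n * N_S * V_T))   # makes the current zero at V_OC
    return I_SC - i_0 * math.expm1(V_MP / (a_n * N_S * V_T)) - I_MP

def newton_raphson(f, x0, tol=1e-6, max_iter=50, h=1e-6):
    """Newton-Raphson iteration with a central-difference derivative."""
    x = x0
    for _ in range(max_iter):
        slope = (f(x + h) - f(x - h)) / (2.0 * h)
        step = f(x) / slope
        x -= step
        if abs(step) < tol:               # stop when the update is below the tolerance
            return x
    raise RuntimeError("Newton-Raphson did not converge")

a_n = newton_raphson(residual, x0=1.0)    # initial value of 1, as stated in the text
```

With these placeholder values the iteration settles within a few steps; a different model form or parameter set would change the result.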

Procedure 1
This procedure considers the scenario in which the DC current measurements have an observation error, while the irradiance, DC voltage and temperature measurements have minimal error. The variable with observation error is estimated from the other available system variables, in this case irradiance, DC voltage and temperature. The estimation of I_pv depends on these three variables and is performed based on (1) according to the PVS model [29]. The estimated current lies on the I-V curve, so for each point of this curve it is possible to identify the voltage supplied by the PVS; these curves follow the behavior of the PVS according to the SDM and the parameters that define its operation. An alternative for current estimation is the use of PVS curve tracers: under a standardized procedure and controlled irradiance and temperature levels, the I-V curves can be obtained. This procedure is performed with specialized equipment that characterizes the PVS under its different operating conditions.
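The current estimate and its relative error can be sketched as below. The example assumes an ideal single-diode model in which the photocurrent is proportional to irradiance and the saturation current is held at its standard-condition value; both are simplifications chosen for illustration, and the module constants, the ideality factor of 1.63 and the measured value of 7.0 A are hypothetical.

```python
import math

K_B, Q = 1.380649e-23, 1.602176634e-19    # Boltzmann constant, electron charge
# Hypothetical module and string values, used only for illustration.
I_SC, V_OC, N_S, N_SERIES, A_N = 9.34, 45.5, 72, 10, 1.63

def estimate_current(g, v_string, t_mod_c):
    """Ideal single-diode current from irradiance, string voltage and module temperature."""
    v_t_stc = K_B * 298.15 / Q
    v_t = K_B * (t_mod_c + 273.15) / Q
    i_0 = I_SC / math.expm1(V_OC / (A_N * N_S * v_t_stc))   # fixed at its 25 degC value
    i_ph = I_SC * g / 1000.0                                # proportional to irradiance
    v_mod = v_string / N_SERIES                             # voltage per module in the string
    return i_ph - i_0 * math.expm1(v_mod / (A_N * N_S * v_t))

def relative_error_pct(estimated, measured):
    """Relative error in percent, referred to the measured value (must be nonzero)."""
    return abs(estimated - measured) / abs(measured) * 100.0

i_est = estimate_current(g=800.0, v_string=350.0, t_mod_c=25.0)
err = relative_error_pct(i_est, measured=7.0)
```

Comparing `err` against a threshold is what marks a current sample as suspect in this sketch.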

Procedure 2
This procedure considers the scenario in which the irradiance measurements have an observation error. In this hypothetical case, the current, voltage and temperature measurements are considered to have minimal error. The irradiance is estimated from the DC current and voltage measurements and the temperature. The relationship between variables is based on the SDM, and the simplified model reduces the complexity of the irradiance estimation. Based on (4), the dependence of I_ph on irradiance must be considered. In the SDM, this current is modeled as a current source whose value is proportional to the irradiance incident on the PVS [27]-[29]. The SDM allows a variable to be estimated from a set of electrical data, bearing in mind that meteorological variables are more sensitive to variations in weather conditions. Irradiance can present very high values or peaks during the passage of clouds, even higher than those obtained on a clear sunny day [37].
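The inversion described above can be sketched by solving the ideal single-diode equation for the photocurrent and scaling it to irradiance. As before, this assumes a photocurrent proportional to irradiance and a saturation current fixed at its standard-condition value; the constants and the sample inputs are hypothetical.

```python
import math

K_B, Q = 1.380649e-23, 1.602176634e-19    # Boltzmann constant, electron charge
# Hypothetical module and string values, used only for illustration.
I_SC, V_OC, N_S, N_SERIES, A_N = 9.34, 45.5, 72, 10, 1.63

def estimate_irradiance(i_pv, v_string, t_mod_c):
    """Irradiance implied by measured current, string voltage and module temperature."""
    v_t_stc = K_B * 298.15 / Q
    v_t = K_B * (t_mod_c + 273.15) / Q
    i_0 = I_SC / math.expm1(V_OC / (A_N * N_S * v_t_stc))   # fixed at its 25 degC value
    v_mod = v_string / N_SERIES
    i_ph = i_pv + i_0 * math.expm1(v_mod / (A_N * N_S * v_t))  # photocurrent from the diode law
    return 1000.0 * i_ph / I_SC                             # scale back to W/m^2

g_est = estimate_irradiance(i_pv=7.0, v_string=350.0, t_mod_c=25.0)
```

The estimate can then be compared with the pyranometer reading in the same way as the current in procedure 1.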

Procedure 3
This procedure considers the scenario in which the DC voltage measurements have an observation error, while the irradiance, current and temperature measurements have minimal error. From these measurements, the voltage supplied by the PVS has been estimated. For this procedure, the SDM allows direct estimation of this voltage through (5). The difference of currents in (5) is constrained by the logarithmic term. The operating principle of the PVS is based on the generation of the current I_ph as a function of the irradiance level, whereas I_pv corresponds to the current injected into the system. Consequently, I_ph must be greater than I_pv to obtain a real and physically meaningful result from the SDM. Measurements that did not meet this criterion were corrected using the estimated irradiance value, and I_ph was then updated to estimate V_pv. The procedures described above have been analyzed with PVS measurements; Figure 3 shows the flowchart and a summary of each procedure applied in the present work.
Figure 3. Proposed approach to data quality
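The voltage estimate of procedure 3, including the check that I_ph exceeds I_pv, can be sketched as follows. Equation (5) is not reproduced in the text, so the logarithmic form below is the one that follows from an ideal single-diode model and is an assumption; the constants and inputs are hypothetical, and returning `None` is simply this example's way of flagging a sample for the irradiance-based correction.

```python
import math

K_B, Q = 1.380649e-23, 1.602176634e-19    # Boltzmann constant, electron charge
# Hypothetical module and string values, used only for illustration.
I_SC, V_OC, N_S, N_SERIES, A_N = 9.34, 45.5, 72, 10, 1.63

def estimate_voltage(g, i_pv, t_mod_c):
    """String voltage from the ideal single-diode model, or None when I_ph <= I_pv."""
    i_ph = I_SC * g / 1000.0
    if i_ph <= i_pv:
        return None                        # criterion from the text not met: flag the sample
    i_0 = I_SC / math.expm1(V_OC / (A_N * N_S * K_B * 298.15 / Q))
    v_t = K_B * (t_mod_c + 273.15) / Q
    v_mod = A_N * N_S * v_t * math.log1p((i_ph - i_pv) / i_0)
    return v_mod * N_SERIES                # modules connected in series

v_est = estimate_voltage(g=800.0, i_pv=7.0, t_mod_c=25.0)
flagged = estimate_voltage(g=800.0, i_pv=8.0, t_mod_c=25.0)
```

In the second call the measured current exceeds the photocurrent implied by the irradiance, so the sample is flagged instead of producing a voltage.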

RESULTS AND DISCUSSION

4.1. Initial analysis of measurements
Based on the experimental irradiance measurements, 6 case studies from the year 2022 were analyzed. The case studies include measurements of current, voltage, temperature, and irradiance at 5-minute intervals. All variables are as described in Table 2. The analysis period for each variable was from 6:45 to 17:45, so 133 measurements were processed for each case. For each irradiance classification (clear, overcast, and high), 2 case studies were analyzed. Table 3 shows the VI and CI indexes obtained from the statistical procedure applied to the measurements [34], [35], together with the classification of all the cases studied. In case 3, a VI lower than 2 is obtained when the two measurements with the highest value are omitted from the calculation. Figure 4 presents the different irradiance profiles analyzed in this research.

Procedure 1: based on current estimation
According to section 3 and the flowchart presented in Figure 3, the current was estimated and the observation error was evaluated through the relative error. Figure 5 shows the errors between the estimated values and the measurements for each case evaluated. The lowest errors were recorded for values close to 1,000 W·m⁻², as shown in Figure 4(a) and Figure 5(a). In general, for the evaluated cases, the highest error occurred at low irradiance levels, approximately below 300 W·m⁻², according to Figure 4(b) and Figure 5(b). In some cases, the error exceeded 100% and, consequently, faulty measurements were quickly identified. In the overcast category, based on Figure 4(c) and Figure 5(c), errors of 30% or more are found at irradiance levels below 500 W·m⁻².

Based on the procedure described in section 3, the 6 case studies were analyzed with procedure 2. Figure 6 shows the results for all the cases evaluated. In general, the irradiance is underestimated, according to Figure 6(a), Figure 6(b), and Figure 6(c). Based on the irradiance values recorded, the measurements with the highest error have low irradiance levels. Table 3 shows that, after applying procedure 2 to the measurements, there is a greater difference with respect to the other cases, because this procedure estimates the irradiance from the operating parameters of the PV system. For these cases, 100% of the measurements were identified as requiring correction, and the cases with low irradiance levels need to be evaluated specifically.
Figure 7 presents all results for this procedure. In the case shown in Figure 7(a), the largest error occurs after 14:00 hours, not necessarily at a low irradiance level. In the case of Figure 7(b), the lowest error is recorded at peak sunshine hours. In the cases of high variability, the highest error occurs in low irradiance scenarios according to Figure 7(c).

After the erroneous measurements were identified, the criterion applied was to take the largest error among all the procedures, provided that it is greater than 5% (the same threshold was used for the three procedures). Table 3 shows the number of measurements obtained for each procedure. The total corresponds to the number of corrections to be made when each procedure is applied separately. For irradiance in the clear category (cases 1 and 2), about 50% of the measurements are to be corrected under each procedure. As shown in the previous section, these are the cases with the lowest error. In the overcast category (cases 3 and 4), more than 90% of the measurements are to be corrected under procedures 1 and 2. Finally, in the high category (cases 5 and 6), an average of 65% of the measurements are to be corrected. Regarding the type of procedure, in all cases procedure 3 gives a better estimation, reducing the number of measurements to be corrected. The total for each case is the sum of the measurements to be corrected by the 3 criteria; however, the same measurement can be corrected by more than one criterion. Therefore, the procedure with the highest error has been prioritized.
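The prioritization rule described above, where the procedure with the largest error is chosen when that error exceeds 5%, can be sketched in a few lines. The dictionary keys, the function name `select_correction` and the error values are made up for illustration and do not come from the tables.

```python
THRESHOLD_PCT = 5.0                       # threshold stated in the text

def select_correction(errors_pct):
    """Return the procedure with the largest relative error, or None if it is 5 % or less."""
    name, worst = max(errors_pct.items(), key=lambda item: item[1])
    return name if worst > THRESHOLD_PCT else None

# Made-up relative errors (percent) of the three procedures for three samples.
samples = [
    {"current": 2.1, "irradiance": 3.0, "voltage": 1.2},
    {"current": 12.4, "irradiance": 8.9, "voltage": 6.0},
    {"current": 4.0, "irradiance": 35.0, "voltage": 7.5},
]
decisions = [select_correction(s) for s in samples]

# Count how many samples each procedure would correct.
counts = {}
for choice in decisions:
    if choice is not None:
        counts[choice] = counts.get(choice, 0) + 1
```

Because each sample is assigned to at most one procedure, the per-procedure counts no longer double-count a measurement that fails several criteria.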

Application of the 3 procedures
After analyzing each procedure separately and evaluating the number of results to be corrected, as shown in Table 3, the three procedures were applied simultaneously to the whole group of measurements. Unlike Table 3, the numbers of measurements shown in Table 4 are not the same for each procedure. The results in Table 4 show a reduction in the measurements to be corrected for each procedure, which makes the combined approach the better option. In general, for the various cases evaluated, the number of corrected measurements was reduced by more than 50%. For cases 3 and 4, 100% of the measurements were corrected. In cases 1, 2, 5 and 6, on average 40% of the measurements present a coherent relationship with the SDM. Comparing the results by procedure, on average more than 45% of the measurements were corrected by procedure 1. Thus, we conclude that current measurements are more prone to error than irradiance measurements. Considering the results shown in Table 4, the correlations between variables were evaluated according to section 3. Table 5 shows the coefficient of determination for the relationship between I_pv,m and G_m according to (1), called relationship 1, and for the relationship given by (2), called relationship 2. Based on the results shown in Table 5, cases 1 and 2 present a high level of correlation because they occurred under sunny day conditions. After correcting the measurements with the global approach, the correlation increased further, to 0.9975 for relationship 1 and 0.9942 for relationship 2.
In cases 3 and 4, the coefficient of determination increased by 2.5% for relationship 2 and by 2% for relationship 1; these are the cases with the highest increase. In cases 5 and 6, this indicator increased by 0.5%. However, unlike the first four cases, the coefficient of determination takes a value of 0.96 for case 5 due to the great variability of irradiance during the day. In general, the correlation of variables is improved by executing the 3 procedures. Of all those evaluated, relationship 1 helps correct the measurements to the greatest extent; therefore, a correct measurement of the current delivered by the PVS is necessary. Additionally, it should be kept in mind that the accuracy of the measurements is also associated with instrument calibration, sensor failures and linearity [38], [39].
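The coefficient of determination reported for the current-irradiance relationship can be computed from a least-squares line, as in the sketch below. The function is a generic implementation, and the irradiance and current values are made up to show how a single faulty sample lowers the indicator.

```python
def r_squared(x, y):
    """Coefficient of determination of the least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot

# Made-up irradiance (W/m^2) and current (A) samples.
g = [200.0, 400.0, 600.0, 800.0, 1000.0]
i_clean = [1.9, 3.7, 5.6, 7.4, 9.3]
i_faulty = [1.9, 3.7, 2.0, 7.4, 9.3]      # one erroneous current reading

r2_clean = r_squared(g, i_clean)
r2_faulty = r_squared(g, i_faulty)
```

Replacing the faulty reading with a model-based estimate moves the point back toward the line, which is the mechanism behind the increases discussed above.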

CONCLUSION
In the present investigation, 3 procedures were developed to detect and correct measurements of a PVS according to IEC 61724. The procedures were evaluated with 6 case studies, using current, irradiance, temperature, and voltage measurements of the PVS. Applying each procedure separately leads to a larger number of measurements to be corrected. However, applying the 3 procedures simultaneously reduces the number of measurements to be corrected and brings the measurements into closer agreement with the SDM. After the correction process, a coefficient of determination between irradiance and current of up to 0.9975 was achieved for a clear sunny day. For situations with higher irradiance intermittency, up to 0.9961 was achieved, and for low irradiance situations this indicator increased by up to 2.5%. Based on the indicators obtained, we found that the current-irradiance relationship is more sensitive to the measurement errors that may occur. Thus, appropriate devices and instruments should be used depending on the purpose of the measurements.

ACKNOWLEDGEMENT
This work was funded by CONCYTEC through the PROCIENCIA program under the "Applied Research Projects 2022-02", according to contract [PE501078077-2022-PROCIENCIA].
Equation (3) represents the cell temperature model based on the nominal operating conditions of the PV cell. The nominal operating cell temperature (NOCT) is 43 °C for PV cell operation at 800 W·m⁻²; T_amb and T_mod represent the ambient and PVS temperature, respectively.
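A cell temperature estimate of this kind can be sketched with the widely used NOCT relation. Equation (3) is not reproduced in the text, so the form below, including the 20 °C reference ambient temperature of the NOCT definition, is an assumption; the input values are illustrative.

```python
NOCT_C = 43.0          # nominal operating cell temperature stated in the text, degC
G_NOCT = 800.0         # irradiance of the NOCT condition, W/m^2
T_AMB_NOCT_C = 20.0    # reference ambient temperature (assumed standard NOCT value)

def module_temperature(t_amb_c, g):
    """Module temperature from ambient temperature and irradiance (assumed NOCT relation)."""
    return t_amb_c + (NOCT_C - T_AMB_NOCT_C) / G_NOCT * g

t_mod = module_temperature(t_amb_c=18.0, g=1000.0)   # about 46.75 degC
```

At the NOCT condition itself (20 °C ambient, 800 W·m⁻²) the function returns 43 °C, which is a quick consistency check on the constants.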

Figure 4. Condition of sky according to irradiance measurements: (a) clear, (b) overcast, and (c) high

Table 2. Measurements used in this study and acronyms
Equipment | Purpose | Units | Acronym
... | ... | ... | G_m
Inverter SMA Sunny Tripower 5000 TL | Measure voltage and current in DC | A and V | I_pv,m and V_pv,m

Table 3. Cases evaluated and classified by irradiance profile

Table 4. Measurements corrected for each procedure

Table 5. Estimation of the coefficient of determination after applying the proposed procedures