Application of the least square support vector machine for point-to-point forecasting of the PV power

Received Jun 12, 2018; Revised Nov 11, 2018; Accepted Mar 3, 2019

In today's industrial world, the growing capacity of renewable energy sources is a crucial factor for sustainable power generation. Solar photovoltaic (PV) energy, a clean and safe renewable resource, has attracted considerable attention from consumers in recent decades. Accurate forecasting of the generated PV power is important for scheduling generators and for planning customers' consumption patterns to save electricity costs. To this end, a global model of the generated power must be developed from the influential factors, which are mainly the solar radiation intensity and the ambient temperature. Because these parameters span a wide numerical range and weather conditions vary widely, a large training database is required, which makes training the models computationally expensive. This paper proposes a novel algorithm for point-to-point prediction of the generated power based on the least squares support vector machine (LS-SVM), which can handle a large training database with far less computation while offering reasonable accuracy and generalization capability.


INTRODUCTION
Solar photovoltaic (PV) energy sources are growing rapidly and gaining popularity relative to conventional fossil fuel sources [1]. In recent decades, the manufacturing and installation costs of photovoltaic panels have fallen significantly and their energy efficiency has improved [2,3]. Solar energy is one of the most favorable green energies among renewable resources because it is inexhaustible, clean, and safe [4]. Despite its abundance and lower cost, solar energy still supplies less than 1% of the world's energy needs [3,5]. The dependence of PV output power on weather conditions, and the resulting uncertainty, are significant challenges for power system operators [6]. In addition, integrating renewable energy sources (RES) into the existing grid raises several challenges. Renewable energy sources are inherently intermittent, and weather conditions affect power generation and hence the stability of the grid. Operating a power grid involves matching supply and load; a mismatch between them is regarded as a grid disturbance. The nominal grid frequency is usually 50 Hz or 60 Hz. The frequency decreases if load exceeds supply and increases otherwise, and grid stability requires the frequency to remain close to its nominal value. When a generator fails, load exceeds supply. If the load cannot then be matched to the supply, other generators may fail as well.
Accurate forecasting of PV power generation is useful for both the power grid and individual smart homes. The grid can use the forecasts to schedule generators, and smart homes can use them to plan their consumption patterns and save electricity costs [6]. Accurate forecasting of PV power generation is also important for system reliability and for developing large-scale PV deployments [4]. As PV systems are increasingly integrated with existing power resources, reliable and accurate PV system identification becomes essential for dealing with the highly nonlinear changes in the dynamics and operational characteristics of the PV system [1].
The increased capacity of renewable power sources is a main driving factor for sustainable power generation [7]. In recent years, the penetration of such renewable sources, along with their intermittent power generation, has posed challenges to power engineers [8]. The rapid growth of these resources has resulted in structural changes in power grids [9]. The reliable operation and economic dispatch (distribution) of power generation are usually planned on real-time, daily, weekly, monthly, and yearly horizons. Hence, it is important to predict the load profile of power systems and their generation outputs, including those of conventional centralized power plants and of renewable power stations such as photovoltaic (PV) and wind [10].
Power output forecasting of RES is important for grid operators. It helps them predict periods of RES shortage or surplus and make the necessary decisions. In addition, it is useful for scheduling, energy storage, management, and maintenance of power systems [3]. The prediction error of short-term PV power output is in the range of ∼10-20%. The prediction accuracy is typically lower in the mornings, in the evenings, and under rainy weather conditions, and under some conditions the relative root mean square error (RMSE) can exceed 50% [4].

LITERATURE REVIEW
The forecasting approaches for PV generation can be classified into three categories: the numerical weather prediction (NWP)-based approach, the data-driven statistical approach, and the hybrid approach [11]. The NWP-based forecasting approach, discussed in [12], uses the first principles of solar irradiance and PV generation prediction. The second approach includes autoregressive (AR) models [13] and computational intelligence tools such as artificial neural networks (ANNs) [14,15]. The third approach combines the first two [16,17].
Hammer et al. proposed a PV output forecasting method that uses information on local cloud movement, extracted from satellite images, for very short-term horizons. This improves the forecasting accuracy to some extent [18].
Jie Shi et al. proposed a one-day-ahead PV power output forecasting model for a single station, based on weather forecasting data, actual historical power output data, and the SVM algorithm. They applied the model to a PV station in China [4,19]. To predict the output characteristics of a commercial PV module, [14] uses a radial basis function neural network (RBFNN) model. However, because they need large training datasets, both SVM and RBFNN models suffer from high computational complexity [6].
Nowadays, thanks to advances in meteorology, accurate forecasting of the solar radiation intensity and the ambient temperature has become possible. Using a global database of the generated electric energy and these parameters, the generated electric energy can be predicted precisely for the various intervals of the day, and summing these predictions gives an estimate of the daily electric energy. To achieve a high degree of accuracy, a large database covering various weather conditions is needed. Because of the computational complexity of SVM and RBFNN models, such a database is too large for them to handle efficiently, and simpler models should therefore be investigated.
Least squares support vector machines (LS-SVMs), proposed by Suykens et al. [20], are an extension of SVMs that offers reduced computation together with higher generalization capability [21]. This paper proposes a novel approach for point-to-point prediction of the generated PV power based on LS-SVMs, which provides the necessary forecasting accuracy without the high computational complexity. The rest of the paper is organized as follows: Section 3 describes least squares support vector machines and Section 4 presents the results and analysis.

LEAST SQUARES SUPPORT VECTOR MACHINES
Support vector machines (SVMs) are supervised learning models used for classification and regression analysis. The initial version of the support vector machine was an extension of the Generalized Portrait algorithm, which was developed in the 1960s. The current version, however, was developed by Vapnik and his coworkers at AT&T Bell Laboratories in the 1990s [22]. In SVM-based regression, to estimate a function such as the one shown in (1) from a limited set of observations, the input space is first mapped into a high-dimensional feature space through a nonlinear mapping $\varphi(\cdot)$, and an optimal linear regression is then carried out in this space [23].
In the corresponding optimization problem, $C$ is the regularization factor, $\varepsilon$ is the insensitivity parameter, and $\xi$ and $\xi^{*}$ are slack variables calculated from Vapnik's $\varepsilon$-insensitive loss function, given in (3).
The concept of $\varepsilon$-insensitivity in SVM-based regression is shown in Figure 1.
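For reference, the $\varepsilon$-insensitive support vector regression problem described above is usually written as follows. This is a sketch in standard textbook notation, assumed to correspond to what the text refers to as (1) to (3), rather than a reproduction of the typeset equations; the symbols $w$ (weight vector), $b$ (bias) and $N$ (number of training samples) are assumptions.

```latex
% Regression function in the feature space
f(x) = w^{T}\varphi(x) + b

% Primal optimization problem with slack variables
\min_{w,\,b,\,\xi,\,\xi^{*}} \; \frac{1}{2}\|w\|^{2} + C\sum_{i=1}^{N}\left(\xi_i + \xi_i^{*}\right)
\quad \text{s.t.} \quad
\begin{cases}
y_i - w^{T}\varphi(x_i) - b \le \varepsilon + \xi_i \\
w^{T}\varphi(x_i) + b - y_i \le \varepsilon + \xi_i^{*} \\
\xi_i,\ \xi_i^{*} \ge 0
\end{cases}

% Vapnik's epsilon-insensitive loss
|y - f(x)|_{\varepsilon} = \max\left(0,\; |y - f(x)| - \varepsilon\right)
```

Errors smaller than $\varepsilon$ incur no penalty, so only samples lying on or outside the $\varepsilon$-tube contribute to the solution.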

Figure 1. Illustration of $\varepsilon$-insensitivity
When classical SVMs are used, the optimization problem is solved by quadratic programming, which imposes a high computational load because of the constrained optimization. Least squares support vector machines (LS-SVMs) overcome this disadvantage by solving a set of linear equations rather than a quadratic programming problem [25]. When LS-SVMs are used for function estimation, the optimization problem is formulated as the minimization of the risk function defined in (4), where $e_i$ is the error variable associated with the $i$-th training sample, as shown in (5), and $C \ge 0$ is the regularization constant. The Lagrangian is then formed with Lagrange multipliers $\alpha_i \in \mathbb{R}$, and the optimality conditions are obtained by setting its partial derivatives to zero. Eliminating $w$ and $e$ from these conditions yields the KKT linear system of equations given in (7), (8), and (9) [26]. According to Mercer's theorem [27], the inner product $\langle \varphi(x), \varphi(x_i) \rangle$ can be defined through a kernel $K(x, x_i)$. Therefore, the function estimate based on the LS-SVM can be described as (10), in which $\alpha$ and $b$ are the solutions of (7).
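For reference, the LS-SVM regression formulation outlined above is usually written as follows. This is a sketch in the standard notation of Suykens et al., not a reproduction of the typeset equations; the symbols $N$ (number of training samples), $\Omega$ (kernel matrix) and $\mathbf{1}$ (vector of ones) are assumptions, and the regularization term is written with the common factor $C/2$.

```latex
% Risk function and equality constraints
\min_{w,\,b,\,e} \; J(w, e) = \frac{1}{2} w^{T} w + \frac{C}{2} \sum_{i=1}^{N} e_i^{2}
\quad \text{s.t.} \quad y_i = w^{T}\varphi(x_i) + b + e_i, \qquad i = 1, \dots, N

% Lagrangian
L(w, b, e; \alpha) = J(w, e) - \sum_{i=1}^{N} \alpha_i \left( w^{T}\varphi(x_i) + b + e_i - y_i \right)

% Optimality conditions
\frac{\partial L}{\partial w} = 0 \;\Rightarrow\; w = \sum_{i=1}^{N} \alpha_i \varphi(x_i), \qquad
\frac{\partial L}{\partial b} = 0 \;\Rightarrow\; \sum_{i=1}^{N} \alpha_i = 0, \qquad
\frac{\partial L}{\partial e_i} = 0 \;\Rightarrow\; \alpha_i = C e_i, \qquad
\frac{\partial L}{\partial \alpha_i} = 0 \;\Rightarrow\; w^{T}\varphi(x_i) + b + e_i - y_i = 0

% Linear system after eliminating w and e, with \Omega_{ij} = K(x_i, x_j)
\begin{bmatrix} 0 & \mathbf{1}^{T} \\ \mathbf{1} & \Omega + C^{-1} I \end{bmatrix}
\begin{bmatrix} b \\ \alpha \end{bmatrix}
=
\begin{bmatrix} 0 \\ y \end{bmatrix}

% Resulting function estimate
f(x) = \sum_{i=1}^{N} \alpha_i K(x, x_i) + b
```

Training therefore reduces to one linear solve of size $N + 1$, which is the source of the computational saving over quadratic programming.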
The most common formulations for the kernel function are listed in Table 1. The key feature of SVMs is that they minimize both the structural risk and the empirical risk, which results in model sparseness as well as accuracy. In many applications, SVMs produce excellent results.
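The way LS-SVM regression reduces training to a single linear solve can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the toolbox implementation used in the paper: the RBF kernel is taken as exp(-||x - x_i||^2 / sigma2), the function names are hypothetical, and the data are synthetic stand-ins.

```python
import numpy as np

def rbf_kernel(A, B, sigma2):
    """RBF kernel matrix with entries exp(-||a - b||^2 / sigma2) (assumed convention)."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * (A @ B.T)
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / sigma2)

def lssvm_fit(X, y, C, sigma2):
    """Solve [[0, 1^T], [1, K + I/C]] [b; alpha] = [0; y] for b and alpha."""
    n = X.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0   # top row: [0, 1^T]
    A[1:, 0] = 1.0   # left column: vector of ones
    A[1:, 1:] = rbf_kernel(X, X, sigma2) + np.eye(n) / C
    rhs = np.concatenate(([0.0], y))
    solution = np.linalg.solve(A, rhs)
    return solution[0], solution[1:]

def lssvm_predict(X_train, b, alpha, sigma2, X_new):
    """Evaluate f(x) = sum_i alpha_i K(x, x_i) + b."""
    return rbf_kernel(X_new, X_train, sigma2) @ alpha + b

# Synthetic stand-in data: two inputs in [-1, 1] and a smooth target.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(60, 2))
y = np.sin(2.0 * X[:, 0]) + 0.5 * X[:, 1]

b, alpha = lssvm_fit(X, y, C=100.0, sigma2=0.5)
y_fit = lssvm_predict(X, b, alpha, sigma2=0.5, X_new=X)
```

Every training sample receives a generally non-zero multiplier in this formulation, so the saving comes from replacing quadratic programming with a linear solve rather than from a reduced number of support vectors.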

RESULTS AND ANALYSIS
According to (10), LS-SVM-based regression can be used to predict the power generated by a PV system from a limited number of measurements. The lower computational cost of LS-SVM modelling makes it possible to use a global database of the generated power based on five-minute measurements. For this purpose, a database of generated PV electric energy, together with the corresponding measurements of temperature and solar radiation intensity, recorded every five minutes for 60 days, was used. From this database, the measurements of 44 days were used for training the models; these comprised 3765 measurements after omission of the night hours. The inputs were the solar radiation intensity and the ambient temperature, and the target was the energy generated in each five-minute interval, in kWh. The input and target parameters were scaled to the range [-1, +1] as

$$x_n = 2\,\frac{x - x_{\min}}{x_{\max} - x_{\min}} - 1$$

in which $x_{\max}$ and $x_{\min}$ are the maximum and minimum values of the input or output over the whole dataset, respectively, $x$ is the input or output, and $x_n$ is the corresponding normalized value. Using the normalized measurements, LS-SVM models were implemented with the LS-SVMlab toolbox version 1.8 for MATLAB. This toolbox provides a function for tuning the kernel and model parameters [28]. The RBF kernel function was chosen. The values of the regularization parameter ($C$) and the RBF kernel parameter ($\sigma^2$) are listed in Table 2. The outputs predicted by the models were scaled back to their original range as

$$\hat{y} = \frac{(y_n + 1)\,(y_{\max} - y_{\min})}{2} + y_{\min}$$

in which $\hat{y}$ is the predicted output in the original range and $y_n$ is the normalized output predicted by the models. To evaluate the accuracy of the LS-SVM models, the measurements of the generated power in each five-minute interval for the 16 other days were used.
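The scaling to [-1, +1] and its inverse can be sketched in a few lines. This is an illustrative sketch, assuming a plain min-max mapping with the extremes taken over the whole dataset; the function names and sample values are hypothetical.

```python
def scale_to_unit_range(values, lo, hi):
    """Map values from [lo, hi] linearly onto [-1, +1]."""
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]

def unscale_from_unit_range(scaled, lo, hi):
    """Inverse mapping from [-1, +1] back to [lo, hi]."""
    return [(s + 1.0) * (hi - lo) / 2.0 + lo for s in scaled]

# Hypothetical readings, for illustration only.
readings = [0.0, 5.0, 10.0]
scaled = scale_to_unit_range(readings, min(readings), max(readings))
restored = unscale_from_unit_range(scaled, min(readings), max(readings))
```

Test data must be scaled with the same minimum and maximum as the training data, otherwise the model sees inputs on a different scale from the one it was trained on.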
Since the ambient temperature is usually forecast as values for the morning, noon, afternoon, and night, a two-term exponential function, defined in (13), was used to model the ambient temperature.
In (13), T is the ambient temperature in degrees Celsius and t is the time. For the solar radiation intensity, the original measurements were used, since this parameter can be obtained from satellite images.
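A two-term exponential temperature model of this kind can be evaluated on a five-minute grid as sketched below. The functional form a*exp(b*t) + c*exp(d*t) is an assumption (it is the common two-term exponential form, and the exact expression in (13) may differ). The coefficient values are hypothetical placeholders that are not fitted to any data, and t is taken here as hours since midnight.

```python
import math

def two_term_exponential(t, a, b, c, d):
    """Assumed form of the temperature model: a*exp(b*t) + c*exp(d*t)."""
    return a * math.exp(b * t) + c * math.exp(d * t)

def five_minute_times(start_hour, end_hour):
    """Time stamps in hours at five-minute spacing, end point included."""
    steps = round((end_hour - start_hour) * 12)  # 12 five-minute steps per hour
    return [start_hour + k / 12.0 for k in range(steps + 1)]

# Hypothetical coefficients, for illustration only (not fitted to any data).
coeffs = dict(a=18.0, b=0.03, c=-6.0, d=-0.2)

times = five_minute_times(6.0, 18.0)
temperatures = [two_term_exponential(t, **coeffs) for t in times]
```

In practice, the four coefficients would be fitted to the few forecast temperature values available for the day before evaluating the curve at each interval.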
Based on the trained LS-SVM models and the predicted values of temperature in each five-minute interval, the generated electric power in each interval was estimated, and the daily forecast was obtained by summing these values. The predicted outputs and their corresponding target values are depicted in Figure 2, which illustrates the precision of the LS-SVM models. The prediction accuracy of the final models was evaluated using the mean absolute percentage error (MAPE) and the coefficient of determination ($R^2$), defined in (14) and (15). In (14) and (15), $y$ and $\hat{y}$ are the measured and predicted outputs, respectively, $N$ is the number of test samples, $y_{\max}$ and $y_{\min}$ are the maximum and minimum values of the measured outputs, and $\bar{y}$ is the mean value of the measured output, calculated as in (16).
The calculated values of indices are listed in Table 3.
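The two indices can be computed as sketched below. The coefficient of determination follows its usual definition. The MAPE variant shown is an assumption based on the symbols named in the text: it averages the absolute errors over the N test samples and normalizes by the range of the measured outputs. The function names and sample values are hypothetical.

```python
def range_normalized_mape(measured, predicted):
    """Mean absolute error as a percentage of the measured range (assumed form)."""
    span = max(measured) - min(measured)
    total_abs_error = sum(abs(y - p) for y, p in zip(measured, predicted))
    return 100.0 * total_abs_error / (len(measured) * span)

def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(measured) / len(measured)
    ss_res = sum((y - p) ** 2 for y, p in zip(measured, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in measured)
    return 1.0 - ss_res / ss_tot

# Hypothetical values, for illustration only.
measured = [1.0, 2.0, 3.0, 4.0]
perfect = list(measured)
constant = [2.5, 2.5, 2.5, 2.5]  # always predicts the mean of the measurements
```

A predictor that always returns the mean of the measurements scores an $R^2$ of zero, which gives a baseline against which reported values can be read.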

CONCLUSION
Accurate forecasting of the generated solar photovoltaic power is an important task in electric distribution systems. In this paper, a novel point-to-point forecasting methodology based on the least squares support vector machine (LS-SVM) has been presented for this purpose. The reduced computation required by the LS-SVM algorithm made it possible to use a large training database covering the wide numerical range of the ambient temperature and solar radiation intensity. A two-term exponential function was used to model the ambient temperature, from which the temperature in each 5-minute interval was obtained. Using the trained LS-SVM models and the predicted values of temperature together with the values of solar radiation intensity, the generated electric energy in each interval was obtained, and the forecasted total electric energy was calculated by summing the values obtained for each day. Error analysis of the results shows that the proposed method achieves a high degree of accuracy and generalization capability and reduces the computational complexity compared with previous methods, and it can therefore be used for accurate forecasting of the generated PV power.