Application of swarm intelligence algorithms to energy management of prosumers with wind power plants

Received Feb 16, 2020; Revised May 29, 2020; Accepted Jun 8, 2020
The paper considers the problem of optimal control of a prosumer with a wind power plant in a smart grid. It is shown that control is performed under non-deterministic conditions because generation from renewable plants cannot be forecast accurately. A control model based on a priority queue of logical rules with structural-parametric optimization is applied. The optimization problem is considered from the standpoint of an individual prosumer, not of the entire distributed system. The optimization problem is solved by three swarm intelligence algorithms. Computational experiments were carried out for models of wind energy systems on Russky Island and Popov Island (Far East). The results showed the high effectiveness of the swarm intelligence algorithms, which demonstrated reliable and fast convergence to the global extremum of the optimization problem under different scenarios and prosumer parameters. We also analyzed the influence of accumulator capacity on the variability of prosumers. The variability, in turn, affects the increase of the prosumer benefits from the interaction with the external global power system and neighboring prosumers.


INTRODUCTION
The development of renewable energy technologies allows consumers to receive electricity not only from an external centralized system, but also from their own sources, such as wind power plants (wind turbines) and solar panels. If climatic conditions are favorable for renewable energy sources and it is possible to place enough wind turbines or solar panels to generate significantly more power than the consumer's own needs, then the consumer can not only receive electricity from an external system but also sell power [1][2][3]. In this case, a two-way flow of energy and information arises, which is fundamentally essential for the implementation of the smart grid concept. Such a consumer can be called a prosumer or "generating consumer" (GC). Since the cost of electricity for the consumer is not a constant value, the problem of optimal control of the GC arises [4,5]. The essence of the problem is to regulate the flow of electricity, that is, to determine the volume and timing of buying or selling energy and the volume and timing of storing energy or, vice versa, of taking previously stored energy. Because of the intermittent nature of renewable (wind, solar) energy, prosumers need to use an energy storage system [6,7].
The prosumer operates under conditions of stochastic change in the generation of electricity by renewable sources and, to a lesser extent, in its consumption. In addition, the control problem has a high dimensionality of the solution search space, and the objective function is not an analytical expression but is calculated algorithmically. Therefore, the task requires methods that can solve such complex optimization problems. Such methods include metaheuristic stochastic methods.
Much modern research has been devoted to optimal control in smart grid networks with distributed generation and renewable energy sources [7][8][9]. However, in these works the optimal control is carried out at the level of a supersystem, and not of an individual GC. Studies [10][11][12] proposed different systems (frameworks) for real-time coordination of load scheduling, sharing, and trading in distributed electric power systems. Such management makes it possible to take into account data on all participants in the distributed electric power system, but there is a risk associated with the centralization of control. Another approach is based on cooperative game theory, including the Stackelberg game approach [13,14] and stochastic game approaches [15].
Two large GCs are considered: the power system of Russky Island and the power system of Popov Island. Both islands are located in Peter the Great Gulf in the East Sea. High wind speeds make it possible to build wind power plants of up to 16 MW on Russky Island and up to 20 MW on Popov Island [16].
The task of optimal control is to create a control system that implements a sequence of actions on a controlled object (dynamical system) to achieve the best possible quality as specified by one or more criteria (objective functions). The controlled object is a specific part of the surrounding world that the control subject can purposefully influence. A detailed description of the principles of optimal control can be found in [17]. Control always occurs over a certain period of time, during which the controlled object passes from one state to another. The state of the controlled object is characterized by a set of parameters that can change over time: S(t) = {s1(t), s2(t), ..., sn(t)}. Thus, there is a vector of functions, each of which shows how one parameter changes over time. These functions are not known in explicit form. In addition, there is a control system that provides control. The control can also be defined as a vector of functions A(t) = {a1(t), a2(t), ..., am(t)}. The notation S stands for "state" and A for "action".
Control parameters can be defined as follows (m = 3):
• the amount of electricity that the GC is currently exchanging with an external system (purchase or sale), MWh (a1);
• the amount of electricity that the GC is currently exchanging with the neighboring GC (purchase or sale), MWh (a2);
• the amount of electricity that the GC is currently charging or discharging, MWh (a3).
The control does not affect the state parameters associated with GC consumption and generation, but it directly affects the accumulator charge. In this task, the time step is set to one hour, so each day contains 24 values of the three state parameters and 24 values of the three control parameters. An example is shown in Figure 1. The optimal control problem, in general, can be written as follows:
$$A_{opt}(t) = \arg\max_{A(t) \in A_{pos}} \int_{t_0}^{t_T} f(t, S(t), A(t))\,dt \qquad (1)$$
where: A_opt(t) is the required optimal control, which defines the values of the control parameters at each moment of time (when and how much the GC must sell or buy, charge or discharge); A_pos is the area of permissible values of the control parameters; f(t, S(t), A(t)) is a continuous-time cost function that defines the GC benefit and the quality of the control; t_0 and t_T are the start and end of the period of time considered.
Due to the high complexity of power systems, the function f(t, S(t), A(t)) usually cannot be obtained in an explicit analytical form, and its integral even less so. But it is possible to calculate the function algorithmically. In the case of GC control, this function is piecewise continuous, since the time step is 1 hour. The task (1) can therefore be written without an integral, in the form of a sum, and the function f(t, S(t), A(t)) is nothing more than the difference between the revenues of a GC from the sale of electricity and the costs of its purchase, generation, and accumulation over all hours of the time period. However, even in this case, the analytical expression for f(t, S(t), A(t)) is difficult to write, since the price of electricity is a piecewise constant function and the exchange of electricity with a neighboring GC depends on that GC's state and control. Thus, the value of f(t, S(t), A(t)) should be calculated algorithmically.
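With the one-hour time step, the integral in (1) reduces to a sum over hours. A sketch of this discrete form is given below; the hour index h, the number of hours H, and the sampling points t_h are notation introduced here for illustration only, and the maximization follows from reading f as revenues minus costs:

```latex
A_{opt} = \arg\max_{A \in A_{pos}} \sum_{h=1}^{H} f\bigl(t_h, S(t_h), A(t_h)\bigr),
\qquad H = 24 \text{ for one day}
```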

RESEARCH METHOD
Rule-based GC control
A characteristic feature of the problem (1) is the assumption that a management decision is made every hour. Moreover, our analysis showed that all possible control actions can be described by dividing them into four groups. The following designations are used:
• power_wind: GC wind power plant generation at the considered hour;
• power_gc: GC consumption at the considered hour;
• diff: the difference between the GC generation and consumption at the considered hour;
• accum: the amount of energy that needs to be charged (> 0) or discharged (< 0) at the considered hour;
• now_accum: the energy stored in the accumulator at the considered hour;
• max_accum: the maximum amount of energy that can be stored in the accumulator (constant, GC parameter);
• max_accum_h: the maximum amount of energy that can be added to the accumulator in one hour (constant, GC parameter);
• sale_accum: coefficient that regulates the balance of purchase and charging (parameter should be tuned in the optimization process);
• sale_unload: coefficient that regulates the balance of sales and use of discharging (parameter should be tuned in the optimization process);
• sale_buy: the amount of energy that is sold (> 0) or purchased (< 0) at the considered hour.
The choice of actions should depend on the state of the GC, but it is enough to get answers to two questions. The first is whether the GC is in a state of excess or deficiency of energy. The second is related to the fact that the price of electricity changes throughout the day. Although various billing schemes are possible, a two-zone tariff is considered in this research: the day tariff applies from 7 a.m. to 11 p.m., and at other hours the cheaper night tariff applies. Thus, answers to the following questions are needed: a) Excluding accumulation, is the generation of the GC wind power plant greater than the GC consumption (diff > 0)? b) Is there a special time period now?
The GC control takes into account the possibility of using two intervals as special periods (from time1 to time2 and from time3 to time4). The boundaries of these time intervals are parameters adjusted during the optimization process.
The second and third actions can be performed under any of these four cases (conditions). The first action is possible only if diff > 0 (excess), and the fourth action is possible only if diff < 0 (deficit). When creating a GC control based on rules, we get 12 rules of the form IF <condition>, THEN <action>. The number of rules is 12 since the second and third actions can be performed under any of the four conditions, and the first and fourth under two conditions each (2 * 4 + 2 * 2 = 12). In addition, the GC control model has four balance factors (buy_unload, sale_unload, buy_accum, sale_accum) and four time moments as the interval boundaries (time1, time2, time3, time4).
To control using these rules, we need to determine the order in which they are checked and applied, that is, the rule priorities. Decision making begins with checking the highest priority rule. If its condition is satisfied, then the corresponding action of this rule is implemented. Otherwise, the rule with the next priority is checked, and so on until the end of the rule list. The conditions are designed in such a way that going through the list of rules always yields one whose condition is satisfied. As a result, to build a controller, it is necessary to determine the order of the rules by setting priorities (pr_i) and the tuned parameters specified above:
Solution = [pr1, …, pr12, buy_unload, sale_unload, buy_accum, sale_accum, time1, …, time4]
The energy capacity of the accumulator is also very important. It does not change while the GC is working, so this parameter is kept outside the scope of the optimal control problem. To study its effect, we performed modeling with several capacity values.
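The two questions above split every hour into one of four conditions (excess or deficit, inside or outside a special period). A minimal Python sketch of that classification follows. The function names, the half-open interval convention, and the wrap-around past midnight are assumptions made for illustration and are not taken from the paper.

```python
def in_special_period(hour, time1, time2, time3, time4):
    """Return True if hour lies in [time1, time2) or [time3, time4)."""
    def inside(start, end):
        if start <= end:
            return start <= hour < end
        # Assumption: an interval whose end precedes its start wraps past midnight.
        return hour >= start or hour < end

    return inside(time1, time2) or inside(time3, time4)


def classify_condition(power_wind, power_gc, hour, time1, time2, time3, time4):
    """Answer questions a) and b) for one hour as a pair of booleans."""
    diff = power_wind - power_gc          # generation minus consumption
    excess = diff > 0                     # question a)
    special = in_special_period(hour, time1, time2, time3, time4)  # question b)
    return excess, special
```

For example, `classify_condition(10.0, 6.0, 8, 7, 23, 0, 0)` reports an excess inside a special period.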
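The priority-ordered rule check described above can be sketched as follows. This is an illustration only: the action names, the three sample rules, and the convention that a larger priority value is checked first are hypothetical, since the text does not fix them.

```python
def choose_action(rules, state):
    """Return the action of the first rule, in priority order, whose condition holds.

    rules: list of (priority, condition, action) tuples, where condition is a
    callable taking the state dict and returning a bool.
    """
    # Assumption: a larger priority value means the rule is checked earlier.
    for _, condition, action in sorted(rules, key=lambda r: r[0], reverse=True):
        if condition(state):
            return action
    raise ValueError("no rule matched; the rule set should cover every state")


# Hypothetical rules; the real controller has 12 of them.
rules = [
    (0.2, lambda s: s["diff"] > 0, "action_1"),
    (0.9, lambda s: s["diff"] > 0 and s["special"], "action_2"),
    (0.5, lambda s: s["diff"] <= 0, "action_4"),
]

print(choose_action(rules, {"diff": 1.0, "special": True}))
```

Changing only the priority values reorders the checks, which is how the optimizer can alter the controller's structure without touching the rules themselves.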

Swarm intelligence application
Swarm Intelligence (SI) algorithms are among the most effective ways to solve complex optimization problems [18,19], including optimization of power systems [20][21][22]. By complex we mean non-linear, non-differentiable, high-dimensional problems with a complex topology of the solution search space and with stochastic and dynamic properties. It is not always possible to determine in advance which Swarm Intelligence algorithm is most suitable for a given task. Therefore, using only one algorithm can give a solution that is unsatisfactory in terms of the optimization criterion, and the researcher cannot assess its effectiveness without other algorithms for comparison. For this reason, three Swarm Intelligence algorithms were applied: the particle swarm optimization (PSO) algorithm [23], the firefly optimization (FFO) algorithm [24], and the bees algorithm (BA) (not Artificial Bee Colony Optimization) [25].
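For orientation, a minimal global-best PSO on the unit hypercube is sketched below. It is a textbook variant, not the authors' implementation: the inertia weight `w`, the acceleration coefficients `c1` and `c2`, the swarm size, and the toy `sphere` objective are assumptions chosen for illustration.

```python
import random


def pso_minimize(objective, dim, n_particles=30, n_iterations=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective over [0, 1]^dim with a basic global-best PSO."""
    rng = random.Random(seed)
    pos = [[rng.random() for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # personal best positions
    best_val = [objective(p) for p in pos]      # personal best values
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # global best

    for _ in range(n_iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Keep every coordinate inside the [0, 1] bounds.
                pos[i][d] = min(1.0, max(0.0, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def sphere(x):
    """Toy objective with its minimum at the centre of the unit hypercube."""
    return sum((xi - 0.5) ** 2 for xi in x)


best_x, best_f = pso_minimize(sphere, dim=5)
```

In the GC problem, the objective would be replaced by the simulated criterion evaluated on the decoded solution vector.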
To apply SI algorithms, it is necessary to define the mapping from the particle coordinates (X) in the solution search space to solutions of the task. In this case, the solution is the control actions A(t), as shown in expression (1). Therefore, it is necessary to map from X to Solution. Thus, we obtain sets of the rule priorities and the values of the tuned parameters, as shown in Table 1. Each element of the vector X is bounded from 0 to 1 [10]. The priorities are real numbers from 0.0 to 1.0, so pr_i = x_i, i = 1, ..., 12. The parameters buy_unload, sale_unload, buy_accum, sale_accum also take values from 0.0 to 1.0, so they are mapped in the same way. Finally, time1, ..., time4 define hours, so it is enough to multiply the corresponding x by 24 and round down (giving an hour from 0 to 23).
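The mapping from X to Solution can be sketched in Python as below. The element order follows the Solution vector given earlier; the dictionary layout, the function name `decode`, and the clamping of x = 1.0 to hour 23 are assumptions made for illustration.

```python
import math

FACTOR_NAMES = ("buy_unload", "sale_unload", "buy_accum", "sale_accum")
TIME_NAMES = ("time1", "time2", "time3", "time4")


def decode(x):
    """Map a 20-element vector from [0, 1] to rule priorities and tuned parameters."""
    if len(x) != 20:
        raise ValueError("expected 12 priorities, 4 balance factors and 4 times")
    priorities = list(x[:12])                   # pr_i = x_i
    factors = dict(zip(FACTOR_NAMES, x[12:16])) # balance factors stay in [0, 1]
    # floor(24 * x) gives an hour in 0..23; x == 1.0 would give 24,
    # so it is clamped to 23 (assumption).
    times = {name: min(23, math.floor(24 * xi))
             for name, xi in zip(TIME_NAMES, x[16:20])}
    return {"priorities": priorities, "factors": factors, "times": times}


solution = decode([0.1] * 12 + [0.2, 0.4, 0.6, 0.8] + [0.0, 0.5, 0.99, 1.0])
print(solution["times"])
```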
For metaheuristic optimization algorithms, the selection of heuristic coefficients is critical [9,10]. In this research, a separate study of the influence of the heuristic coefficients was not carried out. We used several sets of heuristic coefficient values that showed high efficiency in our previous studies [20]. The FFO algorithm requires comparing each particle with every other particle, so the number of operations depends quadratically on the number of particles. For the PSO algorithm and the BA, the dependence is linear. We therefore reduce the number of FFO particles to equalize the calculation time and, at the same time, increase the number of FFO iterations to equalize the number of objective function evaluations. As a result, the number of particles is reduced by a factor of four, and the number of iterations is increased by a factor of four compared to the PSO algorithm and the BA. The parameters of the SI algorithms are given in Table 2.
In addition to the SI algorithms, a Gradient Descent algorithm was applied for comparison. It is based on a fundamentally different principle than the metaheuristic SI algorithms and has fewer heuristic coefficients. The applied Gradient Descent algorithm can be written as a recurrence formula in the following form:
$$X_{k+1} = X_k - \alpha \nabla F(X_k)$$
where X_k is the solution vector at iteration k and F is the optimization criterion being minimized. In this work, the coefficient α is 5·10^-5, and the vector X, as for the SI algorithms, is a vector of 20 elements from 0.0 to 1.0. Since the objective function cannot be differentiated analytically, the direction of the gradient is determined numerically.
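A numerically differentiated gradient descent of this kind can be sketched as follows. The central-difference scheme, the step `h`, and the clipping to [0, 1] are assumptions, as the text does not specify how the gradient is estimated. The toy run uses a much larger step than the paper's α so that it converges in a few iterations.

```python
def numerical_gradient(objective, x, h=1e-4):
    """Estimate the gradient by central differences (assumed scheme)."""
    grad = []
    for i in range(len(x)):
        forward, backward = x[:], x[:]
        forward[i] += h
        backward[i] -= h
        grad.append((objective(forward) - objective(backward)) / (2 * h))
    return grad


def gradient_descent(objective, x0, alpha, n_iterations):
    """Iterate x <- x - alpha * grad, keeping x inside [0, 1]."""
    x = list(x0)
    for _ in range(n_iterations):
        grad = numerical_gradient(objective, x)
        x = [min(1.0, max(0.0, xi - alpha * gi)) for xi, gi in zip(x, grad)]
    return x


def toy_objective(x):
    """Smooth test function with its minimum at 0.3 in every coordinate."""
    return sum((xi - 0.3) ** 2 for xi in x)


x_min = gradient_descent(toy_objective, [0.9, 0.1], alpha=0.1, n_iterations=200)
```

On a smooth convex toy function this converges reliably. On a piecewise, rule-driven objective the numerical gradient can be zero or misleading, which is consistent with Gradient Descent serving only as a baseline here.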

Computational experiment
Computational experiments were carried out for the GCs of Russky and Popov Islands (GC1 and GC2, respectively). During optimization, the same control models were built for both GCs. Table 3 shows the prices used in the calculations. The price of electricity from wind turbines takes into account the costs of construction and maintenance of the wind turbines, and the same applies to the accumulators. A restriction has also been introduced: no more than 2 MW can be charged in 1 hour.
Table 3. Prices used in the calculations
Purchases from an external power system at a night rate: 22.22
Sale to external power system at a daily rate: 42.86
Sale to / Purchase from a neighboring GC at a daily rate: 47.62
Sale to external power system at a night rate: 14.29
Sale to / Purchase from a neighboring GC at a night rate: 22.22
To calculate the objective function (1) at each iteration of the optimization algorithms, we simulated the operation of both GCs using the generation data of the wind power plants and the consumption data of the GCs. A sample of daily generation and consumption data for the GCs is available at https://github.com/Pavel-V/GC_optimal_control_data/blob/master/daily_sample.csv.
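As an illustration of how the two-zone tariff enters the simulation, the sketch below looks up the sale price to the external system by hour, using the two sale prices from Table 3 and the day tariff window of 7 a.m. to 11 p.m. The function names and the assumption that prices are per MWh are hypothetical; only the numeric prices and the tariff hours come from the text.

```python
DAY_START, DAY_END = 7, 23        # day tariff applies from 7 a.m. to 11 p.m.
SALE_EXTERNAL_DAY = 42.86         # sale to external system, daily rate (Table 3)
SALE_EXTERNAL_NIGHT = 14.29       # sale to external system, night rate (Table 3)


def is_day_tariff(hour):
    """Return True for hours 7..22, when the day tariff is in effect."""
    return DAY_START <= hour < DAY_END


def external_sale_price(hour):
    """Price of selling to the external system at the given hour of the day."""
    return SALE_EXTERNAL_DAY if is_day_tariff(hour) else SALE_EXTERNAL_NIGHT


def sale_revenue(sold_per_hour):
    """Total revenue for a list of hourly sold amounts (assumed MWh, hours 0..23)."""
    return sum(external_sale_price(hour) * energy
               for hour, energy in enumerate(sold_per_hour))


revenue = sale_revenue([1.0] * 24)  # one unit sold in every hour of a day
```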

RESULTS AND DISCUSSION
Three options (operation modes) for the functioning of the GC are considered (all algorithms were run 20 times for each operation mode with random initial conditions to study the convergence of the algorithms):
• GC can buy power from an external power system, but cannot sell or exchange with other GC.
• GC can buy power from an external power system and can sell, but cannot exchange with other GC.

• GC can buy power from an external power system and can sell, and can exchange with other GC.
As the financial cost of the GCs is the optimization criterion, Table 4 shows the values of this criterion after optimization for each operation mode and accumulator capacity. The financial cost is measured in $ per hour. Negative values mean that a GC receives a corresponding profit, that is, the income from the sale of generated power is higher than the total costs of the GC. The SI algorithms produced identical solutions in every run, which is why Table 4 has a single column for all SI algorithms. Since the results of the SI algorithms (PSO, BA, FFO) are the same, it can be assumed with high probability that they found the global extremum for each mode and each value of the accumulator capacity. Gradient Descent showed slightly worse results and never reached the global extremum. In this problem, even a slight relative deviation of the optimization criterion leads to significant financial losses in absolute terms. Also, Figure 2 shows that the relationship between the accumulator capacity and the GC financial cost built by Gradient Descent is not correct. Using SI algorithms, it is possible to obtain the right relationship. Increasing the accumulator capacity up to a certain level (24 MWh) reduces the GC financial costs.
It can be seen in Figure 3 that GC1 starts to store electricity while the cheaper night tariff is in effect and then spends it during the day. The consumption of GC1 is always higher than its generation; therefore, it needs to buy part of the power from an external power system and part from the neighboring GC2. From three to five a.m., wind turbine generation decreases in both GCs, so purchases from the external system increase. GC1 cannot buy only from GC2, since GC2 is not able to sell that much in this period.
Although it is necessary to purchase from an external power system at a higher price than from the neighboring GC, the process of energy storage does not stop, since buying from an external power system at night is cheaper than buying from a neighboring GC during the day.