Optimization of OpenFlow controller placement in software defined networks

Received Oct 2, 2019; Revised Dec 20, 2020; Accepted Jan 13, 2021

The world is entering the era of big data, in which computer networks are an essential part. However, the current network architecture is not well suited to such a leap. Software defined networking (SDN) is a new network architecture that argues for separating the control and data planes of network devices by centralizing the former in high-level, efficient supervisory devices called controllers. This paper proposes a mathematical model that optimizes the locations of the controllers within the network while minimizing the overall cost under realistic constraints. Our method finds the minimum cost of placing the controllers, where the costs comprise network latency, controller processing power and link bandwidth. Different network topologies have been adopted to consider the data profile of the controllers, the controller links and the locations of the switches. The results show that as the size of the input data increases, the time to find the optimal solution grows non-polynomially, while the cost of the solution increases linearly with the input size. Furthermore, for the same number of switches, increasing the number of possible controller locations reduces the cost.

the switches within the network and, therefore, reduced latency for the network subscribers [8]. Accordingly, such optimization decreases the transmit power required to send packets to their destinations; once the transmit power is reduced, the power consumption of the network is reduced too [9], and overall network efficiency is enhanced. Hence, this work sheds light on controller placement under realistic constraints so as to minimize the cost while determining the number and types of controllers to deploy [10]. Different constraints and network metrics, such as controller capacity, controller and link types, link bandwidth and connectivity of the network devices, have been considered. To achieve this, the following contributions have been made:
- Given the many variables related to the controller, we present a model that decides the ideal number, location and type of controllers at the same time. The purpose of the model is to reduce network cost while taking into account requirements such as controller capacity, link bandwidth and propagation distance.
- Evaluation of the proposed model on generated topologies of different sizes.
- A study of candidate placement areas that determines how the number of potential locations affects the cost.
- An evaluation of solver performance as the size of the network grows, in order to estimate the cost.
We also assess the controllers and adapt their number according to dynamically learned traffic statistics, improving resource utilization across the controller network as loads change. We provide the ability to monitor controller performance directly, based on continuous feedback from a statistics-collection component, in accordance with the stated constraints. We further propose a reassignment algorithm to speed up load balancing between controllers, and a failover component that addresses the problem of controller failure [11].

RELATED WORKS
In order to define the controller placement problem, a mathematical model is required to represent the behavior of SDN traffic [12]. We used a solver called CPLEX that returns the optimal solution, or the best solution found when the time limit is reached. Since the first introduction of the SDN controller placement problem by Heller et al. [7], many researchers have proposed different algorithms for dealing with one of the most difficult problems facing an engineer deploying an SDN network: the placement of controllers in the network.
A formidable challenge in solving SDN controller placement problems is that all of the proposed algorithms involve a tradeoff among scalability, resilience, and model complexity [13, 14]. With respect to investigations of the SDN controller placement problem, the technical paper published by Heller et al. [15] is one of the most cited. The authors proposed a heuristic approach to finding the ideal placement of controllers in large SDN deployments. In that study, the main metric was average latency, which is considered essential in determining the latency estimates required for large-scale SDN use [16]. The approach depends primarily on propagation delay, with the location of a controller being based on the shortest path between switches and their assigned controllers in the network topology [17]. This study offered an exact solution to the problem. An interesting conclusion was that increasing the number of controllers does not necessarily decrease the average latency between switches and their assigned controllers [18].

RESEARCH METHOD
We are given an undirected network topology $G = (S, E)$, where $S$ denotes the set of switches and $E$ the set of edges between them. Let $C$ denote the set of active controllers and $m = |C|$ the number of controllers, which is 1 by default. $X = [x_{ij}]_{|S| \times |C|}$ denotes the assignment relation between switches and controllers, in which $x_{ij} = 1$ if switch $i$ is assigned to controller $j$ and $x_{ij} = 0$ otherwise. For the collected statistics, $t_j$ denotes the processing time for an event handled by controller $j$, and $f_i$ is the average number of flows requiring setup at switch $i$ at the current time.

Constraints
Our objective is to minimize the number of controllers subject to a series of constraints. For each controller $j \in C$:

$u_j^{\mathrm{cpu}} < U^{\mathrm{cpu}}$ (1)
$u_j^{\mathrm{mem}} < U^{\mathrm{mem}}$ (2)
$r_j < R$ (3)
$d_j < D$ (4)
$t_j < T$ (5)

and for each switch $i \in S$:

$\sum_{j \in C} x_{ij} = 1$ (6)

Constraints (1) to (5) specify, for each controller, upper bounds on CPU usage, memory usage, the average number of flow requests per second, the average number of dropped packets, and the average packet processing time. Dropped packets include both those dropped by the controller and those dropped by its associated switches: packets dropped by the controller indicate that the controller is overloaded, while, assuming switch up/down times are not considered, packets dropped by the associated switches indicate that the controller is failing to handle flow-setup requests. Constraint (6) ensures that each switch is assigned to exactly one controller, and to only one controller [19, 20].

Evaluation function
In order to evaluate the utility of a controller, an evaluation function is defined as a function of the five metrics above for each controller, each of which reflects controller performance. Although more than one expression is possible, we design the evaluation function as a weighted sum of the normalized metrics:

$F_j = \alpha\, u_j^{\mathrm{cpu}} + \beta\, u_j^{\mathrm{mem}} + \gamma\, r_j + \delta\, d_j + \omega\, t_j$ (7)
In this formula, $\alpha$, $\beta$, $\gamma$, $\delta$ and $\omega$ are coefficients that can be tuned to alter the relative importance of the five metrics.
$F_j$ is normalized to $[0, 1]$ to simplify the computation. The normal range of $F_j$ is denoted $[F_{\min}, F_{\max}]$, where $F_{\min} > 0$ and $F_{\max} < 1$. When $F_j > F_{\max}$, controller $j$ is overloaded and its subordinate switches should be reassigned to other controllers; if no active controller has enough capacity (in terms of packet-processing volume), another controller is activated to take over the unassigned switches. When $F_j < F_{\min}$, the switches associated with controller $j$ can be migrated to other controllers to save resources. Taking the constraints into account, when the value of any metric goes beyond its normal range, the value of the evaluation function should immediately register the anomaly. Accordingly, the evaluation function is modified as (8) [21-23]:
$F_j^* = \big(u_j^{\mathrm{cpu}} \ge U^{\mathrm{cpu}}\big) \,\|\, \big(u_j^{\mathrm{mem}} \ge U^{\mathrm{mem}}\big) \,\|\, \big(r_j \ge R\big) \,\|\, \big(d_j \ge D\big) \,\|\, \big(t_j \ge T\big) \,\|\, F_j$ (8)

where $\|$ is the logical OR operator. If any metric violates its constraint, $F_j^*$ immediately becomes 1 and triggers the centralized scheduler program.
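To make the scheme concrete, below is a minimal Python sketch of (7) and (8); the weights, constraint limits and function name are illustrative assumptions, not values taken from the paper.

```python
# Minimal sketch of evaluation functions (7) and (8); weights and constraint
# limits below are illustrative assumptions, not values from the paper.
ALPHA, BETA, GAMMA, DELTA, OMEGA = 0.3, 0.2, 0.2, 0.15, 0.15  # assumed weights
LIMITS = (0.9, 0.9, 0.9, 0.9, 0.9)  # assumed upper bounds for constraints (1)-(5)

def evaluate(cpu, mem, req_rate, drop_rate, proc_time):
    """Return F*_j for one controller: the weighted sum (7), forced to 1
    whenever any metric violates its constraint, as in (8)."""
    metrics = (cpu, mem, req_rate, drop_rate, proc_time)  # normalized to [0, 1]
    if any(m >= limit for m, limit in zip(metrics, LIMITS)):  # OR over (1)-(5)
        return 1.0  # triggers the centralized scheduler immediately
    return (ALPHA * cpu + BETA * mem + GAMMA * req_rate
            + DELTA * drop_rate + OMEGA * proc_time)

# Example: a lightly loaded controller stays well inside [F_min, F_max]
print(evaluate(cpu=0.42, mem=0.35, req_rate=0.50, drop_rate=0.10, proc_time=0.20))
```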

Reassignment
In the scheduling component, the switches associated with each controller are sorted by traffic load; meanwhile, the controllers are sorted by available capacity [24]. The most heavily loaded switches of the overloaded controller are migrated first, each to the active controller with the highest available capacity (for example, remaining packet-processing capacity), which is intended to rebalance load between controllers within a short time [25, 26]. A greedy sketch of this step is shown below.
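In this sketch, the dictionaries for per-switch load, spare controller capacity and the current assignment are illustrative stand-ins for the paper's collected statistics, not its actual data structures.

```python
# Greedy reassignment sketch: move the heaviest-loaded switches of an
# overloaded controller to the controller with the most spare capacity.
def reassign(overloaded, switch_load, spare, assignment):
    # switches of the overloaded controller, heaviest traffic first
    victims = sorted((s for s, c in assignment.items() if c == overloaded),
                     key=lambda s: switch_load[s], reverse=True)
    for s in victims:
        # active controller with the highest remaining capacity
        target = max((c for c in spare if c != overloaded),
                     key=lambda c: spare[c], default=None)
        if target is None or spare[target] < switch_load[s]:
            break  # no controller can absorb this switch; a new one is needed
        assignment[s] = target
        spare[target] -= switch_load[s]
        spare[overloaded] += switch_load[s]
    return assignment

# Example: controller "c1" is overloaded; its two switches migrate to "c2"
print(reassign("c1", {"s1": 40, "s2": 10}, {"c1": 0, "c2": 100},
               {"s1": "c1", "s2": "c1"}))
```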

Failover
In the failover component, the scheduler program regularly checks the heartbeat of each controller as well as the connection status of each switch in every time unit [27]. When it detects inactive controllers or unassigned switches, the centralized scheduling program is invoked to resolve the issue. We address the issue in two stages: i) check whether any active controller has spare capacity to take over the unassigned switches; ii) if no such controller is found, start another controller to take over the switches. The assignment cycle is similar to the reassignment cycle; for example, the heaviest-loaded switches are matched with the highest-capacity controllers first [28]. A sketch of this two-stage procedure follows.
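In the sketch below, the heartbeat check, spare-capacity bookkeeping and the `start_controller` factory are illustrative assumptions rather than the paper's actual interfaces.

```python
# Two-stage failover sketch: reuse spare capacity first, then start a new
# controller. All names here are illustrative placeholders.
def failover(controllers, spare, assignment, switch_load, heartbeat_ok,
             start_controller):
    dead = {c for c in controllers if not heartbeat_ok(c)}
    live = [c for c in controllers if c not in dead]
    for c in dead:
        # orphaned switches, heaviest first (as in the reassignment cycle)
        orphans = sorted((s for s, owner in assignment.items() if owner == c),
                         key=lambda s: switch_load[s], reverse=True)
        for s in orphans:
            fits = [k for k in live if spare[k] >= switch_load[s]]
            if fits:                       # stage 1: reuse an active controller
                target = max(fits, key=lambda k: spare[k])
            else:                          # stage 2: activate a new controller
                target = start_controller()
                live.append(target)
                spare.setdefault(target, 100)  # assumed capacity of a new one
            assignment[s] = target
            spare[target] -= switch_load[s]
    return assignment
```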
In our case, all controllers have only a local view of the network topology. Since in our scenario the controllers only deal with layer-2 learning and forwarding, they can simply learn the MAC addresses and resume forwarding. Thus, after the reassignment cycle, controllers become aware of the topology change and adapt as needed. To operate at layer 3, the controller would need a way to obtain the new topology information whenever a switch is re-mapped, which will be considered in future work.

Evaluation one - Failover
As mentioned in the evaluation setup, the Stats Collector application runs alongside the Mininet topology and continuously checks the connection status of the switches and the liveness of each controller. If a controller goes down, the event is reported immediately to the centralized scheduler. The scheduler then starts immediately and assigns another or an existing controller to the corresponding switches. Our initial evaluation results show that the switches can be reconnected to an active controller within 3 seconds. One of our future tasks is to maintain a pre-computed mapping table for the assignments between controllers and switches. With it, the "controller DOWN" handler of the scheduler could immediately assign a new or existing controller to the switches originally assigned to the failed controller, saving up to 3 seconds in achieving failover.

Evaluation two
In a normal situation, running a "pingall" command on the Mininet topology mentioned above (27 hosts, 13 switches) does not create enough load to overload the POX l2_learning controller. One of the issues in this work was finding a practical way to recreate situations in which the controller has to handle an excessive event load (the average PACKET_IN response time grows significantly, or events are even dropped). We first tried increasing the load on the controller by adding additional hosts and changing the Mininet topology, but "pingall" executes the "ping" command pairwise between two hosts at a time, so the number of PACKET_IN events grows only with the size of the topology. We therefore decided to look for a way to limit the controller's capacity to handle PACKET_IN events, so that controller overload could be reproduced more quickly. It turned out that the Linux utility "cpulimit" can throttle the CPU available to a process, and tests showed that it worked well for our needs, as shown in Figure 1.
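As a hedged illustration, such throttling can be driven from Python as below; the `pgrep` pattern and the 10% cap are assumptions for the sketch, not values from the paper.

```python
# Throttle the POX controller process with the Linux "cpulimit" utility so
# PACKET_IN overload can be reproduced on a small topology. The pgrep
# pattern and the 10% cap are illustrative assumptions.
import subprocess

pox_pid = int(subprocess.check_output(["pgrep", "-f", "pox.py"]).split()[0])
# -p selects the target process, -l caps it at 10% of one CPU core
throttle = subprocess.Popen(["cpulimit", "-p", str(pox_pid), "-l", "10"])
```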
Another problem was that the packet-processing statistics of a Mininet switch can change from topology to topology, whereas the controller has only a single PACKET_IN handler regardless of the topology of the underlying switches. To make the controller aware of the different loads imposed by its underlying switches, our solution detects the possibility of overload from the switches' perspective when the topology changes. Our solution takes advantage of Mininet's integrated "ovs-dpctl" tool, since our evaluation was performed on Mininet with Open vSwitch (OVS) [2]. The output of the "ovs-dpctl" tool has a field called "hit" that shows the number of packets whose processing takes place at the OVS level. In our solution, the collected counters are sent to the Stats Collector application running with the Mininet topology, which monitors the ovs-dpctl counts and sends a "Controller-Overloaded" notification, triggering the scheduler to perform the reassignment algorithm when the number of "hits" increases. This is the first method used to trigger the reassignment algorithm.
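A minimal sketch of polling that counter is shown below; it assumes a single OVS datapath, and the polling logic around it is illustrative.

```python
# Poll the datapath "hit" counter printed by "ovs-dpctl show"
# (e.g. "lookups: hit:13 missed:2 lost:0"); single-datapath assumption.
import re
import subprocess

def datapath_hits():
    out = subprocess.check_output(["ovs-dpctl", "show"], text=True)
    match = re.search(r"hit:\s*(\d+)", out)
    return int(match.group(1)) if match else 0

prev = datapath_hits()
# ... after one polling interval, the Stats Collector compares counters and
# can notify the scheduler ("Controller-Overloaded") based on the change
delta = datapath_hits() - prev
```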
In this work, another "Controller-Overloaded" indicator is that the controller's average PACKET_IN processing rate has halved compared with the previous observation. To detect this, the scheduling application periodically checks the average processing time of the PACKET_IN events collected by the Stats Collector program in its statistics table; when it detects that the rate has dropped to half of the previous observation, it performs the reassignment. This is the second method used to trigger the reassignment algorithm. Figures 2 and 3 describe these two conditions. Our initial results show that switches can be migrated to a normally loaded controller within 8 seconds in the overload scenario, and within 3 seconds in the failover scenario. Modern SDN tools, such as Mininet and OVS, make it convenient to emulate an SDN system. Even so, it is not easy to emulate a large network with Mininet under limited resources. Initially, we tried to run the Mininet topology and several POX controllers in the same virtual machine (VM), but resource contention between the Mininet topology and the controllers distorts the controller performance, especially as the size of the network grows. This problem can be addressed by separating the CPU and memory of Mininet from those of the POX controllers.
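Referring back to the processing-time indicator described at the start of the previous paragraph, a minimal sketch of the check is given below; the data source for the averages (the Stats Collector table) is an assumption of the sketch.

```python
# Second overload trigger (sketch): fire when the controller handles
# PACKET_IN events at half its previous rate, i.e. the average processing
# time has doubled between two polls of the Stats Collector table.
def overloaded_by_latency(prev_avg_ms, curr_avg_ms):
    return prev_avg_ms > 0 and curr_avg_ms >= 2 * prev_avg_ms

assert overloaded_by_latency(1.5, 3.2)      # rate halved: trigger reassignment
assert not overloaded_by_latency(1.5, 1.6)  # normal fluctuation: no trigger
```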
Later, we ran Mininet and the POX controllers on individually bridged virtual machines. The load on the emulated network could then be adjusted easily by modifying the number of bridged VMs, which increased the flexibility of the simulation. To emulate limited controller performance, we used the "cpulimit" tool to restrict the CPU available to each controller; it would be advantageous if such CPU-usage limiting were later integrated directly into SDN tools. Several visualization tools are available for Mininet, but most of them do not support dynamic network topologies, which is a major requirement in SDN research, and they are not practical under such circumstances. For data analysis we therefore chose more adaptable and user-friendly means of visual and textual presentation.

CPLEX optimization
As the problem size increases, it becomes very difficult (not to say impossible) to solve these problems manually, which is why a solver is needed. The optimization for this scenario is run on IBM's ILOG CPLEX Optimizer version 12.5, with the optimizer running on a single thread. The process shown in Figure 3 is run on a computer with the solver installed.
We are already aware of the solution to this problem, but the optimizer validates that solution. Furthermore, the optimizer reports the time taken to reach the solution, which is important information as the problem size increases. The solution is plotted in Figure 3. The links connecting the switches to the controllers are shown in either solid blue or green. Blue indicates the cost of links of type one (1 Mbps); green indicates the cost of links of type two (10 Mbps). Similarly, the controller placements are coloured and are placed on top of the possible placement markers: yellow indicates the first controller type and red indicates the second. As we can see, the result of the optimization matches the solution found earlier. The links between controllers are always of the same type, since we assume that minimal bandwidth suffices for controllers to communicate with each other. Because the goal of the planning model is to place controllers on a network, traffic between controllers is not considered, and in our model the cheapest link speed is good enough for connecting controllers together.
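For illustration, a minimal sketch of such a placement model is given below in PuLP, which can hand the model to a CPLEX backend as used here; the sites, distances and controller/link specifications are toy assumptions, not the paper's datasets.

```python
# Minimal sketch of the placement MILP: choose controller sites/types and
# switch-to-controller links to minimize total cost. All input data below
# (sites, distances, specs) are illustrative, not the paper's datasets.
import pulp

switches = range(4)                     # switch indices
sites = range(3)                        # candidate controller locations
ctrl_types = {1: {"cost": 500, "capacity": 2},    # assumed controller specs
              2: {"cost": 900, "capacity": 4}}
link_types = {1: {"cost_per_m": 0.25},            # cheaper, slower link
              2: {"cost_per_m": 0.63}}            # pricier, faster link
dist = {(s, p): 10.0 * (1 + (s + p) % 3)          # toy distances in metres
        for s in switches for p in sites}

prob = pulp.LpProblem("controller_placement", pulp.LpMinimize)
y = pulp.LpVariable.dicts(          # y[p, t] = 1: controller of type t at p
    "y", [(p, t) for p in sites for t in ctrl_types], cat="Binary")
x = pulp.LpVariable.dicts(          # x[s, p, l] = 1: switch s -> site p, link l
    "x", [(s, p, l) for s in switches for p in sites for l in link_types],
    cat="Binary")

# Objective: controller cost plus link cost (cost per metre times distance)
prob += (pulp.lpSum(ctrl_types[t]["cost"] * y[p, t]
                    for p in sites for t in ctrl_types)
         + pulp.lpSum(link_types[l]["cost_per_m"] * dist[s, p] * x[s, p, l]
                      for s in switches for p in sites for l in link_types))

for s in switches:  # each switch connects to exactly one controller and link
    prob += pulp.lpSum(x[s, p, l] for p in sites for l in link_types) == 1
for p in sites:
    # capacity: switches assigned to a site need an opened controller there
    prob += (pulp.lpSum(x[s, p, l] for s in switches for l in link_types)
             <= pulp.lpSum(ctrl_types[t]["capacity"] * y[p, t]
                           for t in ctrl_types))
    prob += pulp.lpSum(y[p, t] for t in ctrl_types) <= 1  # one controller/site

prob.solve()  # default CBC; pass pulp.CPLEX_CMD() to use CPLEX instead
print(pulp.LpStatus[prob.status], pulp.value(prob.objective))
```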

Small to large input sizes
The optimization should be measured in several ways. One way is to track the cost for different ranges of solutions; another is to track the time taken by the solver to find an optimal solution. The cost of the solution should increase as the number of switches increases, because a larger input means more controllers may be placed and all of those switches must be connected. The number of switches in the topology is directly related to the number of links the solver places between switches and controllers; since every switch must be connected to a controller, this increases the cost of the solution.
The input to the model is very important and, in this section, we show what the input for controllers may be. Many controllers exist today and have been tested in real scenarios. Each of these controllers has its own specification, which depends on the hardware used and the implementation programming language. Table 1 shows the four types of controllers used by the solver.
The specification of the link types that connect controllers and switches together is easy to determine: on any networking sales website, one can find the cost per meter of the available link types. Currently, a meter of 100 Mbps ethernet costs about $0.25 and a meter of 1 Gbps ethernet about $0.63, while 10 Gbps fiber-optic cable averages $29 per meter. Table 2 shows the link input available for the optimization.

When the error bars are wide, the results are less reliable because the range covers a wider set of values, as shown in Figures 3 and 4. We can tighten the error bars by running more than four instances of each problem: the standard deviation of the estimate decreases with the square root of the number of instances each problem is repeated. Looking at the figure, when |P| is 20 and |S| is 100, the error bars are very wide. To make the bars shorter, we need to make the standard deviation smaller by running each problem more than 4 times; to halve the standard deviation, each problem would have to run 16 times. When |P| is 20 and |S| is 100, each problem instance takes 16 hours to complete, so 16 runs would take 256 hours in total. The standard deviation for four instances is 8,490 seconds and the confidence interval is ±27,019 seconds. Assuming the same mean, if we were to run the problem 16 times, the confidence interval would decrease to ±9,056 seconds (because the standard deviation decreased by half). This is 33% of the confidence interval obtained from the four instances in our result.
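The 33% figure follows from the square-root scaling of the standard error together with the change in the t-quantile; here is a short check, under the assumption that the reported 8,490 s plays the role of the standard error of the mean in a two-sided 95% t-interval:

```latex
% assuming the reported 8,490 s acts as the standard error of the mean (SE)
\mathrm{CI}_{n} = t_{n-1,\,0.975}\cdot \mathrm{SE}_{n}, \qquad
\mathrm{SE}_{16} = \mathrm{SE}_{4}\sqrt{\tfrac{4}{16}} = \tfrac{8{,}490}{2} = 4{,}245~\mathrm{s},
\qquad
\mathrm{CI}_{4} \approx 3.18 \times 8{,}490 \approx 27{,}000~\mathrm{s}, \qquad
\mathrm{CI}_{16} \approx 2.13 \times 4{,}245 \approx 9{,}040~\mathrm{s}.
```

These values match the reported ±27,019 s and ±9,056 s up to rounding, and their ratio of roughly 0.33 gives the stated 33%.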
Also, in Figures 3 and 4, we can see that the error bars are wide for |P| ≥ 15 and |S| ≥ 50. We can improve the reliability of the result for problem 31 by running an instance of the problem 16 times; since the average time for that problem is about 13 hours (46,219 seconds), running it 16 times may take 205 hours (a 15-fold increase). The results show that the total time taken to find optimal solutions, with every problem run four times, is 19 days (1,656,776 seconds). Improving the reliability by running all of the problems 16 times is not practical, because it would have taken 306 days to find the solutions. The solution time was expected to be high because integer programming problems are NP-hard. The plots of the results grouped by |P| show the solution time and solution cost against the number of switches. These figures show more detail than the figures that have all values of |P| in one graph: when |P| is small, the range of values for the solution time and cost is much lower than when |P| is large, and separating them allows us to better analyze the graphs. For example, if we look at the time taken to find an optimal solution when |S| is 30 and |P| is 5, we cannot see any details in