Enhancing OLSR routing protocol using K-means clustering in MANETs

Received Nov 15, 2018; Revised Jan 22, 2020; Accepted Feb 2, 2020

The design of robust routing protocols for MANETs is complex because of the characteristics and structural constraints of these networks. A wide variety of protocols has been proposed in the literature. Most of them rely on traditional routing methods, which do not guarantee basic levels of QoS when the network becomes larger, denser and more dynamic. To address this problem we use clustering, one of the most popular methods, with the aim of improving QoS in MANETs. We propose a clustering algorithm based on a new mobility metric and the K-means method to distribute the nodes into several clusters; it is implemented in the standard OLSR protocol, giving rise to a new protocol named OLSR Kmeans-SDE. The simulations show that OLSR Kmeans-SDE outperforms standard OLSR Kmeans and OLSR Kmed+ in terms of control traffic, delay and packet delivery ratio.


INTRODUCTION
An ad hoc network is a wireless communication system that operates without a centralized management infrastructure. It consists of a set of nodes (mobile devices), each equipped with a wireless interface for communicating with entities located in its neighborhood. Each node can therefore reach its direct neighbors through its radio interface, or communicate with other nodes in the network through intermediate nodes located between the source and the recipient. These intermediate nodes forward messages and play the role of routers, so the network is autonomous and gives the end user access to information at any time and from any place. The mobiles cooperate with each other to overcome the constraints of ad hoc networks, such as dynamic topology, lack of centralized monitoring points and limited bandwidth. These constraints require a robust, non-traditional routing system that better manages the flow of information and ensures quality of service in a dynamic and decentralized network.
Many routing protocols have been proposed [1] for different types of application, and research continues toward efficient protocols that adapt to all mobility models [2]. Routing in MANETs can be classified into two types: flat and hierarchical. In the first type all network nodes play the same role; this can overload the network and cause other problems, such as poor scalability and added complexity, when the network becomes wider and denser. The second type is designed to support wide and dense networks. Clustering is a hierarchical structure that groups nodes that are geographic neighbors. Each node then stores all the information about its own group and only partial information about the other groups (clusters). This approach can reduce the cost of routing information in large and dense networks. Several studies have sought to improve the efficiency of hierarchical protocols and to support this type of network and structure.
To organize the network into clusters with optimal structures, researchers have proposed various criteria for building and organizing the nodes into groups. This hierarchical structure leads to the optimization and improvement of routing protocols. The suitable metric depends on the type of performance that the cluster structure must ensure. Several metrics have been proposed to measure the physical and logical properties of the nodes (mobility, density, energy, etc.). A good metric makes it possible to differentiate easily between nodes [2], reflects the real behavior of the nodes, and facilitates the performance improvement of protocols [3].
In this article we refine our mobility metric, defined in [4], so that it takes into account any type of motion in the coverage area of a mobile node. This refined metric and other metrics (density, energy) form the basis of a cluster-building algorithm that uses the k-means method to create the groups of mobiles and to elect the cluster heads. This new approach can generate more stable clusters and cluster heads. We start with a presentation of some related work. In the third section we present the problem formulation, and in the fourth section we define clustering. In the fifth section we describe the clustering approach, and in the sixth section we explain our mobility metric. In the seventh section we present the algorithm, and in the eighth section we give the results. We finish with a conclusion in the ninth section.

RELATED WORK
Several studies provide different clustering techniques to improve network scalability and to reduce the complexity of the routing problem by restricting it to smaller groups of nodes. They usually differ in the criteria used to select the cluster head. In this part we review some works that focus on different criteria for electing the cluster head; some of them are implemented in OLSR [5] (Optimized Link State Routing Protocol) to improve its QoS.
Mobility Based Metric for Clustering (MOBIC) is the original proposal of Basu et al. [6], suggesting that clustering, and especially cluster head election, must consider mobility as a relevant criterion in order to ensure a certain stability of the generated clusters. This algorithm is based on a local mobility metric called "relative mobility of nodes". The node with the lowest value is the least mobile, i.e. the most stable, and it is therefore the node elected as cluster head.
A new approach named SALSA is presented in [7]. It is a distributed and self-organising clustering scheme that assigns equal cluster management tasks to all nodes. In addition, a cluster balancing mechanism is introduced that allows nodes to be evenly distributed among clusters: before the maximum capacity of a cluster is reached, it progressively starts assigning nodes to neighbor clusters. This contribution also proposes a cluster quality metric in order to assign nodes to the most suitable clusters, according to connectivity and free positions within clusters. The results confirmed the efficiency of the new scheme, which provides stability and low maintenance overhead even in the largest networks.
An optimized stable clustering algorithm for mobile ad hoc networks (OSCA) is proposed in [8]. It provides more stability to the network by minimizing cluster head changes and reducing clustering overhead. In this algorithm, a new node is introduced that acts as a backup node in the cluster. This backup node acts as cluster head when the actual cluster head moves out of the cluster or dies, which keeps the network available without disturbance. Further, the priority of the cluster head and of the backup node is calculated from the node degree and the remaining battery life of the mobile nodes. According to the experimental results, OSCA not only makes the network more stable by reducing the number of cluster head changes but also reduces the clustering overhead.
The authors in [9] present a new approach to build clusters for Wireless Sensor Networks (WSN). The algorithm is based on the k-means method, which is well known as a clustering technique. K-means clustering tends to find clusters of comparable spatial extent (density clustering). They try to enhance the clustering process by selecting as cluster heads the nodes that are central and have a high level of energy. This gives the same QoS results as the K-means approach while reducing energy consumption and prolonging the lifetime of the sensor network. For simulation purposes, the authors implemented their approach on the OLSR routing protocol. The proposed approach seems to give better results than the MaxMin approach.
In [10] an improvement of the OLSRMaxMin2C protocol was proposed by introducing the cost of energy. The main objective of this contribution, called OLSRMaxMin2C/Energy, is the optimization of energy consumption. It consists of dividing the network into clusters. Cluster heads are elected according to their IDs and their residual energies (battery levels); the algorithm optimizes the number of live nodes by always choosing the appropriate nodes for each task in the network.

PROBLEM FORMULATION
The development of technology and the revolution in wireless communication have brought mobile devices into all areas of human activity. Connecting these mobiles in ad hoc mode becomes a necessity, because of its simplicity of deployment and the absence of a preexisting and expensive infrastructure. When the number of mobiles increases and the topology changes rapidly because of the high mobility of nodes, a proactive routing protocol cannot support the evolution of the network, since it generates more control messages, larger routing tables, etc. Consequently the network becomes more fragile and no longer ensures a minimum quality of service. Developing or refining methods that overcome these obstacles becomes a necessity. Many studies on routing optimization adopt clustering to reduce the costs produced by density and mobility. Clustering distributes the mobile nodes into groups, which reduces the information stored and processed within a cluster. Several clustering techniques can be found in the literature; K-means is among the best-known methods [11,12] used in MANETs. It allows clusters to be more stable and their centers to be better chosen. Choosing a cluster head that is less mobile and dense without taking into account its residual energy level can isolate all cluster members from the rest of the network if its battery is exhausted [13]. Including energy in the cluster construction process avoids electing a node that is well placed but has a weak energy level.
In this paper we propose the use of K-means to produce optimal clusters. This method uses the Euclidean distance to generate cluster centers. In our proposal we introduce into the K-means algorithm the three stabilization parameters (mobility, density, energy) as a distance vector, and the cluster head is elected according to these three parameters. The center of the cluster is the node that best satisfies the three stability parameters within its group: it is less mobile, denser, and has an energy level that increases the life of the cluster. This approach builds clusters that are highly resistant to the structural constraints of the ad hoc network. It is implemented in the standard OLSR protocol, giving rise to a new protocol named OLSR-Kmeans-SDE.

CLUSTERING
Spatial clustering algorithms can be classified into four categories: partition based, hierarchical based, density based and grid based [14,15]. According to [16], clustering in ad hoc networks can be defined as a theoretical arrangement of dynamic nodes, corresponding to one or more specific properties, into different subsets called "clusters". An element of a cluster is characterized by a strong similarity with the other members of its group and a strong dissimilarity with respect to the members of other groups [17]. Each cluster is identified by a particular node called the "cluster head". Clustering allows a node to store only part of the network topology information rather than all of it. This simplifies the processing of the global topology [18], and it reduces the size of routing tables and consequently the number of control messages generated by the routing system [19].
The use of clustering in MANETs has several advantages [20,21]. Usually a cluster structure allows a node to play one of three roles:
• Cluster head: a cluster head is elected for each cluster during the cluster formation process. Each cluster should have one and only one cluster head.
• Gateway: a node is called a gateway node of a cluster if it knows that it has a bidirectional or unidirectional link to a node from another cluster.
• Members: all nodes within a cluster except the cluster head are called members of this cluster.

DESCRIPTION OF THE CLUSTERING APPROACH
In the absence of any assumption about the distribution of nodes in a mobile ad hoc network, an unsupervised classification of nodes into classes is required. We propose a model based on geometric considerations (grouping geographically close nodes). This proposal requires defining a measure of proximity between nodes. We begin by introducing the concepts of similarity, dissimilarity and distance between nodes. We then use the concepts of intra-class and inter-class inertia to show mathematically how a classification method builds homogeneous and distinct groups, by translating the task into a simple minimization problem.
$$C_a \cap C_b = \emptyset \quad \text{for all } (a, b) \text{ such that } a \neq b \qquad (3)$$

Property (3) expresses that the clusters formed are disjoint: each object of X belongs to a single cluster of C. To define the homogeneity of a set of observations, it is necessary to measure how alike two observations are. We therefore introduce the concepts of dissimilarity and similarity. A dissimilarity is a function d that associates a value in $\mathbb{R}^{+}$ with each pair $(x_i, x_j)$ such that:

$$d(x_i, x_j) = d(x_j, x_i), \qquad d(x_i, x_j) = 0 \Rightarrow x_i = x_j$$

Conversely, another possibility is to measure the resemblance between observations using a similarity. A similarity is a function s that associates a value in $\mathbb{R}^{*}$ with each pair $(x_i, x_j)$ and that takes larger values the more the two observations resemble each other. The choice of the distance is a key issue for classification methods. To offer a relevant measure of similarity between elements, it is necessary to make good use of the information available at the nodes.
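The Minkowski distance is the most widely used to determine the similarity between elements:

$$d_l(x_i, x_j) = \left( \sum_{k} \left| v_k(x_i) - v_k(x_j) \right|^{l} \right)^{1/l}$$

where $v_k(x_i)$ is the value of the object $x_i$ on the variable $v_k$.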
Depending on the value taken by the parameter l, we talk about:
• Euclidean distance (l = 2);
• Manhattan distance (l = 1);
• Chebyshev distance (l = ∞).
We note that the metrics commonly used to analyze ad hoc performance, such as density, mobility and energy, can be used to express the distance. Our objective is to divide the nodes into homogeneous and distinct clusters. To do this, we start from the definition of the inertia of the network, which can be modeled as a cloud of points.
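As an illustration of the Minkowski distance and its three special cases, here is a minimal Python sketch. The function name `minkowski` and the two sample (stability, density, energy) vectors are hypothetical and chosen only for the example; the paper does not prescribe how the three metrics are scaled.

```python
import math

def minkowski(x, y, l=2.0):
    """Minkowski distance of order l between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if math.isinf(l):
        # Limit case l = infinity: Chebyshev distance (largest coordinate gap).
        return max(diffs)
    return sum(d ** l for d in diffs) ** (1.0 / l)

# Hypothetical (stability, density, energy) vectors, assumed scaled to [0, 1].
node_a = (0.0, 0.0, 0.0)
node_b = (0.3, 0.4, 0.0)

print("Manhattan :", minkowski(node_a, node_b, 1))
print("Euclidean :", minkowski(node_a, node_b, 2))
print("Chebyshev :", minkowski(node_a, node_b, math.inf))
```

Because the three coordinates may have very different ranges in practice, some normalization is probably needed before such a distance is meaningful; that choice is left open here.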
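Consider a network of n nodes $(x_1, \dots, x_n)$, and let $U_G$ denote the centroid of the cloud of nodes. The inertia of the cloud is by definition the sum of the squared distances of the nodes from this centroid:

$$I = \sum_{i=1}^{n} \left\| x_i - U_G \right\|^2$$

Assume that the network consists of P distinct clusters $C_1, \dots, C_P$, each cluster $C_k$ containing $n_k$ nodes and having centroid $U_{C_k}$. According to the theorem of Huygens, the total inertia of the cloud of nodes can then be decomposed as follows:

$$I = \underbrace{\sum_{k=1}^{P} \sum_{x_i \in C_k} \left\| x_i - U_{C_k} \right\|^2}_{I_{intra}} + \underbrace{\sum_{k=1}^{P} n_k \left\| U_{C_k} - U_G \right\|^2}_{I_{inter}}$$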

The first term, called the intra-cluster inertia, is the sum of the squared distances between the nodes and the centroid of their cluster. A low intra-cluster inertia indicates that the nodes of a same cluster are close to one another (the clusters are homogeneous). The second term, called the inter-cluster inertia, is the sum of the squared distances between the cluster centroids and the global centroid, weighted by cluster size; it measures the degree of separation of the clusters.
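The decomposition of the total inertia into intra-cluster and inter-cluster terms can be checked numerically. The sketch below is only illustrative: the helper names and the five two-dimensional points are made up for the example.

```python
def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def inertia_terms(clusters):
    """Return (total, intra, inter) inertia for a list of clusters of points."""
    all_points = [p for c in clusters for p in c]
    g = centroid(all_points)                      # global centroid U_G
    total = sum(sq_dist(p, g) for p in all_points)
    intra = 0.0
    inter = 0.0
    for c in clusters:
        u = centroid(c)                           # cluster centroid U_Ck
        intra += sum(sq_dist(p, u) for p in c)
        inter += len(c) * sq_dist(u, g)           # weighted by cluster size
    return total, intra, inter

# Hypothetical node positions grouped into two clusters.
clusters = [
    [(0.0, 0.0), (2.0, 0.0)],
    [(10.0, 4.0), (10.0, 6.0), (13.0, 5.0)],
]
total, intra, inter = inertia_terms(clusters)
print(total, intra, inter)
```

Since the total does not depend on how the nodes are grouped, any regrouping that lowers the intra-cluster term raises the inter-cluster term by the same amount.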
From a formal point of view, the optimal partition is the one:
• that minimizes the intra-class inertia;
• or, equivalently, that maximizes the inter-class inertia.
Thus the optimal partition would be defined as follows:

$$C^{*} = \arg\min_{C} I_{intra}(C) = \arg\max_{C} I_{inter}(C)$$

The objective of classification algorithms based on this principle is the search for the optimal partition. In practice it is impossible to generate all possible clusterings, for evident reasons of complexity. We therefore seek a scheme sufficiently close to the optimum. It is obtained iteratively, by improving a randomly selected initial scheme through the reallocation of objects around moving centers. To partition the nodes into clusters, we used this technique (iterative reallocation), based on the k-means algorithm.
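To give a sense of why exhaustive search is ruled out, the number of ways to split n labelled nodes into P non-empty clusters is the Stirling number of the second kind. The short sketch below computes it with the standard recurrence; the function name and the sample sizes are chosen only for illustration.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def stirling2(n, p):
    """Number of ways to partition n labelled nodes into p non-empty clusters."""
    if n == p:
        return 1
    if p == 0 or p > n:
        return 0
    # Node n either joins one of the p existing clusters or opens a new one.
    return p * stirling2(n - 1, p) + stirling2(n - 1, p - 1)

for n in (10, 20, 50):
    print(n, "nodes, 5 clusters:", stirling2(n, 5), "partitions")
```

The count grows roughly like P^n / P!, so even a modest network cannot be searched exhaustively, which motivates the iterative reallocation used by k-means.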

K-means method
The k-means algorithm [15] is a partitioning method widely used in various application areas. Starting from P separate clusters, the k-means algorithm iteratively assigns the objects $(x_1, \dots, x_n)$ to P cluster centers $(u_1, \dots, u_P)$ and then calculates the positions of the new centers. The stopping criterion is fixed by the user and can be:
• a limited number of iterations is reached;
• the algorithm converges: the clusters formed remain the same between two successive iterations;
• the intra-cluster inertia no longer improves between two iterations (the algorithm is sufficiently close to convergence).

To justify the K-means algorithm in view of our goal of minimizing the intra-cluster inertia, we show that the redefinition of the cluster centers followed by the reallocation of nodes (the two sub-steps of the K-means algorithm) results in a decrease in intra-class inertia. We start from the following situation at the end of a given step j:
• the cluster centers $(u_1^j, \dots, u_P^j)$ have been calculated;
• the classes $(C_1^j, \dots, C_P^j)$ were obtained by assigning to each center $u_k^j$ its $n_k^j$ nearest nodes.

We define the following quantity at the end of step j:

$$Q^{j} = \sum_{k=1}^{P} \sum_{x_i \in C_k^j} \left\| x_i - u_k^j \right\|^2$$

The next iteration, which redefines the centers and reassigns the nodes, requires two sub-steps.

a. Recalculate the cluster centers $(u_1^{j+1}, \dots, u_P^{j+1})$ from the points belonging to each of the clusters $(C_1^j, \dots, C_P^j)$, which possess respectively $n_k^j$ elements. We have:

$$u_k^{j+1} = \frac{1}{n_k^j} \sum_{x_i \in C_k^j} x_i, \qquad W^{j+1} = \sum_{k=1}^{P} \sum_{x_i \in C_k^j} \left\| x_i - u_k^{j+1} \right\|^2$$

where $u_k^{j+1}$ is the center of the cluster $C_k^j$ and $W^{j+1}$ is the intra-cluster inertia associated with the clusters $(C_1^j, \dots, C_P^j)$. Writing $x_i - u_k^j = (x_i - u_k^{j+1}) + (u_k^{j+1} - u_k^j)$, we will have:

$$Q^{j} = W^{j+1} + \sum_{k=1}^{P} n_k^j \left\| u_k^{j+1} - u_k^j \right\|^2 + 2 \sum_{k=1}^{P} \left( u_k^{j+1} - u_k^j \right)^{T} \sum_{x_i \in C_k^j} \left( x_i - u_k^{j+1} \right)$$

the inner sum $\sum_{x_i \in C_k^j} (x_i - u_k^{j+1})$ being the null vector by definition of $u_k^{j+1}$. Hence $Q^{j} \geq W^{j+1}$.

b. Reassign the nodes to the nearest centers. We then obtain new clusters $(C_1^{j+1}, \dots, C_P^{j+1})$ and we define:

$$Q^{j+1} = \sum_{k=1}^{P} \sum_{x_i \in C_k^{j+1}} \left\| x_i - u_k^{j+1} \right\|^2$$

After reassignment no distance increases, because each node $x_i$ is assigned to the cluster center that minimizes the gap $\| x_i - u_k^{j+1} \|^2$. We therefore have:

$$W^{j+1} \geq Q^{j+1}$$

Thus, for each j we have proved the following inequality:

$$Q^{j} \geq W^{j+1} \geq Q^{j+1}$$

So we have in particular:

$$W^{j+1} \geq W^{j+2}$$

This shows that each iteration of the algorithm improves, or at least does not degrade, the classification of the nodes in the sense of the $I_{intra}$ criterion. Because the intra-cluster inertia is bounded below by that of the optimal partition, the margin of improvement is finite, which implies that the algorithm converges.

The disadvantage of the k-means method is that the number of clusters is a parameter of the algorithm. Its value is not obvious for a mobile ad hoc network, where nodes can join or leave the network randomly. Hence the need to adapt the method so that it is applicable to our case:
• Initial number of clusters: at the beginning each node represents its own cluster. Thereafter we make a series of ascending partitions, combining the nodes belonging to the same neighborhood into the same cluster until a stable number of clusters is reached.
• Distance: we can use the density, mobility or energy of the nodes to express the distance metric.
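The monotone decrease argued above can be observed on a small example. The following is a minimal sketch of standard k-means in plain Python, not the authors' OLSR implementation; the 30 random (stability, density, energy) vectors, the seeds and the function names are assumptions made only for illustration.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean(points):
    """Coordinate-wise mean of a non-empty list of vectors."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def kmeans(points, k, max_iter=50, seed=0):
    """Plain k-means; returns labels, centers and the inertia after each reassignment."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)          # random initial scheme
    labels, history = None, []
    for _ in range(max_iter):
        # Sub-step b: assign every node to its nearest center.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                      for p in points]
        history.append(sum(sq_dist(p, centers[l])
                           for p, l in zip(points, new_labels)))
        if new_labels == labels:             # clusters unchanged: converged
            break
        labels = new_labels
        # Sub-step a: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:                      # an empty cluster keeps its old center
                centers[c] = mean(members)
    return labels, centers, history

# Hypothetical node vectors (stability, density, energy), assumed scaled to [0, 1].
data_rng = random.Random(1)
nodes = [(data_rng.random(), data_rng.random(), data_rng.random())
         for _ in range(30)]
labels, centers, history = kmeans(nodes, k=3)
print("inertia per iteration:", [round(h, 4) for h in history])
```

The printed sequence should never increase, which is the behaviour the inequality chain predicts; the final value depends on the random initial centers, so several restarts may be worthwhile.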

MOBILITY METRIC
The proposed metric [22] calculates the stability of a node from four parameters: the nodes that leave the coverage area of the studied node, the nodes that join this area, the nodes that approach the studied node, and the nodes that move away from the studied node while staying in its coverage area. The first two parameters are obtained from the collected control messages. The last two are calculated by comparing the received power of two successive messages (e.g. Hello messages in the OLSR protocol). We define the following parameters that characterize our metric:
• $N_{con}$: the number of nodes that converge on the studied node.
• $N_{div}$: the number of nodes that diverge from the studied node toward the outside.
• $N_{in}$: the number of nodes that have joined the coverage area of the studied node.
• $N_{out}$: the number of nodes that have left the coverage area of the studied node.
Following these four types of movement, we have created a metric that ranks each node into one of four stability classes, defined as shown in Table 1.
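The four counters can be derived from two successive rounds of Hello messages. The sketch below is one hypothetical way to do it: it assumes each round is summarized as a dictionary mapping a neighbour identifier to the received power in dBm, and that a stronger signal means the neighbour has moved closer. Neither the data layout nor the function name comes from the paper.

```python
def count_movements(prev_power, curr_power):
    """Count neighbour movements between two successive Hello rounds.

    prev_power, curr_power: dict neighbour_id -> received power in dBm
    (assumed representation; higher power is taken to mean a closer node).
    """
    prev_ids, curr_ids = set(prev_power), set(curr_power)
    n_in = len(curr_ids - prev_ids)    # heard now, not heard before
    n_out = len(prev_ids - curr_ids)   # heard before, no longer heard
    n_con = n_div = 0
    for nid in prev_ids & curr_ids:    # neighbours that stayed in range
        if curr_power[nid] > prev_power[nid]:
            n_con += 1                 # signal got stronger: converging
        elif curr_power[nid] < prev_power[nid]:
            n_div += 1                 # signal got weaker: diverging
    return {"N_con": n_con, "N_div": n_div, "N_in": n_in, "N_out": n_out}

# Hypothetical measurements at the studied node.
prev = {"B": -70.0, "C": -60.0, "D": -80.0}
curr = {"B": -65.0, "C": -72.0, "E": -85.0}
counts = count_movements(prev, curr)
print(counts)
```

In a real deployment the power readings are noisy, so a threshold on the power difference would probably be needed before counting a neighbour as converging or diverging.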

Among the four classes, the first reflects the best stability for the node in question and the last represents poor stability. Intuitively the second is better than the third: first because $N_{in}$ is greater than $N_{out}$, and second because, even with $N_{div} > N_{con}$, the diverging nodes stay in the coverage area of the studied node. We then determine a stability degree metric that identifies, within each class, the most stable node. We use in formula (7) the flow coefficient γ defined in [23], and we divide the range of this coefficient into 4 intervals as shown in Table 1. The stability degree of node i is then given by formula (7), defined in our proposal [23]. The disadvantage of this approach is that the parameter must be fixed between three values ( 0. 25

DESCRIPTION OF THE ALGORITHM
We present below our algorithm in detail (k-means improved to make it applicable to the partition of nodes into clusters). Changes to the standard algorithm are needed in order to use the k-means method for grouping the nodes of a MANET into clusters; these changes aim to determine the K parameter of the algorithm. At first it is assumed that each node in the MANET is its own cluster. A sequence of ascending partitions follows, in which nodes from the same neighborhood are gathered into the same cluster, until a final number of clusters is reached. Thereafter, the K-means algorithm is used to generate more stable clusters with their cluster heads. The parameters used in the K-means calculation are the stability, density and residual energy of a mobile, so each node is identified by a vector (stability, density, energy). This vector is the basis for expressing the distance between nodes.
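The initial grouping of nodes by neighborhood, followed by the choice of a center in each group, can be sketched as follows. This is a simplified, single-pass illustration rather than the authors' implementation: the adjacency-list representation, the sample topology and feature values, and the election rule (the member with the largest sum of stability, density and energy, assuming all three are scaled so that larger is better) are assumptions made for the example.

```python
def two_hop_neighbours(adj, node):
    """Nodes reachable from `node` in at most two hops (excluding itself)."""
    one_hop = set(adj[node])
    two_hop = set()
    for n in one_hop:
        two_hop.update(adj[n])
    return (one_hop | two_hop) - {node}

def initial_partition(adj):
    """Greedy pass: each remaining center absorbs its unassigned two-hop neighbours."""
    unassigned = list(adj)             # every node starts as its own center
    clusters = {}
    while unassigned:
        centre = unassigned.pop(0)
        reach = two_hop_neighbours(adj, centre)
        members = [n for n in unassigned if n in reach]
        clusters[centre] = [centre] + members
        # Absorbed nodes are removed from the list of candidate centers.
        unassigned = [n for n in unassigned if n not in members]
    return clusters

def elect_head(members, features):
    """Assumed rule: pick the member with the best combined feature score."""
    return max(members, key=lambda n: sum(features[n]))

# Hypothetical topology: a chain A-B-C-D-E-F plus an isolated node G.
adj = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"],
       "E": ["D", "F"], "F": ["E"], "G": []}
# Hypothetical (stability, density, energy) vectors, scaled to [0, 1].
features = {"A": (0.2, 0.3, 0.9), "B": (0.8, 0.9, 0.7), "C": (0.5, 0.6, 0.4),
            "D": (0.6, 0.7, 0.5), "E": (0.4, 0.9, 0.3), "F": (0.1, 0.2, 0.8),
            "G": (0.5, 0.0, 0.5)}

clusters = initial_partition(adj)
heads = {c: elect_head(m, features) for c, m in clusters.items()}
print(clusters)
print(heads)
```

The isolated node ends up as its own cluster, which matches the rule that isolated nodes stay in the list of centers. The full algorithm would repeat the grouping around the newly elected heads until the partition stops changing.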
We present our algorithm based on K-means to partition the nodes of a MANET into clusters.

Proposed algorithm
• Input: mobile ad hoc network of n nodes.
• Output: network virtually partitioned into P clusters.

Step 0
a. Initialize n centers $(u_1^0, \dots, u_n^0)$: each node is a cluster.
b. Create the initial partition $P^0 = \{C_1^0, \dots, C_n^0\}$.
c. Initialize l to 1. Assign to $u_l^0$ its two-hop neighbours: $C_l^0 = \{x \in \text{network} \mid d(x, u_l^0) \leq 2 \text{ hops}\}$.
d. Remove from the list of centers the nodes of $C_l^0$ assigned to the centroid $u_l^0$.
e. Move to the next center $u_{l+1}^0$.
f. Repeat operations c to e until all nodes are assigned.
g. Calculate the new centers of the k clusters obtained $(u_1^1, \dots, u_k^1)$: the nodes having the optimal values of (stability, density, energy).

Step t
a. Create the new partition $P^t = \{C_1^t, \dots, C_k^t\}$ by assigning to each center its two-hop neighbours.
b. Remove from the list of centers the centers that were assigned to other centers.
c. Add the isolated nodes to the list of centers.
d. Calculate the centers of the k clusters obtained $(u_1^{t+1}, \dots, u_k^{t+1})$.
e. Repeat operations a to d until a stable partition is achieved (the structure of partition $P^{t}$ equals that of $P^{t+1}$) or n iterations are reached.

We used the NS2.34 simulator to compare the quality of service of OLSR Kmeans-SDE with that of two other protocols (the classical Kmeans implemented in OLSR, and OLSR Med+ [24]). We validate this analysis by studying the behavior of the protocols in terms of packet delivery ratio (PDR), end-to-end delay (EED) and overhead [25]. Table 2 shows the simulation parameters.
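For readers who want to reproduce such an evaluation, the three indicators can be computed from per-packet send and receive records. The sketch below uses common definitions (PDR as delivered over sent data packets, EED as the mean delay of delivered packets, overhead as a raw count of control packets); these definitions, the record layout and the sample numbers are assumptions, and this is not an NS2 trace parser.

```python
def qos_metrics(sent, received, control_packets):
    """Compute PDR, mean end-to-end delay and overhead.

    sent:     dict packet_id -> send time in seconds
    received: dict packet_id -> receive time in seconds
    control_packets: number of routing control packets observed
    """
    delivered = [pid for pid in received if pid in sent]
    pdr = len(delivered) / len(sent) if sent else 0.0
    if delivered:
        eed = sum(received[pid] - sent[pid] for pid in delivered) / len(delivered)
    else:
        eed = 0.0
    return {"PDR": pdr, "EED": eed, "Overhead": control_packets}

# Hypothetical records: four data packets sent, three delivered.
sent = {1: 0.00, 2: 0.10, 3: 0.20, 4: 0.30}
received = {1: 0.05, 2: 0.18, 4: 0.36}
metrics = qos_metrics(sent, received, control_packets=120)
print(metrics)
```

Other studies normalize the overhead by the number of delivered data packets; which variant applies here would need to be checked against the cited definition.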

Packet delivery ratio
Figure 1 shows that when the network becomes dense and large, the number of lost packets increases because of various phenomena (topology changes, battery exhaustion, etc.). The retained structure and the refined algorithm of the proposed OLSR Kmeans-SDE protocol reduce the number of lost packets compared to our previous version.

End to end delay
Today, real-time applications are spreading through every field of human activity, so minimizing the delivery time of packets between the transmitter and the receiver becomes a necessity. The clustering method that we adopted and refined yields a small gain in transfer time (EED). As Figure 2 shows, the end-to-end delay (EED) of all the routing protocols increases with the number of nodes, and OLSR Kmeans-SDE performs best.

Overhead
Figure 3 shows the overhead of the three protocols as a function of network density. OLSR Kmeans-SDE generates the most overhead, followed by OLSR Kmeans and OLSR Med+. Dividing the nodes into groups allows a significant reduction in the control messages broadcast when the network becomes denser; however, the proposed OLSR Kmeans-SDE protocol has more information to exchange (mobility, density, energy). This explains its slightly higher overhead compared with the other protocols.

CONCLUSION
Clustering is one of the most important techniques used to organize a network into groups. It can reduce the complexity of node management, simplify the routing of information and increase the QoS in MANETs. We used a robust method (K-means) to group the nodes into several clusters. We confirmed that k-means improves the classification of nodes by showing theoretically that the method reduces the intra-cluster inertia (and therefore increases the inter-cluster inertia) between two successive iterations. In this paper we continued to refine our proposed protocol by adding more stability parameters. The routing protocol (OLSR Kmeans-SDE) brings some improvement in EED and PDR. Obtaining a protocol that meets all needs in terms of quality of service is not straightforward: we gain on one type of performance and lose on another. We started our study by implementing two clustering techniques (k-means and k-medoid), and we added metrics to the algorithms to obtain a robust protocol, which is what we achieved in this research. In future work, we plan to minimize the overhead of OLSR Kmeans-SDE.