Congestion bottleneck avoid routing in wireless sensor networks

ABSTRACT


Wireless sensor network as a graph
The WSN is represented by a planar graph G(V, E), where V is the vertex set of N nodes and E is the edge set. The sensor nodes are identified and represented by the vertices 1, 2, …, N. The vertex or node set V is given by,

$$V = \{1, 2, \ldots, N\} \tag{1}$$

The edge set E is the collection of all the links of the network. Edge element e(i, j) represents the edge between nodes i and j for all i and j in the range 1 to N. The edge value e(i, j) is set to 1 if nodes i and j are within the transmission range of each other. Otherwise, e(i, j) is set to ∞. Therefore, e(i, j) = 1 means nodes i and j are one-hop neighbors with connectivity between them, and e(i, j) = ∞ means i and j cannot communicate directly. Only connecting edges are shown in the graph. One-hop neighbor nodes are also called adjacent nodes.

Bidirectional links are assumed between one-hop neighbors. Therefore, e(i, j) = e(j, i). We take e(i, i) = ∞ to avoid self-loops. The collection of e(i, j)'s forms the adjacency or connectivity matrix for the graph of the network.
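The construction of the adjacency matrix can be sketched as follows. This is a minimal illustration, assuming node positions are known as 2-D coordinates and that all nodes share one transmission range; the function name `build_adjacency`, the coordinates and the range are hypothetical and not taken from the paper. Indices are 0-based, so row i corresponds to node i + 1.

```python
import math

INF = math.inf

def build_adjacency(positions, tx_range):
    """Return the N x N matrix e with e[i][j] = 1 for one-hop neighbours, INF otherwise."""
    n = len(positions)
    e = [[INF] * n for _ in range(n)]      # e[i][i] stays INF: no self-loops
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(positions[i], positions[j]) <= tx_range:
                e[i][j] = e[j][i] = 1      # bidirectional link
    return e

# Hypothetical layout: three sensors on a line, 10 m transmission range.
positions = [(0.0, 0.0), (8.0, 0.0), (16.0, 0.0)]
e = build_adjacency(positions, tx_range=10.0)
```

Here the first and third sensors are 16 m apart, so their entry remains ∞ while both are adjacent to the middle sensor.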

Path from the source to the destination
A path from a source node s to a destination node t is a sequence of non-repeating adjacent (one-hop) nodes starting from s and ending at t. Non-repetition of nodes ensures that the path is free of loops. Adjacency of consecutive nodes along the path ensures continuous connectivity from the source to the destination. There can be several paths from s to t in a given graph (network).
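The two conditions in this definition (no repeated node, every consecutive pair adjacent) can be checked mechanically. The sketch below assumes the adjacency matrix convention described earlier (1 for a link, ∞ otherwise) with 0-based indices; the function name `is_loop_free_path` and the four-node matrix are hypothetical.

```python
import math

INF = math.inf

def is_loop_free_path(e, path, s, t):
    """True if path starts at s, ends at t, repeats no node and uses only existing links."""
    if not path or path[0] != s or path[-1] != t:
        return False
    if len(set(path)) != len(path):        # a repeated node would form a loop
        return False
    return all(e[a][b] == 1 for a, b in zip(path, path[1:]))

# Hypothetical 4-node network with links 0-1, 0-2, 1-2 and 2-3.
e = [
    [INF, 1,   1,   INF],
    [1,   INF, 1,   INF],
    [1,   1,   INF, 1  ],
    [INF, INF, 1,   INF],
]
```

For this matrix, `[0, 1, 2, 3]` and `[0, 2, 3]` are both valid paths from node 0 to node 3, while `[0, 3]` is not because no direct link exists.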

Measure of congestion level at a node
Several metrics are used to measure the congestion level or the degree of congestion at a node (Akyildiz et al., 2002). Without any loss of generality, we take the queue length of packets at a given node as the measure of the congestion level at that node. It is assumed that the buffers at the nodes are sufficiently large so that no packets are lost in any queue due to overflow. It is also assumed that the queue lengths change relatively slowly with time, so that they remain nearly constant during the calculation and rediscovery of the optimal paths. In general, the congestion levels of the sensor nodes remain almost the same within a session. The centralized controller, the BS, collects this information periodically. This period depends on the applications and characteristics of the network.

The present congestion level of node i is represented by the symbol $Q_i$ for i = 1 to N. The congestion array for the entire WSN is written as,

$$Q = [Q_1, Q_2, \ldots, Q_N] \tag{2}$$

Congestion levels along a path
Once the $Q_i$'s are known for all the nodes, consider all possible loopless paths from a specific source node s to a given destination node t. Let the number of distinct paths from s to t be M. Let path j be represented by $P_j$ as,

$$P_j = [s, n_{j,1}, n_{j,2}, \ldots, n_{j,k}, \ldots, n_{j,L(j)}, t] \tag{3}$$

for j = 1 to M. In (3), L(j) is the total number of intermediate nodes along $P_j$ and $n_{j,k}$ is the k-th intermediate node of path $P_j$ for k = 1 to L(j), with $n_{j,k} \in V$. In $P_j$ all the nodes along the path are connected. That is, the corresponding elements of the adjacency matrix are 1's as,

$$e(n_{j,k-1}, n_{j,k}) = 1 \tag{4}$$

for k = 1 to L(j) + 1. In (4), $n_{j,0} = s$ and $n_{j,L(j)+1} = t$. Equation (4) simply means that any two adjacent nodes in $P_j$ are within the communication range of each other.

The congestion array (or vector) of a path is the sequence of the congestion levels of the nodes along that path. For the path specified by (3), the array that represents the congestion levels is represented by $CL_j$ as,

$$CL_j = [Q_s, Q_{j,1}, Q_{j,2}, \ldots, Q_{j,L(j)}, Q_t] \tag{5}$$

Here, the source and the destination nodes are fixed (specified) for all possible paths from s to t. Since the packet stops at t, the congestion at t is not relevant for the packet travelling from s to t. Therefore, the term $Q_t$ in (5) can be ignored. In defining the effective congestion vector for the purpose of determining the optimal path, we exclude $Q_t$ from (5). The resulting effective congestion vector is,

$$C_j = [Q_s, Q_{j,1}, Q_{j,2}, \ldots, Q_{j,k}, \ldots, Q_{j,L(j)}] \tag{6}$$

Here, $Q_{j,k}$ is the congestion level of node $n_{j,k}$ in a proper unit and k varies from 1 to L(j). That is, $Q_{j,k} = Q_{n_{j,k}}$. Thus $Q_{j,k} \in Q$.

Example 1: To demonstrate the formation of the $P_j$'s and $C_j$'s, a simple network is shown in Figure 1.
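As an illustration of how the effective congestion vector of a path is formed, the sketch below reads the congestion level of every node on the path except the destination. The function name `effective_congestion_vector`, the node labels and the queue lengths are hypothetical values chosen for illustration, not the data of Figure 1.

```python
def effective_congestion_vector(path, Q):
    """path = [s, n_1, ..., n_L, t]; Q maps node -> queue length.

    Returns the congestion levels of s and of the intermediate nodes,
    dropping the destination because the packet stops there.
    """
    return [Q[node] for node in path[:-1]]

# Hypothetical queue lengths; the source value is already set to 0.
Q = {1: 0, 2: 40, 3: 30, 4: 20, 5: 10}
vector = effective_congestion_vector([1, 3, 4, 5], Q)
print(vector)  # [0, 30, 20]
```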
Here, source s = 1 and the destination t = 5 with number of nodes N = 5 and the number of distinct paths, M = 4. Congestion at source is taken as 0, which will be explained later.

Maximum congestion level of a path
The maximum congestion level of path $P_j$, represented by the variable $R_j$, is defined as the maximum of the array $C_j$. That is,

$$R_j = \max(C_j) \tag{7}$$

Because the congestion at the source is taken as 0, (7) can also be expressed as,

$$R_j = \max_{1 \le k \le L(j)} Q_{j,k} \tag{8}$$

This gives the maximum of $Q_{j,k}$ over k in the range 1 to L(j). Thus $R_j$ is determined by finding the maximum of the congestion levels of the nodes forming the path, excluding the source and the destination nodes. In Example 1, $R_1 = 40$, $R_2 = 40$, $R_3 = 30$ and $R_4 = 40$. The values $R_j$ for j = 1 to M form the array R,

$$R = [R_1, R_2, \ldots, R_M] \tag{9}$$

Our objective is to find that index j, say J, where $R_J$ is the minimum of the array R. This can be stated as,

$$J = \arg\min_{1 \le j \le M} R_j \tag{10}$$

In Example 1, min(R) occurs at index location 3. Therefore J = 3, $R_J = 30$ and the optimal path is $P_J = P_3$. Substituting for $R_j$ from (8) in (10), we get,

$$J = \arg\min_{1 \le j \le M} \; \max_{1 \le k \le L(j)} Q_{j,k} \tag{11}$$

Once J is obtained, the corresponding minimum among the $R_j$'s is $R_J$ (the J-th element of array R) and it can be expressed as,

$$R_J = \min_{1 \le j \le M} R_j \tag{12}$$

Once J is known, the optimal path from s to t is $P_J$ as specified in (3). This path is designated as OP(t). The values of J, $P_J$ and $R_J$ for a given source s depend on the destination t. Therefore, we designate J as given by (10) as J(t) and the corresponding minimized maximum congestion level value $R_J$ as f(t). Then we rewrite (10) and (12) as,

$$J(t) = \arg\min_{1 \le j \le M} R_j \tag{13}$$

$$f(t) = R_{J(t)} = \min_{1 \le j \le M} R_j \tag{14}$$

That is, f(t) can be written as,

$$f(t) = \min_{1 \le j \le M} \; \max_{1 \le k \le L(j)} Q_{j,k} \tag{15}$$

When we select the optimal path OP(t), the relatively higher congestion level nodes are avoided while travelling from s to t. The variable f(t) from (15) represents the maximum congestion level of path OP(t) from s to t.
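The min-max selection can be sketched in a few lines. The congestion vectors below are hypothetical; they are chosen only so that the path maxima match the values reported for Example 1 (40, 40, 30, 40) and are not the actual node values of Figure 1. The function name `select_min_max_path` is likewise illustrative.

```python
def select_min_max_path(vectors):
    """Return (J, R_J): the 1-based index of the path whose maximum is smallest, and that maximum."""
    R = [max(c) for c in vectors]                      # maximum congestion level of each path
    J = min(range(len(R)), key=lambda j: R[j]) + 1     # first index attaining the minimum
    return J, R[J - 1]

# Hypothetical effective congestion vectors giving R = [40, 40, 30, 40].
vectors = [[0, 40], [0, 20, 40], [0, 30, 10], [0, 40, 30]]
J, R_J = select_min_max_path(vectors)
print(J, R_J)  # 3 30
```

When several paths share the same minimum, this sketch keeps the first one; the source does not state a tie-breaking rule.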

Objective
The objective is to determine f(t) and the optimal path OP(t) for a given s and for all t ∈ {1, …, N} \ {s}. We designate this path as the Congestion Bottleneck Node Avoid Path (CBNAP) and designate the method to determine CBNAP as the CBNAP algorithm.

DETERMINATION OF CBNAP

Exhaustive search method
In general, for a given WSN with known topology, we can enumerate all possible paths from a given source node s to a destination node t. Along each path, we can find the node that has the highest congestion level among all the nodes of that path. This gives the maximum congestion level of that path. After determining the maximum congestion level of each path from s to t, we can select the path that has the least maximum congestion level value. But this method is computationally impractical for large networks, because the number of possible paths can grow exponentially as N increases. Therefore, the dynamic programming approach is adopted to solve this problem.
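For small networks the exhaustive search can be written directly, which also makes its cost visible: every loop-free path is generated before the best one is picked. The sketch below is illustrative only; the function names, the neighbour-list representation `adj` and the five-node network with its queue lengths are assumptions, not data from the paper.

```python
import math

def all_simple_paths(adj, s, t):
    """Yield every loop-free path from s to t; adj maps node -> list of neighbours."""
    stack = [(s, [s])]
    while stack:
        node, path = stack.pop()
        if node == t:
            yield path
            continue
        for nxt in adj[node]:
            if nxt not in path:                        # never revisit a node
                stack.append((nxt, path + [nxt]))

def exhaustive_cbnap(adj, Q, s, t):
    """Return (path, max level) minimizing the largest intermediate congestion level."""
    best_path, best_r = None, math.inf
    for path in all_simple_paths(adj, s, t):
        r = max([0] + [Q[n] for n in path[1:-1]])      # source counted as 0, destination ignored
        if r < best_r:
            best_path, best_r = path, r
    return best_path, best_r

# Hypothetical 5-node network.
adj = {1: [2, 3], 2: [1, 5], 3: [1, 4], 4: [3, 5], 5: [2, 4]}
Q = {1: 5, 2: 40, 3: 30, 4: 20, 5: 10}
print(exhaustive_cbnap(adj, Q, 1, 5))  # ([1, 3, 4, 5], 30)
```

The longer path through nodes 3 and 4 wins because its worst node (30) is less congested than node 2 (40) on the shorter path.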

Dynamic programming approach
As usual, the source node is denoted by s. Let t be any other node reachable from s, with OP(t) as the optimal path from s to t and f(t) as the minimized maximum congestion level value of that path. The sub-problems are solved recursively.

Effective congestion at source s
Whatever the actual congestion level at s, all the paths have to start from s. There is no other option. The congestion level $Q_s$ is common to all paths starting from s. Therefore, for the purpose of comparison and calculation of the congestion levels of the paths, the effective $Q_s$ is set to zero as,

$$Q_s = 0 \tag{16}$$

The minimized maximum value of $Q_s$ is also 0. Therefore the corresponding f(s) = 0.

f(t) values for one-hop neighbours of s
One hop neighbours of s are shown in Figure 2. Here, t1, t2,…, tK are the one hop neighbours of s.

Figure 2. One hop neighbours of s
Since t1 is directly connected to s, path (s, t1) is a single link (single hop) path. The minimum as well as the maximum congestion level of path (s, t1) is $Q_s$ itself, which is zero as given by (16). Therefore,

$$f(t_1) = 0 \tag{17}$$

This relation holds good even when we have a two hop path from s to t1 through an intermediate node i as shown in Figure 4. Here, we have two paths (s, t1) and (s, i, t1). The corresponding maximum congestion levels are,

$$R(s, t_1) = Q_s = 0 \tag{18}$$

$$R(s, i, t_1) = \max(Q_s, Q_i) = Q_i \tag{19}$$

Since $Q_i \ge 0$, the smaller of the two values is still 0. Equations (18) and (19) can be combined to state an important property of f(.) as follows.

Property 1: When a node i belongs to the one hop neighbour set of s, then,

$$f(i) = 0 \tag{20}$$

Thus, the f(i)'s of one hop neighbours of s are directly calculated using (20). Let the one hop neighbours of s be grouped into a set designated as A0. That is,

$$A_0 = \{\, i : e(s, i) = 1 \,\} \tag{21}$$

Then, for i ∈ A0, the values f(i)'s are 0 as given by (20). Starting from the known values of f(i)'s (for i ∈ A0), the values of f(i)'s of the other nodes (i ∉ A0) are calculated using the principle of dynamic programming.

Calculation of f(t) by dynamic programming for any t
All the nodes of the network are grouped into two disjoint sets designated as A and B. Set A holds those nodes whose f(i)'s have already been calculated and are known. Thus the optimal paths OP(i)'s are known for i ∈ A. Set B holds those nodes whose f(i)'s are not known and are yet to be calculated. Unknown and uncalculated f(i)'s are initialized to ∞ so that they are excluded while calculating the minimum, as explained later.

Initialization of f(i)'s
The calculation of f(i)'s for all i's is an iterative process. Initially, the one hop neighbours of s are determined to get A0. The f(i)'s for i ∈ A0 are set to 0 and the f(i)'s for i ∉ A0 are set to ∞. These are the initial values of the f(i)'s. The initialization operations can be called iteration 0. For the first iteration, the previous iteration is taken as iteration 0. The values of f(i)'s for iteration 0 are the initial values, which are already known.

In the first iteration, consider a target node t that belongs to B. Now f(t) is ∞. Let M be the number of one hop neighbour nodes of t which are also members of set A. Let these nodes be denoted by $i_1, i_2, \ldots, i_M$ as shown in Figure 3. That is, $i_k \in A$ and $e(i_k, t) = 1$. If M = 0, then the next node from set B is taken as t and M is determined again. This process is repeated until M becomes greater than zero. In general these neighbour nodes will be all around t, but for ease of explanation they are shown in a single column to the left of t. The congestion levels $Q_{i_k}$ of the nodes $i_k$ are also marked in Figure 3 for k = 1 to M.
Once g(t) is calculated, it is compared with the existing value of f(t) (from the previous iteration) and the present f(t) is updated only if g(t) is less than f(t). That is,

$$f(t) = \begin{cases} g(t), & \text{if } g(t) < f(t) \\ f(t), & \text{otherwise} \end{cases} \tag{27}$$

Now t is removed from B and added to A. Set A grows and B contracts. The next t from B is then taken up and f(t) for this t is updated as given by (27). Once all the elements in B are covered (B becomes empty), the present iteration is over and the next iteration starts. In the next iteration, the same process as in the first iteration repeats, but with the updated set of f(i)'s.
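A single update step might be sketched as below. The sketch assumes that g(t) is the smallest value of max(f(i_k), Q_{i_k}) over the neighbours i_1, …, i_M of t that belong to set A, that is, the bottleneck of reaching t through i_k. This form is an assumption, chosen because it reproduces Property 1 and the min-max definition of f(t); the function name `relax` and the sample values are hypothetical.

```python
import math

def relax(t, neighbours_in_A, f, Q, pred):
    """Update f[t] and pred[t] in place; return True if f[t] changed."""
    if not neighbours_in_A:                   # M = 0: nothing to compare against yet
        return False
    # Assumed form of g(t): best bottleneck over the known neighbours of t.
    best = min(neighbours_in_A, key=lambda i: max(f[i], Q[i]))
    g = max(f[best], Q[best])
    if g < f[t]:                              # update only on strict improvement
        f[t], pred[t] = g, best
        return True
    return False

# Hypothetical state: nodes 2 and 3 are one hop neighbours of s = 1, node 4 is still unknown.
f = {1: 0, 2: 0, 3: 0, 4: math.inf}
Q = {1: 0, 2: 40, 3: 30, 4: 20}
pred = {}
changed = relax(4, [2, 3], f, Q, pred)
print(changed, f[4], pred[4])  # True 30 3
```

Node 4 is reached through node 3 because forwarding via node 3 (level 30) is less congested than forwarding via node 2 (level 40).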

Stopping criterion
During successive iterations, f(t)'s are updated according to (27). To express this in a compact form, let the collection of all f(t)'s for t = 1 to N for a given s be represented by the array F as,

$$F = [f(1), f(2), \ldots, f(N)]$$

Then, F is updated in successive iterations. The theoretical maximum number of iterations is (N−1) [15]. In practice, if F does not change from the previous to the present iteration, then F will not change in further iterations either. After this, the process can break out of the iteration loop and there is no need to complete the (N−1) theoretical iterations. To facilitate the termination of the iterations, the value of F at the end of the previous iteration is stored in Fold. At the end of the present iteration, the updated F is compared to Fold.
If F = Fold, the iteration loop is terminated. The optimal path is obtained using the predecessor vector pred of size N as usual [15]. Determination of f(t)'s and the pred vector is described in Algorithm 1. This is basically a centralized algorithm, but it can be converted into an equivalent distributed algorithm. The algorithm is a modification of the Bellman-Ford shortest path algorithm [15]. The closing steps of the iteration loop of Algorithm 1 are:

        If F == Fold
            Break (Go to 11)
        endif
    10. Store F in Fold for the comparison in the next iteration as, Fold = F;
    Endfor h //end of h loop
    11. Over

Once pred(t) is ready for t = 1 to N, the corresponding OP(t)'s are easily obtained [15] using the function get_TS(…) as given below.

    function TS = get_TS(pred, t, s)
    % Trace the predecessor vector back from t until the source s is reached.
    TS = [t];
    while t ~= s
        TS = [TS, pred(t)];
        t = pred(t);
    end
Vector TS gives the path from t to s. Path OP(t), which is the path from s to t, is obtained by reversing the sequence TS. Algorithm CBNAP, in association with function get_TS(…), gives the f(t)'s and the Congestion Bottleneck Node Avoid Paths, OP(t)'s, from s to all other nodes.

Once the f(t)'s are determined for t = 1 to N for a given s, those high congestion level nodes whose Q's are greater than max(F) are excluded from participating as intermediate nodes in the routing process in the present session. Thus these bottleneck nodes are prevented from acting as intermediate nodes during the discovery of the optimal path. Here, the OP(t)'s are the optimal paths from the BS to the sensors, and the reverses of the OP(t)'s provide the optimal paths from each sensor to the BS.

The updates are listed in Table 1, for t = 1 to 10. For lack of space, the column heading Update is represented by Ud and the variable pred(t) by P(t) in Table 1.
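The overall procedure can be sketched as a Bellman-Ford style loop in which the sum of edge weights is replaced by a running maximum of node congestion levels. This is a simplified illustration and not a transcription of Algorithm 1: it updates f in place instead of maintaining the sets A and B explicitly, it assumes the update value max(f(i), Q_i) for a neighbour i, and it assumes every node is reachable from s. The names `cbnap` and `get_path` and the five-node network are hypothetical.

```python
import math

def cbnap(adj, Q, s):
    """Return (f, pred) for source s. adj: node -> neighbours; Q: node -> queue length."""
    q = dict(Q)
    q[s] = 0                                   # effective congestion at the source
    f = {v: math.inf for v in adj}
    f[s] = 0
    pred = {s: s}
    for _ in range(len(adj) - 1):              # at most N-1 iterations
        f_old = dict(f)                        # Fold
        for t in adj:
            if t == s:
                continue
            for i in adj[t]:
                g = max(f[i], q[i])            # bottleneck of the path s -> ... -> i -> t
                if g < f[t]:
                    f[t], pred[t] = g, i
        if f == f_old:                         # F == Fold: converged, stop early
            break
    return f, pred

def get_path(pred, s, t):
    """Python counterpart of get_TS, already reversed to run from s to t."""
    ts = [t]
    while t != s:
        t = pred[t]
        ts.append(t)
    return ts[::-1]

# Hypothetical 5-node network with s = 1.
adj = {1: [2, 3], 2: [1, 5], 3: [1, 4], 4: [3, 5], 5: [2, 4]}
Q = {1: 5, 2: 40, 3: 30, 4: 20, 5: 10}
f, pred = cbnap(adj, Q, 1)
bottlenecks = sorted(v for v in adj if v != 1 and Q[v] > max(f.values()))
print(f)                      # {1: 0, 2: 0, 3: 0, 4: 30, 5: 30}
print(get_path(pred, 1, 5))   # [1, 3, 4, 5]
print(bottlenecks)            # [2]
```

In this small case max(F) = 30, so node 2 (level 40) is the only bottleneck node, and it does not appear as an intermediate node on any of the recovered paths.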
We can see the updated values of f(t) after each update.
After update 9, the F vector is the same as that of update 8. That is, F = Fold, and the process has converged. In this example, the main outer loop starting at step 8 of the CBNAP algorithm terminates after 3 iterations. Here, max(F) = 15 and the bottleneck nodes having Q values greater than 15 are nodes 2, 4, 6, 7 and 9. These nodes are excluded from acting as intermediate nodes in any optimal path originating from s. From Table 2, it can be clearly seen that nodes 2, 4, 6, 7 and 9 are absent as intermediate nodes in all the optimal paths.

COMPARISON WITH OTHER METHODS
A congestion avoiding route can also be determined by the simple 'GREEDY' method. The basic idea is to start from the source and repeatedly select the least congested neighbor node as the next forwarding node until the final destination is reached. The greedy method is fast, but it is not guaranteed to be optimal, because it decides locally and does not examine all possible alternatives. Another available solution is to use 'TARA' [9] to alleviate congestion in WSNs. Our method, CBNAP, is compared to the 'TARA' and 'GREEDY' methods.
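The greedy idea can be illustrated with the following sketch. It is not the implementation used in the comparison: the visited set, the direct delivery to t when t is a neighbour, and the dead-end handling are assumptions added to make the loop well defined, and the five-node network is hypothetical.

```python
def greedy_path(adj, Q, s, t):
    """Follow the least congested unvisited neighbour until t is reached."""
    path, visited = [s], {s}
    while path[-1] != t:
        current = path[-1]
        if t in adj[current]:                 # assumption: deliver directly when t is adjacent
            path.append(t)
            break
        candidates = [v for v in adj[current] if v not in visited]
        if not candidates:
            return None                       # dead end; this sketch does not backtrack
        nxt = min(candidates, key=lambda v: Q[v])
        visited.add(nxt)
        path.append(nxt)
    return path

# Hypothetical network: node 2 looks attractive locally but leads to the congested node 4.
adj = {1: [2, 3], 2: [1, 4], 3: [1, 5], 4: [2, 5], 5: [3, 4]}
Q = {1: 0, 2: 10, 3: 20, 4: 50, 5: 0}
path = greedy_path(adj, Q, 1, 5)
worst = max(Q[v] for v in path[1:-1])
print(path, worst)  # [1, 2, 4, 5] 50
```

In this network the greedy route has a worst intermediate level of 50, whereas the path 1, 3, 5 has a worst level of only 20, which shows how a locally best choice can miss the min-max optimum.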

Time complexity
The time complexity of CBNAP is $O(N^3)$ [15]. The experimental completion time taken to get the optimum result for successive values of N is shown in the graph of Figure 5. Here, the number of edges and the adjacency matrix are randomly generated. The congestion level values of the nodes are drawn from a uniform random distribution. From Figure 5, it can be seen that the time taken to generate the optimal solution increases as the number of nodes increases. The GREEDY method is faster than CBNAP, while TARA is slower. The time saved in CBNAP is about 15-20% when the number of nodes is in the range 160-180.

Load Balance Index
Load balancing is an effective technique for congestion control [16][17][18][19][20]. CBNAP selects paths with lower congestion levels. Therefore, when communication takes place using such a path, the overall load balance improves because only the less congested nodes carry the present load. The fairness of load balance is measured using the load balance index (LBI) [16]. LBI is defined as,

$$LBI = \frac{\left(\sum_{i=1}^{N} Q_i\right)^2}{N \sum_{i=1}^{N} Q_i^2}$$

where $Q_i$ is the congestion level at node i, for i = 1 to N.
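A small numerical check of the index is sketched below. It assumes the LBI takes the commonly used fairness form, the square of the sum of the Q_i's divided by N times the sum of their squares, which is consistent with an index of one for equal loads; the function name `load_balance_index` and the sample loads are hypothetical.

```python
def load_balance_index(Q):
    """Assumed LBI form: (sum of Q)^2 / (N * sum of Q^2), for a non-empty list Q."""
    total = sum(Q)
    squares = sum(q * q for q in Q)
    if squares == 0:
        return 1.0                 # all queues empty: treated here as perfectly balanced
    return total * total / (len(Q) * squares)

balanced = load_balance_index([10, 10, 10, 10])
skewed = load_balance_index([100, 0, 0, 0])
print(balanced, skewed)  # 1.0 0.25
```

With all load concentrated on one of four nodes the index drops to 1/N = 0.25, the lowest value this form can take for four nodes.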
When the loads are perfectly balanced (the $Q_i$'s are all equal), the LBI is one. On the other hand, the LBI is low when the distribution of congestion levels is highly skewed (unbalanced). In the simulation experiment, the minimum congestion level is kept constant in each trial. The Maximum Congestion Level (MCL) of the network is successively increased in steps of 100 and the corresponding load balance indices are calculated for the CBNAP, GREEDY and TARA algorithms. The MCL value represents the degree of unbalance of the pending traffic loads at the nodes. The simulation result is shown in Figure 6. Here, the LBI's are very nearly the same at lower values of MCL and diverge at higher values of MCL. From the plotted results, it can be seen that CBNAP provides better load balance. This is because CBNAP uses less congested nodes for constructing the paths, even though the path length may be longer.

ISSN: 2088-8708. Congestion bottleneck avoid routing in wireless sensor networks (Sanu Thomas). 4813

CONCLUSIONS
A centralized algorithm has been described for finding the minimized maximum congestion level paths from a given source to every other destination. Bottleneck nodes having higher congestion levels are excluded from acting as forwarding nodes. This centralized algorithm can be converted into an equivalent distributed algorithm that can be implemented at individual nodes. The proposed technique can also be applied to determine minimized maximum delay paths, minimized maximum cost paths, maximized minimum remaining energy paths and so on. The technique could also be adopted by vehicular traffic control systems in metropolitan cities to avoid densely congested junctions for the smooth flow of automotive traffic.