Comparative analysis of routing techniques in chord overlay network

Received Apr 3, 2020; Revised Feb 25, 2021; Accepted Mar 4, 2021

Overlay networks are not a new field of study. This domain of computing will someday drive P2P systems in application areas such as blockchain, energy trading, video multicasting, and distributed file storage. This study examines the two widely known methods of routing information employed in one such overlay network, Chord. Both routing modes (iterative and recursive) and their variations were simulated under no-churn and churn conditions, where churn denotes the leaving and joining of nodes. The routing parameter (successor list size) was varied for each routing technique. The results show that semi-recursive routing gives better routing performance under churn scenarios.


INTRODUCTION
Overlay networks are application-layer to application-layer networks whose traffic is carried from point to point over an underlay network such as the internet. This tiered approach allows applications to be built with new capabilities and more redundancy than would be practicable using the underlay network alone. Such applications include distributed file storage systems, video multicasting, and mobile P2P file-sharing applications. However, these systems face peculiar challenges, which include Byzantine faults (erratic behaviour of nodes) [1], [2], Sybil attacks [3], [4], and instability due to the effects of churn [5]-[8]. Churn in overlay networks refers to the joining and leaving of nodes. Overlay nodes are not failure-proof and may leave (becoming temporarily or permanently invisible to others in the overlay) and rejoin because of loss of connectivity, low power, or congestion of underlay components. Churn is of interest in distributed networks because it invariably affects the availability of services on the overlay platform: each node may be responsible for some of the information held on the overlay (for example, a group of files or data items).

This study looks at how nodes in the overlay maintain their connectivity, i.e., the routing techniques employed to keep nodes part of the system. We explore how churn affects network performance, and how performance changes when node availability is varied. We first outline the routing techniques used in most overlay networks and then discuss the churn models utilized.

a. Iterative routing
Iterative routing requires a response from each node that receives the query. When no parallelism is employed, the requesting node must wait for a response from the next-hop node before sending another message. There are two variations of this routing mechanism [9]:

1) Normal Iterative
The number of messages required to find information is equal to 2N + 1, where N is the number of nodes in the lookup path, as illustrated in Figure 1(a) (see appendix).

2) Exhaustive Iterative
In exhaustive iteration, all nodes in the neighbourhood of the nearest finger node are queried until the best successor for the information is found. This requires far more messages, as illustrated in Figure 1(b) (see appendix).

b. Recursive routing
Recursive routing does not require a response message from nodes in a specific lookup path. Instead, queried nodes simply forward the request to the next-hop node. This suggests a level of independence among the nodes on the lookup path. Its variations are discussed below.

1) Source routing or symmetric
The response message from the destination node traverses the same path as the request message, as shown in Figure 1(c) (see appendix). Nodes therefore have to remember the node from which each request message was received.

2) Full or forward-only
Full recursive routing requires that the request and response messages each follow an independent path. The path taken by the response message is also obtained recursively, which may mean that the message travels the full length of the ring. This is shown in Figure 1(d) (see appendix).

3) Semi or direct response
The response message is sent directly to the initiating node. This implies that no routing decision needs to be made by other nodes on the overlay once the request arrives at the responding node, as shown in Figure 1(e) (see appendix). However, this is rarely possible in practice because of connectivity issues related to network address translation (NAT) traversal.

c. Churn models
Churn introduces dynamism in overlay networks. Nodes and terminals in real-life networks are free to join and leave at any point during operation. This is so in P2P networks over the internet and in mobile P2P networks. In analysing such churn, we must model it with either random or probabilistic functions. The individual lifetimes of nodes participating in an overlay network can be modelled using an array of functions and distributions, including random, Weibull [10]-[12], Pareto-II, and exponential distributions.

1) Random functions
Random functions impose no specific pattern of churn and can be used to add and remove nodes at any position in the network. The index ε of the churned node is drawn at random from the range 0 to N-1, where N is the total number of nodes. Simple computer-generated random functions thus determine the leaving and joining of nodes, bounded by the size of the network.

2) Probabilistic functions or distributions
Probabilistic functions, on the other hand, allow churn in an overlay network to be modelled using probability distribution functions (PDFs). These models give the probability that a node X is churned. Two PDFs dominate the current literature on churn in overlay networks, namely the shifted Pareto and the Weibull distributions.

3) Shifted Pareto distribution
The Pareto probability distribution function of the second kind [13], [14] is expressed in (1) and (2):

$$f(x) = \frac{\alpha}{b}\left(1 + \frac{x}{b}\right)^{-(\alpha+1)}, \quad x \ge 0 \tag{1}$$

or

$$f(x) = \frac{\alpha\, b^{\alpha}}{(x + b)^{\alpha+1}} \tag{2}$$

where α (the negative slope of the straight line) is called the Pareto coefficient and b is the scale parameter. The cumulative distribution function is given as:

$$F(x) = 1 - \left(1 + \frac{x}{b}\right)^{-\alpha} \tag{3}$$

The larger α is, the less unequal is the distribution. Of interest, as relates to churn in the overlay network, is the mean excess function s(x) of a node, which is given in (4) as:

$$s(x) = \frac{x + b}{\alpha - 1}, \quad \alpha > 1 \tag{4}$$

4) Weibull distribution
The Weibull probability distribution function is given as:

$$f(x) = \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1} e^{-(x/\lambda)^{k}}, \quad x \ge 0 \tag{5}$$

where k is the shape parameter and λ the scale factor, and x represents the value at which the function is to be evaluated [11], [15]. The cumulative distribution function is derived from (5) as:

$$F(x) = 1 - e^{-(x/\lambda)^{k}} \tag{6}$$

The average node lifetime l is given as:

$$l = \lambda\, \Gamma\!\left(1 + \frac{1}{k}\right) \tag{7}$$

where λ is the scale parameter, k is the shape parameter, and Γ is the gamma function evaluated at (1 + 1/k).

The study is structured as follows: Section 2 discusses the methodology and tools employed in comparing routing techniques under various conditions in this study. Multiple churn scenarios are detailed, while routing techniques used in the overlay are discussed in Section 3. Section 3 presents the results obtained in our simulations with brief discussions on the outcomes. Section 4 concludes the study.

RESEARCH METHOD
This section deals with simulating the Chord overlay network under varying conditions. OverSim, a discrete-event simulator for overlay networks, was chosen for this study. The OverSim framework depends on OMNeT++ [16]. A number of other studies of overlay networks have used the features and configurations of OverSim to achieve various purposes and results [17]-[19]. In previous studies with similar methods, one factor has been studied repeatedly: the lookup success ratio, given as the ratio of valid responses to valid requests made in an overlay application. In this study, we utilize another metric, the one-way delivery ratio. Other factors covered include latency (the time taken to stabilize the network plus the time taken to generate a response for a lookup), bandwidth consumption (the total traffic of messages on the overlay layer), mean node lifetime, and hop count per request.
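As a rough illustration of how the metrics named above could be computed from per-lookup records, the following minimal sketch aggregates a list of records into ratios and averages. The `LookupRecord` fields and the `summarize` helper are hypothetical and are not OverSim's output format; in particular, the one-way delivery ratio is assumed here to be the fraction of requests that reach the node responsible for the key, and latency is taken as the one-way request delay only.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class LookupRecord:
    """One lookup as it might be logged by a simulation run (hypothetical format)."""
    delivered: bool    # request reached the node responsible for the key
    answered: bool     # a valid response returned to the originator
    hops: int          # overlay hops taken by the request
    latency_s: float   # seconds between sending the request and its delivery
    bytes_sent: int    # overlay-layer bytes attributed to this lookup


def summarize(records):
    """Aggregate per-lookup records into the metrics named in the text."""
    if not records:
        raise ValueError("at least one record is required")
    delivered = [r for r in records if r.delivered]
    nan = float("nan")
    return {
        # assumption: delivered requests / all requests
        "one_way_delivery_ratio": len(delivered) / len(records),
        # valid responses / valid requests, as defined in the text
        "lookup_success_ratio": sum(r.answered for r in records) / len(records),
        "mean_hops": mean(r.hops for r in delivered) if delivered else nan,
        "mean_latency_s": mean(r.latency_s for r in delivered) if delivered else nan,
        "total_bytes": sum(r.bytes_sent for r in records),
    }
```

A request that is delivered but whose response is lost counts toward the one-way delivery ratio and not toward the lookup success ratio, which is why the two metrics can diverge under churn.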
The factors considered depend on the study concerned. Recent studies have concentrated on the use of overlay networks in mobile network systems with high dynamism (high churn rates) and limited bandwidth. Other studies have identified specific protocols and pending protocols as application fields to be explored, as in the case of the resource location and discovery protocol (RELOAD) [20] and internet protocol (IP) multicasting [21], [22]. In contrast to these studies, the focus here is on the on/off duration of nodes in the network (node availability).
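To make the on/off durations concrete, the sketch below samples node lifetimes from the two churn distributions discussed in the introduction and derives an availability figure. It is an illustration under stated assumptions rather than the simulator's churn generator: the Weibull mean is taken as scale × Γ(1 + 1/shape), the Pareto-II (Lomax) sample uses the inverse of F(x) = 1 - (1 + x/b)^(-α), and availability is assumed to be a = l / (l + d) for mean lifetime l and mean dead time d. All function names are hypothetical.

```python
import math
import random


def weibull_mean_lifetime(scale, shape):
    """Mean of a Weibull distribution: scale * Gamma(1 + 1/shape)."""
    return scale * math.gamma(1.0 + 1.0 / shape)


def sample_weibull_lifetime(rng, scale, shape):
    """Draw one lifetime; random.weibullvariate takes (scale, shape)."""
    return rng.weibullvariate(scale, shape)


def pareto2_mean_lifetime(alpha, b):
    """Mean of a Pareto-II distribution, defined only for alpha > 1."""
    if alpha <= 1:
        raise ValueError("mean is infinite for alpha <= 1")
    return b / (alpha - 1.0)


def sample_pareto2_lifetime(rng, alpha, b):
    """Inverse-CDF sample from F(x) = 1 - (1 + x/b)^(-alpha)."""
    u = rng.random()  # uniform in [0, 1)
    return b * ((1.0 - u) ** (-1.0 / alpha) - 1.0)


def availability(mean_lifetime, mean_deadtime):
    """Assumed definition: fraction of time a node is online."""
    return mean_lifetime / (mean_lifetime + mean_deadtime)


if __name__ == "__main__":
    rng = random.Random(42)
    print("Weibull mean lifetime:", weibull_mean_lifetime(1000.0, 0.5))
    print("Pareto-II mean lifetime:", pareto2_mean_lifetime(3.0, 2000.0))
    print("one Weibull draw:", sample_weibull_lifetime(rng, 1000.0, 0.5))
    print("one Pareto-II draw:", sample_pareto2_lifetime(rng, 3.0, 2000.0))
    print("availability for l = d:", availability(1000.0, 1000.0))
```

Under the assumed definition, equal mean lifetime and dead time give a = 0.5, and a ≥ 0.8 requires the mean lifetime to be at least four times the mean dead time.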

RESULTS AND DISCUSSION
The results in Figures 4-6 (see appendix) were taken for networks with node availability a = 0.5. The effects of varying the network parameter are not obvious in the no-churn and random churn scenarios. However, for the Lifetime and Pareto-II distributions, some patterns can be deduced.

One-way delivery ratio
Although the lookup success ratio is more widely used to measure overall performance, this study uses the one-way delivery ratio instead. As this study does not seek to recommend one overlay network over another, the one-way delivery ratio is sufficient for a direct comparison between routing algorithms. The one-way delivery ratio shows no marked improvement with an increase in the routing parameter r (successor list size) and, in fact, seems to degrade for some routing mechanisms. From Figures 5(a) and 6(a) (see appendix), the ratio improves marginally as r tends to 4. This improvement seems to plateau and then degrade slightly over the range 4 ≤ r ≤ 8.
Noticeable differences in performance are also observable between recursive and iterative routing algorithms in both the Pareto-II and Lifetime churn scenarios. Under no-churn conditions, recursive algorithms outperform iterative algorithms. Under the Pareto-II and Lifetime churn scenarios, iterative algorithms outperform recursive algorithms.

Bandwidth and maintenance message costs
The reduction in bandwidth consumption for r > 4 mirrors a similar reduction in the maintenance cost usually associated with keeping the routing tables of a distributed hash table (DHT) [23], [24]. These observations are shown in Figures 3(b), 3(e), 4(b), 4(e), 5(b), 5(e), 6(b) and 6(e) (see appendix). The reduction is also reflected in the average packets sent per node, as shown in Figures 3(f), 4(f), 5(f) and 6(f) (see appendix). The overall effect of increasing the successor list size is an improvement in bandwidth utilization in all network scenarios, as can be seen from Figures 4(b), 5(b), 6(b) and 8(b) (see appendix): the bandwidth consumed is significantly reduced for successor list sizes r > 4.

Average hop counts
The average hop count generally declines as the logarithmic ratio between the size of the network and the routing table size reduces. Its influence on routing performance, as reflected through the delivery ratio in Figures 3(d), 4(d), 5(d) and 6(d) (see appendix), may be interpreted as minimal. On close observation, the slight degradation in routing performance (one-way delivery ratio) set against the far steeper decline in average hop count implies that other factors weigh in far more heavily than the average hop count.
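The logarithmic dependence of hop count on network size can be illustrated with a small, self-contained model of a static Chord ring. This is a simplified sketch and not the OverSim configuration used in the study: there is no churn, no successor list, and no stabilization, identifiers are drawn at random from a 16-bit space, and each lookup follows the textbook greedy finger rule. The class and function names are hypothetical.

```python
import bisect
import math
import random


def in_half_open(x, a, b, size):
    """True if x lies in the ring interval (a, b] modulo size."""
    if a == b:  # a single-node ring owns every key
        return True
    return 0 < (x - a) % size <= (b - a) % size


def in_open(x, a, b, size):
    """True if x lies in the ring interval (a, b) modulo size."""
    if a == b:  # the interval covers the whole ring except a itself
        return x != a
    return 0 < (x - a) % size < (b - a) % size


class ChordRing:
    def __init__(self, node_ids, bits):
        self.bits = bits
        self.size = 1 << bits
        self.nodes = sorted(set(n % self.size for n in node_ids))
        # finger i of node n is the first node at or after n + 2^i on the ring
        self.fingers = {
            n: [self.successor_of((n + (1 << i)) % self.size) for i in range(bits)]
            for n in self.nodes
        }

    def successor_of(self, ident):
        """Node responsible for ident: first node id >= ident, wrapping around."""
        i = bisect.bisect_left(self.nodes, ident)
        return self.nodes[i % len(self.nodes)]

    def lookup(self, start, key):
        """Greedy finger routing from start; returns (responsible_node, hops)."""
        node, hops = start, 0
        while True:
            succ = self.fingers[node][0]
            if in_half_open(key, node, succ, self.size):
                return succ, hops + 1  # last hop delivers to the responsible node
            # forward to the highest finger that still precedes the key
            node = next(f for f in reversed(self.fingers[node])
                        if in_open(f, node, key, self.size))
            hops += 1


def mean_hops(n_nodes, bits=16, lookups=2000, seed=1):
    """Average hops over random (start node, key) pairs on a random ring."""
    rng = random.Random(seed)
    ring = ChordRing(rng.sample(range(1 << bits), n_nodes), bits)
    total = 0
    for _ in range(lookups):
        _, hops = ring.lookup(rng.choice(ring.nodes), rng.randrange(1 << bits))
        total += hops
    return total / lookups


if __name__ == "__main__":
    for n in (64, 256, 1024):
        print(n, "nodes: mean hops", round(mean_hops(n), 2),
              "| 0.5*log2(n) =", 0.5 * math.log2(n))
```

In this model the hop count never exceeds the number of identifier bits plus one, because each forwarding step at least halves the remaining distance to the key's predecessor. Successor lists, which the study varies, are deliberately left out, so the sketch shows only the baseline finger-table behaviour.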

Latency
Latency here measures the delay between a request leaving its originator and its arrival at the intended destination. Iterative schemes show a larger latency than recursive schemes. For the Lifetime and Pareto-II churn scenarios, latency reduces significantly with increased successor list sizes.
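One informal way to see why iterative schemes tend to be slower is to count the messages each routing mode needs for a single lookup. In the sketch below, only the iterative figure (2N + 1 for N nodes on the lookup path) comes from the text; the counts for the recursive variants are simplifying assumptions made for illustration: a symmetric response retraces the N request hops, a full recursive response takes an independent path of some given length, and a semi-recursive response is a single direct message. The function name and its parameters are hypothetical.

```python
def message_counts(n_path, n_return=None):
    """Messages per lookup for each routing mode.

    n_path   -- number of nodes on the lookup path (N in the text)
    n_return -- hops on the independent return path for full recursive
                routing; assumed equal to n_path when not given
    """
    if n_path < 1:
        raise ValueError("n_path must be at least 1")
    if n_return is None:
        n_return = n_path
    return {
        "iterative": 2 * n_path + 1,            # figure quoted in the text
        "recursive_source": 2 * n_path,         # assumption: response retraces the path
        "recursive_full": n_path + n_return,    # assumption: independent return path
        "recursive_semi": n_path + 1,           # assumption: one direct response
    }


if __name__ == "__main__":
    for n in (2, 4, 8):
        print(n, message_counts(n))
```

If every message costs roughly one overlay delay and messages are sent strictly in sequence, these counts translate directly into relative latency, with the semi-recursive mode needing the fewest messages. Real delays also depend on timeouts and retries under churn, which this count ignores.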

Node availability
In networks with significantly high availability, i.e. l_i >> d_i (mean lifetime much larger than mean dead time) with a ≥ 0.8, we see a significant gain in bandwidth consumption as well as an improvement in the lookup success ratio, as shown in Figures 5(b) and 6(b) (see appendix). When the network is characterized by lower availability, i.e. l_i << d_i, lookup performance also degrades. The hop counts required and the latency remain relatively unchanged even when availability is varied. Several studies on churn [24], [25] in the public domain use other metrics to evaluate overlay network performance.

CONCLUSION
The current study has covered only the effect of modifying the network parameter (successor list size) over the given scenarios. It has shown that as r → ∞ and the mean network size n → ∞, lookup performance l → 1 for a series of scenarios. This holds, however, as node availability a → 1, in which case a node's average lifetime grows larger than its average dead time and routing efficiency improves. Overall, semi-recursive routing gives the most efficient performance across all given scenarios. It offers bandwidth efficiency, reduced latency, lower hop counts, and steady lookup performance. However, it is expected to be difficult to implement over the internet because of NAT-related connectivity issues. Future studies will cover the effect of mean search delay on routing efficiency and lookup performance. Several studies on churn in overlay networks are already in the public domain, and this trend is expected to continue.