Bio-inspired route estimation in cognitive radio networks

Received Mar 11, 2019; Revised Nov 30, 2019; Accepted Dec 10, 2019

Cognitive radio is a technique that was originally created to make proper use of the radio-electric spectrum, given its underuse. Several methods have been used to predict network traffic in order to determine the occupancy of the spectrum and then use the ‘holes’ between the transmissions of primary users. The goal is to guarantee a complete transmission for the secondary user without interrupting the transmission of primary users. This study addresses the multifractal generation of traffic for a specific radio-electric spectrum as well as a bio-inspired route estimation for secondary users. It uses the MFHW algorithm to generate multifractal traces and two bio-inspired algorithms, Ant Colony Optimization and Max Feeding, to calculate the secondary user’s path. Multifractal characteristics offer a prediction whose error is below 10% in comparison with the original traffic values, and a complete transmission for secondary users. In fact, a hybrid strategy combining both bio-inspired algorithms promises a reduction in handoff. The purpose of this research is to motivate future investigation into the generation of multifractal traffic and spectrum mobility using bio-inspired algorithms.


INTRODUCTION
The implementation of new technologies in Cognitive Radio Networks (CRN) should require little computational complexity, since the detection, decision, division and mobility tasks should not take more than fractions of a second. A spectrum handoff occurs when a Primary User (PU) requests service in a channel that is already occupied by a Secondary User (SU). The SU must then leave this channel and look for an available one. This process continues until the SU finishes its transmission. A spectrum handoff has a negative impact on the performance of secondary users in terms of delay and link maintenance. Hence, the priority is to reduce handoffs in the system [1]. A CRN is a system that evaluates the transmission medium, analyzes the transmission parameters and makes decisions in a dynamic time-frequency space. Based on the allocation and management of resources, it aims to improve the use of the electromagnetic radio spectrum [2]. Therefore, a CRN should be smart and able to learn from its interaction with the RF environment. Consequently, the learning process is a crucial component that can be tackled from various areas of knowledge such as artificial intelligence, machine learning, evolutionary algorithms or robust control methods [2].
The management of the spectrum under a CRN involves four stages that explain the interaction between PUs and SUs in terms of occupying and sharing the spectrum [3]. Spectrum detection is the process in which the SUs look for available bands, capture their information and detect gaps in the spectrum. The spectrum decision-making process assigns a channel by considering the spectrum availability and the allocation policies. Spectrum division coordinates the allocation of spectral spaces and prevents users from colliding or overlapping in sections of the spectrum when multiple SUs wish to access it. Spectrum mobility is the process of moving secondary users towards available areas of the spectrum when a PU requests access [4]. The open problems in the study of CRN have been divided along the same lines: spectrum division, spectrum decision, spectrum mobility and spectrum detection. Spectrum detection is a critical aspect of spectrum inference applications, since they aim to explore inactive 'holes'. The emerging paradigms of spectrum detection lie in the time, frequency and spatial domains. Inference techniques are widely used to identify as many empty channels as possible and to improve detection performance. They also reduce energy consumption and the time that it takes to switch between channels. Other aspects under study include centralized allocation of the spectrum, decentralized channel selection, adaptation of the physical layer and dynamic access to the spectrum.
Spectrum mobility in CRNs has an ambiguous interpretation. On one side, the concept refers to the spectral transfer from one band to another due to the appearance of PUs or to interference evasion, a field widely studied in prediction. On the other side, the mobility of Cognitive Radio (CR) users and PUs (as seen, for instance, in vehicular CRNs) can also affect the surrounding spectrum environment by imposing additional interference or by changing the channel conditions and the spectrum availability. A field that is not heavily studied and has few contributions is spectrum exchange, which is understood in different ways in the literature. The concept of spectrum sharing is interchangeable with dynamic access theories, which consist of three paradigms of spectrum use: underlay mode, overlay mode and interweave mode. This perception of spectrum exchange gives it a meaning that is too broad to cover all aspects of CRNs. Furthermore, spectrum exchange focuses on the underlay mode, which allows CRNs to operate simultaneously in the same band. In this method, flexible thresholds can be established between bands. The inference of spectrum in the division task is mainly associated with the support and mediation between the CRN and the PUs [5]. Figure 1 displays a distribution of inference-related studies in each CRN architecture.

Figure 1. Division of prediction open problems in the CRNs
In March 2012, data from the radio-electric spectrum were collected in the city of Bogotá, Colombia, using a spectrum analyzer that provides traffic detection based on the power of the signals. Consequently, the gathered information indicates whether signals are present or absent during the sampling period. The captured data lie in the GSM, Wi-Fi and 1850 MHz to 2000 MHz bands [6]. The objective of this paper is to establish a prediction of the radio-electric spectrum using a multifractal algorithm. Afterwards, based on the predicted traffic, a path for secondary users is estimated so that they can transmit without interruption and without interfering with primary users. The estimated routes are calculated using bio-inspired algorithms. The remaining sections of this article are organized as follows. Section 2 explains the mathematical foundation needed for the construction of the predictive model and the route estimation. In Section 3, the results obtained for the prediction of Wi-Fi traffic and the route estimation are discussed. Lastly, the conclusions on the overall work are presented in Section 4.

Multifractal prediction
The traditional tool for the modeling and analysis of network traffic is the classical Poisson traffic model, which was later modified and adapted for the study of queuing systems [7]. However, measured traffic began to exhibit behaviors that differed from those expected under the Poisson/Markov models. The data collected in the Bellcore labs paved the way for the study of scale-invariant characteristics of LAN traffic [8]. Traditional time series models proved insufficient to model the self-similarity present in the traffic, and the analysis of such processes called for new techniques. Some models based on monofractal processes were proposed to describe this traffic [9]. A more recent analysis of the measured data has revealed the existence of multifractal scaling behavior [10,11]. A process $X(t)$ is said to have local scaling properties with a local scaling exponent $\alpha(t)$ if the process behaves like $\Delta X(t) \sim (\Delta t)^{\alpha(t)}$ as $\Delta t \to 0$. For a monofractal process, the scaling exponent is $\alpha(t) = H$ for all times, while the term multifractal denotes processes that show a non-constant scaling parameter $\alpha(t)$. The local Hölder exponent is given by $\alpha(t)$ [5]. If the analysis is carried out in the scale domain using a wavelet transform, the coefficients containing the main information of the signal $X(t)$ can be estimated using the Discrete Wavelet Transform (DWT) $d_X(j,k)$. See (1) and (2) [12]. A fundamental feature of the continuous wavelet transform $T_X(a,b)$ is its redundancy, since neighboring coefficients share some common information regarding $X$. In fact, the DWT is introduced for $j$ scales in order to reduce the redundancy of the inner product between the signal and the set of dilations ($a$) and shifts ($b$) of a mother wavelet $\Psi(\cdot)$. The variance of the process $d_X(j,k)$ can be estimated as $\mu_j$, since the most important characteristics follow the scaling behavior of the original signal. See (3) and (4) [13].
The Hurst parameter $H$ can be estimated by calculating the linear regression slope of $y_j$ against $j$. This representation is called the Log-scale Diagram (LD), as shown in (5).
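The Log-scale Diagram estimate can be illustrated with a minimal sketch. It assumes a Haar wavelet, takes $y_j$ as the base-2 logarithm of the mean squared detail coefficient at scale $j$, and uses the convention slope $= 2H - 1$ for a stationary long-range dependent series. Equations (1) to (5) are not reproduced in this text, so the function names, the unweighted regression and the slope convention are assumptions rather than the authors' exact estimator.

```python
import math
import random


def haar_details(x, levels):
    """Return Haar detail coefficients d_X(j, k) for scales j = 1..levels."""
    approx = list(x)
    details = []
    for _ in range(levels):
        pairs = len(approx) // 2
        # Orthonormal Haar step: differences give details, sums give the next approximation.
        d = [(approx[2 * i] - approx[2 * i + 1]) / math.sqrt(2) for i in range(pairs)]
        approx = [(approx[2 * i] + approx[2 * i + 1]) / math.sqrt(2) for i in range(pairs)]
        details.append(d)
    return details


def log_scale_diagram(x, levels):
    """Return (j, y_j) points, with y_j = log2 of the mean squared detail at scale j."""
    points = []
    for j, d in enumerate(haar_details(x, levels), start=1):
        mu_j = sum(c * c for c in d) / len(d)
        points.append((j, math.log2(mu_j)))
    return points


def regression_slope(points):
    """Ordinary least-squares slope of y against j."""
    n = len(points)
    mean_j = sum(j for j, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    num = sum((j - mean_j) * (y - mean_y) for j, y in points)
    den = sum((j - mean_j) ** 2 for j, _ in points)
    return num / den


def estimate_hurst(x, levels=8):
    # Assumed convention for a stationary long-range dependent series: slope = 2H - 1.
    return (regression_slope(log_scale_diagram(x, levels)) + 1) / 2


random.seed(1)
white_noise = [random.gauss(0.0, 1.0) for _ in range(2 ** 14)]
h_estimate = estimate_hurst(white_noise)  # uncorrelated noise should give a value near 0.5
```

The series must contain at least $2^{levels}$ samples so that every scale has coefficients to average.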
The estimation of $H$ is useful to study second-order statistics ($q = 2$) in stochastic processes. However, the wavelet transform can be used for both higher and lower order statistics in the real domain ($q \in \mathbb{R}$). At this point, the extension of $\mu_j$ to $\mu_j^{q}$ and the estimator of $\mu_j^{q}$ are considered. See (6) and (7) [14].
A monofractal process can be described by $H(q) = H \;\forall q$, indicating that the Hurst parameter is the same for all statistical orders. In contrast, when $H(q)$ decreases as the statistical order rises, the process is multifractal. The linear Multiscale Diagram (MD) plots the singularity exponent $H(q)$ of $q$th order versus $q$, thereby illustrating the behavior of $H$ for different values of $q$. The Multifractal Spectrum (MS) plots $H(q)$ versus the singularity dimension $D(q)$. These two variables represent the linear transformation from scales to statistical moments. Therefore, the function that maps the sampling scales to the corresponding statistical moments is non-linear [15]. $D(q)$ can be estimated by using the mass exponent of order $q$, designated as $\tau(q)$. See (8) and (9) [16].
The MS of a pure monofractal process is a single point in space, whereas for a multifractal time series it is a concave curve pointing towards the x-axis. The shape of the MS can be approximated by a second-order polynomial function, and its width can be measured from the zero crossings of said function at $D(q) = 0$. This width is referred to as the Multifractal Spectrum Width (MSW).
In [17], the authors proposed a model for traffic generation using a Conservative Binomial Cascade (CBC). Called the Multifractal Wavelet Model (MWM), it is based on the Haar wavelet transform, using the structure of (10),
$$W_{j,k} = A_{j,k}\,U_{j,k} \qquad (10)$$
where $A_{j,k}$ is a random variable whose values remain in the $[-1,1]$ interval. To ensure that $|W_{j,k}| \le U_{j,k}$, the scaling coefficients $U_{j,k}$ have to be positive and the multipliers $A_{j,k}$ symmetric about zero. Moreover, the multipliers $B_{j,k}$, with $A_{j,k} = 2B_{j,k} - 1$, are identically distributed random parameters within the $[0,1]$ interval and symmetric about 0.5. The relationship between the MWM and the cascades arises from using the Haar transform as a multiplicative cascade coefficient. Said coefficient is established as a random variable with an average value of 0.5 and a $[0,1]$ range. Hence, positive data generation with multifractal characteristics is assured. The MWM starts with the initial value $U_{0,0}$, which is distributed over two intervals based on $B_{j,k}$ and $1 - B_{j,k}$, where $B_{j,k}$ is a random number with a Beta ($\beta$) distribution. Then, these values are each split into two on the third scale, with a different $B_{j,k}$ for each couple. This process is repeated until the $n$th scale is reached, where there are $2^n$ intervals holding fractions of the initial $U_{0,0}$, resulting in a CBC. Therefore, according to (10), the construction of the CBC is given by (11).
The authors in [12] proposed the Multifractal-Hurst (MFH) algorithm to generate traces with positive data and long-range dependency (LRD). The MFW obeys a power law, hence its fractality, which implies adjusting the Hurst parameter and its average value.
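The cascade construction described above can be sketched in the mass domain as follows. Each interval passes a fraction $B$ of its mass to the left child and $1 - B$ to the right child, so the total is conserved and all values stay non-negative. The function name and the single Beta shape parameter `shape` are hypothetical; the calibration of the Beta parameters to a target Hurst parameter, as in (11) to (13), is not reproduced here.

```python
import random


def binomial_cascade(total, scales, shape, rng=None):
    """Split `total` over 2**scales intervals using symmetric Beta multipliers."""
    rng = rng or random.Random()
    masses = [float(total)]
    for _ in range(scales):
        next_masses = []
        for mass in masses:
            # Beta(shape, shape) is symmetric about 0.5 and takes values in [0, 1].
            b = rng.betavariate(shape, shape)
            next_masses.append(mass * b)          # left child receives the fraction b
            next_masses.append(mass * (1.0 - b))  # right child receives the remainder
        masses = next_masses
    return masses


# Illustrative values only: 10 scales give a trace of 2**10 = 1024 samples.
trace = binomial_cascade(total=1000.0, scales=10, shape=5.0, rng=random.Random(7))
```

A larger `shape` concentrates the multipliers around 0.5 and yields a smoother trace, while a smaller value produces burstier traffic.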
The wavelet coefficient multipliers are given by the Beta distribution, as proposed in [17], for all scales of the CBC. See (12). By adjusting $k_h$ to the desired value of $H$ and equating its value to the parameter $P$ in the multiplicative cascade, a multifractal trace of length $2^n$ is obtained for a given Hurst parameter $H$ and its corresponding average. Finally, the MFH validates the $H$ parameter of the trace using the LD. If the estimated $H$ is not close enough to the target to comply with the confidence values, the trace is dismissed and a new one is created. This process is repeated until $H$ remains within the confidence values. Therefore, the construction of the Conservative Binomial Cascade is given by (11) and (13).
In [18], the authors proposed the Multifractal Hurst Spectrum Width (MFHW) algorithm to generate positive multifractal traffic, with the values of the mean, $H$ and the spectrum width (SW) supplied by the user. Unlike the MFH method, which uses a single $k_h$, the new model distributes two values of $k_h$ across the scales of the CBC. The main goal was to adjust the width of the multifractal spectrum at the last stage of the CBC with $k_h = k_{hw}$, which can modify the multifractal spectrum. The value of $k_{hw}$ is computed with a series of experimental curves that associate the distribution of $H$ with the scales, as stated in (14). The estimation of $k_h$ is determined by $H_h$, which is the Hurst parameter of the final trace. The evaluation of $k_{hw}$ is provided by $H_{hw}(j_s, W_s)$, a linear function relating the scale where the cascade starts to change ($j_s$) and the desired multifractal SW ($W_s$). Therefore, the construction of the CBC is given by (15), together with (16), (17) and (18), where $B1_{j,k}$ and $B2_{j,k}$ are random numbers with a Beta distribution. The algorithm that describes the MFHW is shown in Figure 2.
During each transmission, some channels have gaps, indicating the end of a communication until a new one begins. These time-frequency spaces can be used to access the spectrum dynamically.
This dynamic spectrum access (DSA) is then used by secondary users who need to initiate a transmission. Each time a user jumps to another channel, the handoff count increases, as does the energy spent on hops. Consequently, the optimization process involves minimizing the handoff function expressed in (19), where $ho(t)$ represents the number of handoffs detected in a route located by the Evolutionary Algorithm (EA). The transmission will not show discontinuities and will guarantee a complete transfer of the service.
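As a small illustration of the quantity being minimized, the sketch below counts the handoffs of a candidate route and selects the feasible route with the fewest hops. It assumes a route is a list holding one channel index per time instant and that the availability matrix stores 1 for a free cell and 0 for a busy one. These representations and the function names are hypothetical and are not taken from (19).

```python
def count_handoffs(route):
    """Number of channel changes along a route (one channel index per time instant)."""
    return sum(1 for prev, cur in zip(route, route[1:]) if prev != cur)


def is_feasible(route, availability):
    """True when every visited (time, channel) cell is free, i.e. availability[t][c] == 1."""
    return all(availability[t][c] == 1 for t, c in enumerate(route))


def best_route(routes, availability):
    """Feasible route with the fewest handoffs, or None when no route is feasible."""
    feasible = [r for r in routes if is_feasible(r, availability)]
    return min(feasible, key=count_handoffs) if feasible else None


# Rows are time instants, columns are channels (illustrative data).
availability = [
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 1],
    [0, 1, 1],
]
candidates = [[0, 0, 0, 1], [1, 2, 1, 2], [0, 1, 1, 1]]
chosen = best_route(candidates, availability)
```

In this example the third candidate is discarded because channel 1 is busy at the second time instant, and the first candidate wins with a single handoff.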

Ant colony optimization
Through the use of pheromone trails, ants establish a communication strategy with each other to find the path that leads towards food [19]. As more ants follow a certain path, a greater amount of pheromone is deposited on the trail. In other words, the probability of an ant choosing a specific path increases with the number of ants that have crossed it. The collective behaviour of ants can be characterized as a positive feedback process, i.e., a reinforcement-based process that converges fast as long as there is no limitation in the environment in terms of seeking a solution [15]. Each ant is identified as an agent that leaves a signal on the walked path, influencing the decisions of subsequent agents. Therefore, the set of ants does not converge to a single solution and instead converges to a subspace of solutions from which the best one is chosen.
The Ant Colony Optimization (ACO) method is based on stochastic processes and is inspired by the social behaviour of ants. It can solve complex optimization problems [20], which makes it suitable for CRN. Characteristics such as parallel computing, self-organization and positive feedback are inherent to the ant colony method, allowing multi-agent optimization to obtain a global solution while reducing computation times and complexity [21]. The first bio-inspired algorithm considered is the Ant Colony System (ACS), which is based on the behaviour of ants following pheromone traces that lead to food. In this research, the ACS is used to find routes for the transmission of continuous data. For this model, the objective function ($f_o$) is represented as shown in (20), where $p(ij \mid s_p^k)$ is the probability of channel transition. It follows that the transition probability between paths is maximized when the number of ants that have walked on a path is maximum.
The first step in the implementation process is the description of the pheromones spread by the ants when the entire packet of length $l_p$ has been successfully transmitted. The pheromone values are updated by all ants once a specific route is completed. The pheromone update $P_u(c,t)$ for channel $c$ and time $t$ is expressed as shown in (21) for $m$ ants with an evaporation rate $\rho \in (0,1)$.

(21)
$\Delta\tau_{ij}^{k}$ represents the ants' attraction to continue on the same channel or hop to another one and is expressed in (22), where $w_{ho}$ represents the weight assigned when the ant jumps, $i$ is the channel position at time $t$, and $j$ is the channel position at time $t + 1$. In the construction of possible solutions, the ants in the ACS cross from time $t$ to time $t + l_p$, making probabilistic decisions in each time jump.
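The evaporate-and-deposit cycle can be sketched as below. Because (21) and (22) are not reproduced in this text, the sketch uses the generic ACO form, in which every cell keeps a fraction $1 - \rho$ of its pheromone and each ant then reinforces the cells of its route. The deposit rule $1 / (1 + w_{ho} \cdot \text{handoffs})$ is an assumption chosen only so that routes with fewer jumps are rewarded; it is not the authors' expression.

```python
def update_pheromone(pheromone, routes, rho, w_ho):
    """Evaporate all cells, then let each ant reinforce the cells of its route.

    pheromone[t][c] is the pheromone level of channel c at time t, and each
    route lists one channel index per time instant.
    """
    # Evaporation: every cell keeps a fraction (1 - rho) of its pheromone.
    updated = [[(1.0 - rho) * value for value in row] for row in pheromone]
    for route in routes:
        handoffs = sum(1 for a, b in zip(route, route[1:]) if a != b)
        # Assumed deposit rule: routes with fewer handoffs leave more pheromone.
        deposit = 1.0 / (1.0 + w_ho * handoffs)
        for t, channel in enumerate(route):
            updated[t][channel] += deposit
    return updated


# Two ants over three time instants and two channels (illustrative data).
initial = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
after_update = update_pheromone(initial, [[0, 0, 0], [0, 1, 1]], rho=0.5, w_ho=1.0)
```

The function returns a new matrix and leaves its input unchanged, which keeps successive iterations easy to compare.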
The second step consists of identifying the transition probability $p(ij)$ of the $k$th ant that moves from channel $i$ at time $t$ to channel $j$ at time $t + 1$. It is given by (23) [22], where $(ij)$ denotes the set of components that do not yet belong to the partial solution $s$ of ant $k$, and $\alpha$ and $\beta$ are parameters that control the relative importance of the pheromone against the heuristic information $\eta_{ij} = 1/d_{ij}$. Here $d_{ij}$ is the length between channels $i$ and $j$ at times $t$ and $t + 1$ respectively. The length is determined by the change in channel. If the ant remains on the same channel, the length is $D$. If the ant jumps to a different channel, the length has a value of $10D$, as shown in (24). Finally, a greedy selection of the next channel is proposed based on a pseudo-random variable $q$, a random number between 0 and 1. It is compared to $q_0$ ($0 < q_0 < 1$), a constant used to establish the balance between exploitation and exploration. As $q_0$ decreases, the ant chooses the next channel depending on the levels of pheromones, which is the exploitation alternative. As $q_0$ increases, the ant randomly takes an alternative route, which is seen as an adventurous exploration. The exploration tactic also allows the ants to escape local minimums, as shown in (25). Therefore, $q_0$ is associated with risk tolerance, and values close to 1 suggest a higher tolerance of the implied risk. Moreover, exploration is appropriate at the beginning of the search and exploitation at the end [22]. If the search needs to be biased, the probability distribution used to draw $q$ can be modified.
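A hedged sketch of the channel selection step follows. The transition weights use the usual ACO form $\tau^{\alpha}\eta^{\beta}$ with $\eta = 1/d$, and the lengths $D$ and $10D$ follow the description of (24). Since (23) and (25) are not reproduced here, the decision rule is an assumption that mirrors the prose: the ant exploits (takes the most attractive channel) when $q > q_0$ and explores (samples from the transition probabilities) otherwise. All function names are hypothetical.

```python
import random


def hop_length(current, candidate, d):
    """Length D when the ant stays on its channel, 10D when it changes channel."""
    return d if candidate == current else 10 * d


def transition_probabilities(current, pheromone_next, free_next, alpha, beta, d):
    """Probability of moving to each channel at t + 1 (zero for busy channels)."""
    weights = []
    for channel, tau in enumerate(pheromone_next):
        if free_next[channel]:
            eta = 1.0 / hop_length(current, channel, d)  # heuristic information
            weights.append((tau ** alpha) * (eta ** beta))
        else:
            weights.append(0.0)
    total = sum(weights)
    return [w / total for w in weights] if total > 0 else weights


def choose_channel(current, pheromone_next, free_next, alpha, beta, d, q0, rng):
    """Return the next channel index, or None when no channel is free at t + 1."""
    probs = transition_probabilities(current, pheromone_next, free_next, alpha, beta, d)
    if sum(probs) == 0:
        return None
    q = rng.random()
    if q > q0:
        # Exploitation (assumed rule): take the most attractive channel.
        return max(range(len(probs)), key=probs.__getitem__)
    # Exploration (assumed rule): sample a channel from the transition probabilities.
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


probs = transition_probabilities(0, [1.0, 1.0, 1.0], [True, True, False],
                                 alpha=1.0, beta=1.0, d=2)
```

With equal pheromone on every channel, staying on the current channel is ten times more likely than hopping, which reflects the penalty that the $10D$ length places on handoffs.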

Max feeding optimization
The second bio-inspired algorithm considered is called Max Feeding (MaF) and is based on the amount of energy that an insect requires for its own nutrition. Although the algorithm does not include the random component seen in conventional evolutionary algorithms, it does have an adaptation element for familiar scenarios. The basic idea of this algorithm is to achieve maximum exploitation of the simulation scenario. The objective function ($f_o$) of this model is represented in (26) and is expressed in terms of the energy used by the $k$th ant.
The first stage is the transformation of the scene into the pheromone traces within the nest. The nest is composed of all channels at all times. The traces of pheromones are assigned in proportion to the number of free spaces within the nest. Therefore, the greater the number of time instants during which a channel is available, the stronger the trail of pheromones left in the nest. As a consequence, the insect is led by the attraction of a specific trail. The allocation of weights for pheromones is static at first, since it is based on counting busy and idle channels. Secondly, a dynamic weight allocation takes place in which the pheromones evaporate as the available timeslots decrease. Initially, the pheromones have high weight values according to the number of available timeslots, and said values decay until they reach 0. For the calculation of pheromones, it is essential to have complete knowledge of the studied scenario. Algorithm 1 describes the calculation of the pheromone path matrix.
The next stage involves finding the best route based on the pheromone-instilled paths. An agent is created for this task, which in this case is an insect in charge of locating the strongest trace of pheromone in channel $m$ at time $t_1$. The insect then moves to the channel with the longest pheromone trail, taking into account the length of the transmission packet ($l_p$). The transmitted packet ($P_t$) and the amount of elapsed time ($t_i$) in the channel are subtracted. Guided by the pheromone, the insect transmits the packet until all traces disappear and then selects a new channel once the pheromone level reaches 1. Therefore, it has enough time to change from channel $m$ to channel $n$ and guarantee the continuity of the service. Algorithm 2 describes the steps followed by the bug to determine the route to explore.
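One possible reading of the pheromone path matrix is sketched below. Algorithm 1 is not reproduced in this text, so the sketch assumes that the pheromone of a cell equals the number of consecutive free timeslots remaining on that channel from that time onward, and 0 when the cell is busy. This matches the description of values that start high and decay to 0, but the function name and the exact weighting are assumptions.

```python
def pheromone_matrix(availability):
    """pheromone[t][c] = consecutive free slots on channel c starting at time t.

    availability[t][c] is 1 when channel c is free at time t and 0 when busy.
    """
    n_times = len(availability)
    n_channels = len(availability[0]) if n_times else 0
    pheromone = [[0] * n_channels for _ in range(n_times)]
    # Sweep backwards in time so each cell can reuse the count of the next slot.
    for t in range(n_times - 1, -1, -1):
        for c in range(n_channels):
            if availability[t][c] == 1:
                following = pheromone[t + 1][c] if t + 1 < n_times else 0
                pheromone[t][c] = following + 1
    return pheromone


# Rows are time instants, columns are channels (illustrative data).
nest = [
    [1, 0],
    [1, 1],
    [0, 1],
    [1, 1],
]
trails = pheromone_matrix(nest)
```

Under this reading a value of 1 marks the last free slot of a gap, which is the point at which the insect must prepare to change channel.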

Algorithm 2: Route exploration by bugs
Input: Pheromone_Matrix, Availability_Matrix
Output: Times_Array, Channels_Array
Initialization
Increase handoff by one;
Times_Array(i) ← max(Pheromone_Matrix(t_i, all columns));
if Times_Array(i) = 0 then
    No more routes;

The insect repeats the same process until the current transmission is completed or until it can no longer find a new transmission route. Afterwards, the availability matrix is updated with Algorithm 1 and another insect can find a different route (solution). This process is repeated k times, by k insects, to find k routes until there is discontinuity in the transmission. When one of the insects meets its purpose (i.e., transmits a set amount of data), the nest is updated, thus removing that route from the food-searching task carried out by the kth insect. Since every insect travels the same distance from beginning to end, the only discriminating parameter is the number of jumps made by the insect to avoid interrupting the transmission. Every time the insect jumps, its energy consumption increases, which means that the insect that consumes the least amount of energy is the one that finds the best route.
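The route exploration loop can be sketched as follows, under stated assumptions. The pheromone of a cell is taken as the number of consecutive free slots remaining on that channel, the insect stays on its channel while the trail lasts, and it hops to the channel with the strongest trail once the current one ends. After each successful insect the visited cells are marked as busy so that the next insect finds a different route. The missing steps of Algorithm 2 are not recoverable from this text, so the function names and the tie-breaking rule (lowest channel index) are hypothetical.

```python
def pheromone_matrix(availability):
    """pheromone[t][c] = consecutive free slots on channel c starting at time t."""
    n_times, n_channels = len(availability), len(availability[0])
    pheromone = [[0] * n_channels for _ in range(n_times)]
    for t in range(n_times - 1, -1, -1):
        for c in range(n_channels):
            if availability[t][c] == 1:
                pheromone[t][c] = 1 + (pheromone[t + 1][c] if t + 1 < n_times else 0)
    return pheromone


def explore_route(availability, start_time, packet_length):
    """Greedy walk of one insect; returns (route, handoffs) or (None, handoffs)."""
    if start_time + packet_length > len(availability):
        raise ValueError("packet does not fit in the availability matrix")
    pheromone = pheromone_matrix(availability)
    n_channels = len(availability[0])
    route, handoffs, channel = [], 0, None
    for t in range(start_time, start_time + packet_length):
        if channel is None or pheromone[t][channel] == 0:
            # The current trail has ended: look for the strongest trail at time t.
            best = max(range(n_channels), key=lambda c: pheromone[t][c])
            if pheromone[t][best] == 0:
                return None, handoffs  # no free channel, the transmission breaks
            if channel is not None:
                handoffs += 1
            channel = best
        route.append(channel)
    return route, handoffs


def find_routes(availability, start_time, packet_length, k):
    """Let up to k insects explore in turn, removing each found route from the nest."""
    nest = [row[:] for row in availability]  # work on a copy of the scenario
    routes = []
    for _ in range(k):
        route, handoffs = explore_route(nest, start_time, packet_length)
        if route is None:
            break
        routes.append((route, handoffs))
        for offset, channel in enumerate(route):
            nest[start_time + offset][channel] = 0  # mark the visited cells as busy
    return routes


scenario = [[1, 1, 1], [1, 1, 0], [0, 1, 1], [0, 1, 1]]
found = find_routes(scenario, start_time=0, packet_length=4, k=3)
```

In this small scenario the first insect stays on channel 1 with no handoff, the second needs one hop from channel 0 to channel 2, and the third finds no complete route.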

RESULTS AND DISCUSSION
After implementing the MFHW algorithm, the generated traffic approximates the characteristics of the original traffic. Therefore, the first step before moving on to said generation is the estimation of the mean, the Hurst parameter and the width of the multifractal spectrum. The LD, MD and MS are established for each channel in the spectrum, and the input values are estimated for the generated traffic. Based on the multifractal series generated for each channel, three availability matrices were created with 100 time instants and 100 consecutive channels. In Figure 3, the three predicted availability matrices are marked in red, the original availability matrix for 100 time instants from channel 1 up to channel 300 is marked in blue, and the failed prediction time is marked in black. The prediction errors do not surpass 34%. Although the percentage error for the availability matrices is high, this is compensated by low errors in the prediction of the Hurst parameter (4.0547%) and of the width of the multifractal spectrum (8.8583%).
Under the predicted scenarios, bio-inspired algorithms are used to calculate routes that minimize handoff. In a simulation scenario consisting of 461 channels, routes are sought for a transmission of 70 time instants. For the Max Feeding (MaF) algorithm, the results obtained by the method offer an advantage since it has configuration variables. The algorithm receives the simulation scenario as an input parameter and then finds the best routes, thus reducing the number of possible jumps in each iteration. The algorithm boosts performance by dividing the total number of channels, which optimizes both the search and the jumps carried out among relatively contiguous frequencies. Figure 4 shows a fraction of the results.
In this example, the first division of the channels delivered 19 routes, from which the minimum handoff was 5, the maximum handoff was 19 and the average handoff for the first division was 8.94. When running the MaF algorithm over all channels, 266 different routes were obtained and 10 of them are shown in Figure 4.
For the execution of the ACO (Ant Colony Optimization) algorithm over the 461 predicted channels, the ants' follow-up parameters must be set. Parameter configuration is based on previous simulations in which the effects of varying α, β and qo on handoff reduction were considered. The final configuration is: 300 iterations, 3 ants per division, 30 continuous simulation channels, D=2, =60, α=0.05, β=1 and qo=0.05. In Figure 5, the final paths of the first five ants are visualized. The division of channels into fragments seeks to eliminate local minimums. 266 routes were found with the MaF algorithm and 46 with the ACO algorithm. It is worth mentioning that the number of routes determined by the ACO method is proportional to the number of ants located in each channel division. The ant optimizes previous routes in each iteration, which means that there are n routes at the end of the nth iteration. Hence, the number of routes is equal to the number of iterations times the number of ants times the number of channel divisions. In the current scenario, three ants were placed in each channel division, 300 iterations were made and the 461 channels were divided into 15 sections, leading to 13500 routes, of which 46 routes present the lowest handoffs. To compare the results of the routes computed for all channels, they are displayed in a normalized histogram in Figure 6. The statistics are shown in red for the MaF algorithm and in blue for the Ant Colony method.

Figure 6. Probability distribution of routes handoff found by EA

Figure 6 gathers the percentages of the number of insects for a specific number of handoffs. The ACO algorithm produced the highest number of routes with a handoff between 10 and 12, which is equivalent to 39% of the calculated routes.
In contrast, the MaF algorithm produced most of its routes with 7 to 8 handoffs, which is equivalent to 21% of the calculated routes. The routes found by the ACO method are spread within the 5-17 handoff range and those calculated by the MaF algorithm remain within the 5-40 range. Figure 7 shows the distribution function of the number of insects for certain bandwidths within the calculated routes.
The SNR (signal-to-noise ratio) is estimated with the help of the power matrix, which gives the signal-to-noise ratio for each channel at every time instant [23][24][25]. The estimation for each route corresponds to the sum of the powers at each step taken by the insect over time, divided by the length of the transmission packet, which is 70 in this case. The ACO algorithm has the highest number of routes with an SNR between 6925 and 6975, which represents a 20% share. The strongest presence of the SNR for the MaF algorithm lies between 6850 and 6975, with a 40% share and a probability distribution function with a kernel bandwidth of 33.98. Figure 8 shows the probability function for the SNR results of both algorithms.
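The per-route estimate described above reduces to an average over the visited cells, as in this minimal sketch. It assumes that `power[t][c]` holds the value of the power matrix for channel `c` at time `t` and that a route lists one channel index per time instant. The function name and the data layout are hypothetical.

```python
def route_snr(power, route, start_time=0):
    """Sum of power[t][c] over the cells visited by the route, divided by its length."""
    total = sum(power[start_time + offset][channel]
                for offset, channel in enumerate(route))
    return total / len(route)


# Three time instants and two channels (illustrative values, arbitrary units).
power = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
snr = route_snr(power, [0, 1, 1])  # (1.0 + 4.0 + 6.0) / 3
```

For the scenario in the text, each route would contain 70 entries, so the divisor equals the packet length.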

CONCLUSION
The methods proposed in this research involve new mechanisms for the prediction and occupation of the spectrum by secondary users. The application of these methods is focused on the decision-making and mobility processes of the spectrum within cognitive radio. The generation of radio-electric traffic is provided by an algorithm based on multifractal traffic. The extension of the MFHW algorithm offered a certainty exceeding 90% in the multifractal behavior of the new incoming traffic. Parameters such as the Hurst exponent, the mean and the width of the multifractal spectrum showed estimation errors below 10% compared to the original values. In the comparative assessment of the bio-inspired algorithms discussed here, the lowest handoff is delivered by the MaF algorithm, while its dispersion of routes is the highest, with a handoff that varies between 5 and 40 for a transmission of 70 time units. The ACO strategy has lower variability in the number of handoffs, within the 4-18 range for the same transmission length. The efficiency of the Max Feeding method lies in harnessing previous knowledge of the environment, which guarantees lower execution times and higher robustness in the task of searching for routes. In unknown environments to which the algorithm must adapt, the ant colony strategy is suggested, since its inherent resettable partial randomness leads to the same efficiency under scenarios with unknown behavior. For cognitive radio networks, two components are sought in the discussed algorithms: robustness and adaptability. In conclusion, a hybrid solution combining these two alternatives can deliver more suitable results for the routing process of secondary users in cognitive radio networks.