Error bounds for wireless localization in NLOS environments

An efficient and accurate method for evaluating the fundamental error bounds of wireless sensor localization is proposed. Although efficient tools such as the Cramér-Rao lower bound (CRLB) and the position error bound (PEB) already exist for estimating error limits, in their standard formulation they require accurate knowledge of the statistics of the ranging error. Under Non-Line-of-Sight (NLOS) conditions, this requirement cannot be met a priori. It is therefore shown that collecting a small number of samples from each link and applying them to a non-parametric estimator, such as the Gaussian kernel (GK), can yield a fairly accurate reconstruction of the error distribution. A proposed Edgeworth expansion (EE) method is then employed to reconstruct the error statistics far more efficiently than the GK. It is shown that, with this method, the resulting fundamental error bounds are almost as accurate as in the theoretical case, i.e. when a priori knowledge of the error distribution is available. In short, a technique is proposed for determining the fundamental error limits (CRLB and PEB) on-site, without prior knowledge of the statistics of the ranging errors.


INTRODUCTION
Recently, wireless sensor localization has been widely used for positioning and navigation, with applications in health, transport, environment and other commercial services [1,2,3,4]. Wireless sensor networks (WSNs) comprise numerous wirelessly connected sensors; as a result, sensor positioning has become an important problem. Equipping every sensor with a global positioning system (GPS) receiver is expensive, so relatively few sensors, called reference devices, carry GPS receivers, whereas the other sensors are blindfolded devices (nodes). Several methods have been proposed to estimate the positions of sensor nodes in a WSN, a problem known as node localization [5,6].
Obtaining a lower bound on the location error of every node is a basic and essential problem in WSN positioning. The most commonly used tool is the Cramér-Rao lower bound (CRLB) [7,8,9,10], which bounds the mean square error (i.e. the mean squared distance between the true and estimated node locations). It establishes the minimum root mean square error theoretically achievable with an unbiased estimator, and it is commonly used as a design tool, in the sense that it offers a benchmark against which estimation algorithms can be compared. Another popular tool is the position error bound (PEB) [11,12,13], which describes the region in which a node is located with a given confidence level. Both the CRLB and the PEB are obtained from the Fisher information matrix. Since both rely on knowledge of the distribution of the ranging error, which in turn depends on environmental and technological factors, obtaining their formulation a priori is almost impossible, especially in WSNs dominated by Non-Line-of-Sight (NLOS) links.
Journal homepage: http://ijece.iaescore.com/index.php/IJECE, ISSN: 2088-8708
There are two main methods for evaluating the distribution of ranging error measurements: the parametric method, which is used for specific, explicit distributions such as the Gaussian, exponential and Rayleigh distributions, and the non-parametric method, which is used for all other distributions without an explicit expression. A feasible solution is to approximate the statistics of the ranging errors on-site, by collecting ranging samples from each target-anchor link and then estimating the lower bound on the location errors even before target localization. One immediate application of on-site estimation of the error statistics is to inform cooperative localization algorithms about which nodes to cooperate with in order to reduce the cumulative localization error for any target.
For this purpose, the well-known maximum likelihood parametric approach fails, since in general there is no a priori knowledge of the error distribution. A truly non-parametric approach is therefore required; the kernel method, and in particular its Gaussian kernel (GK) realization, is valued for its capability to reconstruct empirical distributions from samples. Numerous works have addressed error analysis for wireless localization, with most efforts based on Line-of-Sight (LOS) conditions [14,15,16]; such analyses degrade severely in practice, because NLOS conditions are a more realistic assumption for wireless localization. Various localization algorithms and performance analyses for NLOS environments have been proposed [15,16,17,18]. The parametric, exponential-distribution-based CRLB model in [15] cannot be used with other parametric distributions to simulate NLOS ranging errors. The CRLB in [16] was derived for an NLOS environment using a single-reflection model, and cannot be used when most signals arrive at the receiver after multiple reflections. In [17], the CRLB was derived for the NLOS situation with and without NLOS statistics. For the case without NLOS statistics, the authors computed the CRLB in a mixed NLOS/LOS environment and proved that it depends only on the LOS signals, while for the case with NLOS statistics, the authors only provided a definition of the CRLB.
In this article, the GK method used to obtain the on-site statistics of the ranging errors is reproduced, and both the CRLB and the PEB are then rewritten, along with their performance analysis in various forms. Compared with previous performance studies for LOS and NLOS conditions, the contributions of this article are as follows:
a. A mathematical description of the system model and the standard error bounds is formulated, showing that the ranging model and the derived bounds are applicable to any distribution of ranging errors. For easy modelling of NLOS conditions, the Nakagami distribution model is used (Section 2).
b. A Gaussian kernel (GK) method is introduced to derive the statistical distribution of the errors, similar to [18], and a mathematical formulation of its lower bounds is obtained (Section 3). In addition, a newly proposed Edgeworth expansion (EE) method is introduced to derive the statistical distribution of the errors, and a mathematical formulation of its lower bounds is obtained (Section 4).
c. A thorough analysis of the CRLBs and PEBs for the GK and EE methods is presented, which supports the proposed EE method by showing that it comes very close to achieving the fundamental lower bound on the location error. Its greater efficiency is further demonstrated by the much smaller number of samples needed to reach the same level of accuracy as the GK technique (Section 5).

SYSTEM MODEL AND FORMULATION

System model
Consider a network of $N$ nodes in an $\eta$-dimensional Euclidean space, in which the blindfolded devices indexed $1, \dots, N_t$ have no knowledge of their location (henceforth targets), while the devices indexed $N_t + 1, \dots, N_t + N_a$ are anchors, i.e. reference devices whose locations are known a priori. For the sake of clarity, we hereafter consider the case $\eta = 2$, with the remark that the analysis to follow extends straightforwardly to $\eta > 2$.
The localization problem consists of estimating the locations of the target nodes, given the locations of the anchor nodes and a set of distance measurements between devices, which are typically affected by errors [8]. To elaborate, let the position of the $i$-th device be denoted by $(x_i, y_i)$, such that the coordinate vector of the targets to be estimated is
$$\theta = [x_1, \dots, x_{N_t}, y_1, \dots, y_{N_t}]^T.$$
It is well known that when two nodes are able to exchange information, they can estimate the distance between themselves, a process referred to as ranging. Ranging measurements are always affected by noise, and often they are not obtained over a LOS link between the nodes. In NLOS scenarios, an additional ranging error appears, referred to as the bias, in the form of a positive deviation from the true distance. Under these assumptions, the ranging model for a pair of devices $i$ and $j$ is
$$\tilde{d}_{ij} = d_{ij} + n_{ij} + b_{ij} = d_{ij} + v_{ij},$$
where $\tilde{d}_{ij}$ is the measured distance, $d_{ij}$ is the true distance, $n_{ij}$ is additive white Gaussian noise with mean $\mu = 0$ and variance $\sigma_{ij}^2$, $b_{ij}$ is the bias, and $v_{ij} = n_{ij} + b_{ij}$ is the residual noise, in which the noise and the bias are modelled jointly.
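As a minimal sketch of this ranging model, the snippet below draws measured distances as the true distance plus zero-mean Gaussian noise plus a positive bias. The function name `simulate_ranges` and the choice of an exponential bias are illustrative assumptions, not part of the article; any positive-valued bias distribution could be substituted.

```python
import math
import random


def simulate_ranges(target, anchor, sigma, bias_mean, num_samples, rng):
    """Draw measured distances d~ = d + n + b for one target-anchor link.

    n is zero-mean Gaussian noise with standard deviation sigma.
    b is a positive NLOS bias; an exponential law is ASSUMED here purely
    for illustration (bias_mean = 0 models a LOS link with no bias).
    """
    d_true = math.dist(target, anchor)
    samples = []
    for _ in range(num_samples):
        noise = rng.gauss(0.0, sigma)
        bias = rng.expovariate(1.0 / bias_mean) if bias_mean > 0 else 0.0
        samples.append(d_true + noise + bias)
    return d_true, samples


rng = random.Random(0)  # fixed seed so the example is repeatable
d_true, measured = simulate_ranges((2.0, 3.0), (0.0, 0.0), 0.1, 0.5, 2000, rng)

# Residual noise v = n + b: what remains after removing the true distance.
residual = [d - d_true for d in measured]
mean_residual = sum(residual) / len(residual)
```

In a real deployment the true distance is unknown, so the residuals would have to come from a calibration step; here they are available only because the link is simulated.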

Standard error bound formulations
Here, the Fisher information matrix (FIM) $J$ [9], the fundamental matrix from which both the CRLB and the PEB are obtained, is formulated, with the aim of introducing the notation and methods employed in Sections 3 and 4, where the Gaussian kernel (GK) [18] and the proposed Edgeworth expansion (EE) [19] error bounds are formulated and discussed.
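Let $\tilde{d}$ be the vector of range measurements (measured distances) $\tilde{d}_{kl}$. Let $\hat{\theta}$ be an estimate of the parameter vector $\theta$ and $E[\hat{\theta}]$ the expected value of $\hat{\theta}$. For an unbiased estimator, the CRLB relates the covariance of $\hat{\theta}$ to the Fisher information matrix $J$ [9] as
$$E\left[(\hat{\theta} - \theta)(\hat{\theta} - \theta)^T\right] \succeq J^{-1}.$$
The Fisher information matrix $J$ is accordingly given as
$$J = E\left[\nabla_{\theta}\, l(\tilde{d} \mid \theta)\, \nabla_{\theta}\, l(\tilde{d} \mid \theta)^T\right].$$
The log of the joint conditional probability density function (PDF) is
$$l(\tilde{d} \mid \theta) = \sum_{(k,l)} l_{kl}, \quad (6)$$
where the sum runs over the pairs of nodes that share a ranging link and
$$l_{kl} = \ln f_{v_{kl}}(\tilde{d}_{kl} - d_{kl}). \quad (7)$$
With (6), the FIM is then written as [14]
$$J = \begin{bmatrix} J_{xx} & J_{xy} \\ J_{xy}^T & J_{yy} \end{bmatrix}, \quad (8)$$
where the entries of $J_{xx}$, $J_{xy}$ and $J_{yy}$ are formed from the partial derivatives of $l_{kl}$ with respect to the coordinates $x_k$, $y_k$, $x_l$ and $y_l$, and $k, l = 1, \dots, n$ index the blindfolded (target) nodes. $J_{xx}$, $J_{yy}$ and $J_{xy}$ are of size $n \times n$, and $J$ is of size $2n \times 2n$.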
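To make the link between the FIM and the CRLB concrete, the sketch below treats the simplest case of a single target ranging to fixed anchors, with no target-to-target links. It assumes the additive model (measured distance = true distance + residual noise), for which each link contributes $c\,u u^T$ to the FIM, where $u$ is the unit vector from the anchor to the target and $c = E[(\partial \ln f_v / \partial v)^2]$ is a per-link coefficient that depends only on the error distribution ($c = 1/\sigma^2$ for zero-mean Gaussian noise). The function names are hypothetical.

```python
import math


def fim_single_target(target, anchors, coeffs):
    """Entries (Jxx, Jxy, Jyy) of the 2x2 FIM, J = sum_j c_j * u_j u_j^T."""
    jxx = jxy = jyy = 0.0
    for (ax, ay), c in zip(anchors, coeffs):
        dx, dy = target[0] - ax, target[1] - ay
        d = math.hypot(dx, dy)
        ux, uy = dx / d, dy / d  # unit vector from anchor to target
        jxx += c * ux * ux
        jxy += c * ux * uy
        jyy += c * uy * uy
    return jxx, jxy, jyy


def crlb_single_target(target, anchors, coeffs):
    """Trace of J^{-1}: lower bound on the mean square position error."""
    jxx, jxy, jyy = fim_single_target(target, anchors, coeffs)
    det = jxx * jyy - jxy * jxy
    return (jxx + jyy) / det  # trace of the inverse of a symmetric 2x2 matrix


# Illustrative geometry: three anchors and Gaussian noise with sigma = 0.5 m.
anchors = [(0.0, 0.0), (10.0, 0.0), (5.0, 10.0)]
sigma = 0.5
bound = crlb_single_target((5.0, 4.0), anchors, [1 / sigma**2] * 3)
```

Swapping the error model only changes the list of coefficients, which mirrors the way the bounds for different error distributions in this article differ from each other only through a coefficient.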

Modeling range measurements
The statistics of the measured distances between nodes have been modelled with the Nakagami distribution (ND), following the most widely recognised propagation models for mobile and wireless communication in the literature [21,22]. The Nakagami distribution was originally selected to fit empirical data and is known to provide a closer match to most measurement data than the Gaussian, Rayleigh or Rician distributions. Beyond its empirical justification, the Nakagami distribution is often used for the following reasons. First, it can model environmental conditions that are either more or less severe than Rayleigh fading: when the Nakagami shape factor is 1, the distribution becomes the Rayleigh distribution, and when the shape factor is 1/2, it becomes a one-sided Gaussian distribution. Second, the Rice distribution can be closely approximated using the closed-form relationship between the Rice factor and the Nakagami shape factor. On the basis of the empirical data and the work in [21], the Nakagami distribution was chosen to model the NLOS conditions for ranging measurements.
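The PDF of the residual noise $v_{ij}$, used to evaluate the performance of both the Gaussian kernel and the Edgeworth expansion methods, is therefore
$$f_{v_{ij}}(v_{ij}) = \frac{2\, m_{ij}^{m_{ij}}}{\Gamma(m_{ij})\, \Omega_{ij}^{m_{ij}}}\, v_{ij}^{2 m_{ij} - 1} \exp\left(-\frac{m_{ij}}{\Omega_{ij}} v_{ij}^2\right), \quad v_{ij} \ge 0, \quad (9)$$
where $m_{ij}$ and $\Omega_{ij}$ are the shape and spread parameters of the Nakagami distribution and $\Gamma(\cdot)$ is the gamma function.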
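The following sketch evaluates the standard Nakagami PDF and draws Nakagami samples using only the Python standard library. It relies on the known fact that the square root of a Gamma variable with shape $m$ and scale $\Omega/m$ is Nakagami distributed; the function names are hypothetical.

```python
import math
import random


def nakagami_pdf(v, m, omega):
    """Nakagami PDF with shape m >= 1/2 and spread omega > 0 (zero for v < 0)."""
    if v < 0:
        return 0.0
    coeff = 2.0 * m**m / (math.gamma(m) * omega**m)
    return coeff * v ** (2.0 * m - 1.0) * math.exp(-m * v * v / omega)


def nakagami_sample(m, omega, rng):
    """sqrt(G) is Nakagami(m, omega) when G ~ Gamma(shape=m, scale=omega/m)."""
    return math.sqrt(rng.gammavariate(m, omega / m))


rng = random.Random(1)  # fixed seed so the example is repeatable
draws = [nakagami_sample(1.0, 1.0, rng) for _ in range(20000)]

# The spread parameter omega equals the second moment E[v^2].
second_moment = sum(x * x for x in draws) / len(draws)
```

With $m = 1$ the PDF reduces to the Rayleigh density $2v\,e^{-v^2}$ (for $\Omega = 1$), and with $m = 1/2$ it reduces to a one-sided Gaussian, matching the two special cases mentioned in the text.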

Bounds derivation using Nakagami distributions
Given the PDF of the ranging model, it is now possible to derive a new formula for the FIM.
Taking the natural logarithm of (9) and substituting the result into $\partial l_{kl}/\partial x_k$, $\partial l_{kl}/\partial y_k$, $\partial l_{kl}/\partial x_l$ and $\partial l_{kl}/\partial y_l$ yields the partial derivatives of $l_{kl}$ and, from them, the elements of the FIM in (11).
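As a hedged sketch of this step (a reconstruction from the standard Nakagami PDF and the additive ranging model, not the article's own typeset equations), assume $v_{kl} = \tilde{d}_{kl} - d_{kl}$ with $d_{kl} = \sqrt{(x_k - x_l)^2 + (y_k - y_l)^2}$. The log-likelihood of one link and its partial derivatives then follow from the chain rule:

```latex
\begin{align}
l_{kl} &= \ln\frac{2\, m_{kl}^{m_{kl}}}{\Gamma(m_{kl})\, \Omega_{kl}^{m_{kl}}}
          + (2 m_{kl} - 1)\ln v_{kl} - \frac{m_{kl}}{\Omega_{kl}}\, v_{kl}^2, \\
\frac{\partial l_{kl}}{\partial v_{kl}} &= \frac{2 m_{kl} - 1}{v_{kl}} - \frac{2 m_{kl}\, v_{kl}}{\Omega_{kl}}, \\
\frac{\partial l_{kl}}{\partial x_k} &= -\frac{\partial l_{kl}}{\partial v_{kl}}\, \frac{x_k - x_l}{d_{kl}},
\qquad
\frac{\partial l_{kl}}{\partial y_k} = -\frac{\partial l_{kl}}{\partial v_{kl}}\, \frac{y_k - y_l}{d_{kl}}, \\
\frac{\partial l_{kl}}{\partial x_l} &= -\frac{\partial l_{kl}}{\partial x_k},
\qquad
\frac{\partial l_{kl}}{\partial y_l} = -\frac{\partial l_{kl}}{\partial y_k}.
\end{align}
```

The minus signs arise because $\partial v_{kl}/\partial x_k = -\partial d_{kl}/\partial x_k$. The geometric factors $(x_k - x_l)/d_{kl}$ and $(y_k - y_l)/d_{kl}$ do not depend on the error distribution, so only the factor $\partial l_{kl}/\partial v_{kl}$ changes when a different PDF is used for the residual noise.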

ERROR ESTIMATION VIA GAUSSIAN KERNEL
In [18], the error bound formulation was obtained by using the Gaussian kernel (GK) method to model the PDF of the positive deviation, i.e. the bias $b_{ij}$. That step ensured that the white Gaussian noise $n_{ij}$ and the positive bias $b_{ij}$ in the ranging errors were modelled independently. In this article, the work in [18] is modified and improved: the noise and the bias are modelled jointly as the residual noise $v_{ij}$, since it is well known in the literature that LOS noise cannot be separated from NLOS bias in a wireless environment.

Error distribution reconstruction
The PDF of the residual noise $v_{ij}$ is obtained from samples of ranging measurements. The true distribution is estimated by a sum of kernels (exponentially decaying functions) centred on the collected ranging samples; the efficiency and accuracy of the estimate depend on the total number of collected samples $P$. Let $S_{v_{ij}q}$ denote the $q$-th sample over the link between the $i$-th and $j$-th nodes. The non-parametric Gaussian kernel technique estimates the PDF of the residual noise as
$$\hat{f}_{v_{ij}}(v_{ij}) = \frac{1}{\sqrt{2\pi}\, P\, h_{ij}} \sum_{q=1}^{P} \exp\left(-\frac{(v_{ij} - S_{v_{ij}q})^2}{2 h_{ij}^2}\right), \quad (12)$$
where $\exp(\cdot)$ is the Gaussian kernel function and the smoothing constant $h_{ij}$ is the width of this kernel, given as $h_{ij} = 1.06\, \sigma_s P^{-1/5}$ ($\sigma_s$ is the sample standard deviation of the residual noise).
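The Gaussian kernel estimate described above can be sketched in a few lines. The bandwidth rule $1.06\,\sigma_s P^{-1/5}$ is taken from the text; the function names and the tiny sample list are illustrative assumptions.

```python
import math
import statistics


def gk_bandwidth(samples):
    """Smoothing constant h = 1.06 * sigma_s * P^(-1/5)."""
    return 1.06 * statistics.stdev(samples) * len(samples) ** (-0.2)


def gk_pdf(v, samples, h=None):
    """Gaussian kernel estimate of the PDF at v from the ranging samples."""
    if h is None:
        h = gk_bandwidth(samples)
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * len(samples) * h)
    return norm * sum(math.exp(-0.5 * ((v - s) / h) ** 2) for s in samples)


# Illustrative residual-noise samples for one link (made-up values).
link_samples = [0.2, 0.5, 0.6, 0.9, 1.3, 1.4, 2.1]
density_at_one = gk_pdf(1.0, link_samples)
```

Each evaluation costs one pass over all $P$ samples, so both the storage and the per-point cost grow linearly with the number of samples kept for a link.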

Bounds derivation using Gaussian kernel
Following the same approach as in Subsection 2.4, the natural logarithm of (12) can be substituted into $\partial l_{kl}/\partial x_k$, $\partial l_{kl}/\partial y_k$, $\partial l_{kl}/\partial x_l$ and $\partial l_{kl}/\partial y_l$ to obtain the corresponding partial derivatives. The elements of the Fisher information are similar to (11) except for the coefficient in (15), where $kl \in \mathrm{LOS}$ and $kl \in \mathrm{NLOS}$ denote the propagation conditions between nodes $k$ and $l$.

ERROR ESTIMATION VIA EDGEWORTH EXPANSION
The ranging error approximation technique presented in the previous section, though robust, is constrained by the large number of samples required to approximate the distribution with fair accuracy. In the following, we introduce a more efficient and general method, based on the Edgeworth expansion, with two main advantages: a much smaller number of samples is required for the approximation, and the additive Gaussian noise and the positive bias can be modelled jointly. While the benefit of reducing the number of samples cannot be overemphasized, it is essential to state that, in wireless channels, the positive bias and the Gaussian ranging errors cannot be separated from each other. In this section, we first describe the process of reconstructing the ranging error distribution from samples; the convergence and monotonicity of the sample moments are then shown, demonstrating a clear improvement in accuracy with respect to the Gaussian kernel technique; finally, the proposed formulations of the PEB and CRLB are presented.

Error distribution reconstruction
The Edgeworth expansion (EE), which refines the central limit theorem (CLT), is a true asymptotic expansion of the PDF of the standardized variable $\tilde{x} = (x - \mu)/\sigma$ in powers of the standard deviation $\sigma$. As an asymptotic expansion, the EE can be truncated after a finite number of terms while still providing an accurate approximation of the function, with a controlled approximation error [19].
As a non-parametric approximator, the EE can be used to estimate the PDF of given ranging errors from their sample moments $\alpha_w$ [19]. The EE is given as
$$f(x) = N(\mu, \sigma^2)\left[1 + \sum_{s=1}^{\infty} \sigma^s \sum_{\{k_w\}} He_{s+2r}(\tilde{x}) \prod_{w=1}^{s} \frac{1}{k_w!}\left(\frac{S_{w+2}}{(w+2)!}\right)^{k_w}\right],$$
where $N(\mu, \sigma^2)$ is the PDF of a normal distribution with mean $\mu$ and variance $\sigma^2$, $S_{w+2} = \kappa_{w+2}/\kappa_2^{w+1}$, and $\kappa_w$ are the cumulants obtained from the sample moments $\alpha_w$. The set $\{k_w\}$ consists of all non-negative (positive and zero) integer solutions of the Diophantine equations $s = k_1 + 2k_2 + \dots + s k_s$ and $r = k_1 + k_2 + \dots + k_s$. The Chebyshev-Hermite polynomial $He_n(x)$ is
$$He_n(x) = (-1)^n e^{x^2/2} \frac{d^n}{dx^n} e^{-x^2/2},$$
and the mean and variance of the ranging errors are $\mu = \alpha_1$ and $\sigma^2 = \kappa_2$, respectively [20,21,22].
The sample moments of the ranging errors are $\alpha_w = \frac{1}{n}\sum_{i=1}^{n} X_i^w$, where $X_i$ are the ranging errors and $w = 1, 2, 3, \dots$ is the order of the moment. To determine the number of moment orders $\alpha_w$ required to approximate a given sample, the standard error $s$ of the samples is calculated using $\sigma_s^2/\sqrt{P}$, where $s$ must be $\le 0.3$ for each order $w$. The Edgeworth expansion is used to model the residual noise $v_{ij}$; hence, the estimated PDF of the residual noise, $f_{v_{ij}}(v_{ij})$, is given by (19), obtained by applying the EE to the samples of $v_{ij}$.
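The moment-to-density pipeline described above can be sketched as follows. The code computes raw sample moments, converts them to the first four cumulants with the standard moment-cumulant relations, and evaluates an Edgeworth series truncated after the $\sigma^2$ terms (those involving $He_3$, $He_4$ and $He_6$). The truncation order and the function names are assumptions made for illustration; the article selects the number of moment orders with its standard-error criterion, which is not reproduced here.

```python
import math


def sample_moments(xs, max_order=4):
    """Raw sample moments alpha_w = (1/n) * sum(x_i^w) for w = 1..max_order."""
    n = len(xs)
    return [sum(x**w for x in xs) / n for w in range(1, max_order + 1)]


def cumulants_from_moments(a):
    """First four cumulants from the raw moments a = [a1, a2, a3, a4]."""
    a1, a2, a3, a4 = a
    k1 = a1
    k2 = a2 - a1**2
    k3 = a3 - 3 * a2 * a1 + 2 * a1**3
    k4 = a4 - 4 * a3 * a1 - 3 * a2**2 + 12 * a2 * a1**2 - 6 * a1**4
    return k1, k2, k3, k4


def edgeworth_pdf(x, k1, k2, k3, k4):
    """Edgeworth series truncated after the sigma^2 terms (ASSUMED order)."""
    sigma = math.sqrt(k2)
    z = (x - k1) / sigma
    he3 = z**3 - 3 * z
    he4 = z**4 - 6 * z**2 + 3
    he6 = z**6 - 15 * z**4 + 45 * z**2 - 15
    s3 = k3 / k2**2  # S_3 = kappa_3 / kappa_2^2
    s4 = k4 / k2**3  # S_4 = kappa_4 / kappa_2^3
    correction = 1 + sigma * s3 / 6 * he3 + sigma**2 * (s4 / 24 * he4 + s3**2 / 72 * he6)
    gauss = math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))
    return gauss * correction


# Illustrative residual-noise samples for one link (made-up values).
errors = [0.2, 0.5, 0.6, 0.9, 1.3, 1.4, 2.1]
kappa = cumulants_from_moments(sample_moments(errors))
density_at_one = edgeworth_pdf(1.0, *kappa)
```

Only a handful of cumulants per link need to be stored, rather than every sample. Note that a truncated series is not guaranteed to be non-negative in the tails, so any code that takes its logarithm should guard against that.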

Efficiency and convergence of sample moments of the EE method
To illustrate the effectiveness and efficiency of the Edgeworth method, it is necessary to demonstrate the convergence of its sample moments as the number of samples $P$ increases [23,24], and then to compare it with the Gaussian kernel. Using Nakagami-distributed random variables, as seen in Figure 1, the true moments $\gamma_w$ of an ND ($m = 1$, $\Omega = 1$) are compared with the sample moments $\alpha_w$ [23] for different numbers of samples and moment orders $w = 1, \dots, 4$. The deviation $\hat{e}_w$ of the sample moments from the true moments is obtained as $|(\alpha_w - \gamma_w)/\gamma_w|$. The monotonicity of the sample moments is seen in Figure 2. The Kullback-Leibler divergence (KLD) of the PDFs obtained with the two methods (GK and EE) from the true (theoretical) Nakagami PDF is shown in Figure 3. For KLD = 0.01, the proposed EE requires fewer than 300 samples, while the GK needs approximately 500 samples to obtain similar results. This gap widens for lower levels of divergence: to reach KLD = 0.0075, the GK method needs approximately twice as many samples as the EE method. As a result, the Edgeworth method is a good choice for approximating the distribution of ranging errors from samples.
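The two comparison metrics used here, the relative deviation of the sample moments and the Kullback-Leibler divergence, can be sketched as below. The closed form for the true Nakagami moments is a standard result; the midpoint-rule integration, the handling of non-positive densities, and the function names are illustrative assumptions and not the article's procedure.

```python
import math


def nakagami_true_moment(w, m, omega):
    """E[X^w] = Gamma(m + w/2) / Gamma(m) * (omega / m)^(w/2)."""
    return math.gamma(m + w / 2) / math.gamma(m) * (omega / m) ** (w / 2)


def relative_deviation(alpha_w, gamma_w):
    """Deviation |(alpha_w - gamma_w) / gamma_w| of a sample moment."""
    return abs((alpha_w - gamma_w) / gamma_w)


def kl_divergence(p, q, lo, hi, steps=2000):
    """Midpoint-rule estimate of the integral of p * ln(p / q) over [lo, hi].

    ASSUMPTION: grid points where either density is non-positive are
    skipped, which matters for estimates that can dip below zero.
    """
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * width
        px, qx = p(x), q(x)
        if px > 0.0 and qx > 0.0:
            total += px * math.log(px / qx) * width
    return total


# True moments of the Nakagami distribution with m = 1 and omega = 1.
true_moments = [nakagami_true_moment(w, 1.0, 1.0) for w in range(1, 5)]
```

To reproduce a curve like the ones discussed, `p` would be the theoretical Nakagami PDF and `q` the reconstructed PDF for a given number of samples, with the divergence recomputed as the sample count grows.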

Bounds derivation using Edgeworth expansion
From the formulation of the approximated PDF of the residual error in (19), the construction of the Fisher information is the same as for the GK, and yields the corresponding partial derivatives of $l_{kl}$. The elements of the Fisher information are similar to those of the GK, except for the coefficient.

PERFORMANCE EVALUATION
Building on the theoretical analyses presented in this paper, which indicate that the EE method can approximate the statistics of ranging errors (using the Nakagami distribution) with fewer samples and more accuracy than the GK, we now consider real network topologies to further illustrate the performance of both methods. A region of 10 m × 10 m is employed, in which three ($a_n = 3$) anchors are placed to form a triangle and three ($n = 3$) blindfolded devices (targets), not connected to each other, are randomly placed within the convex hull of the anchors. The two error bounds, the CRLB and the PEB, are used to evaluate the performance of the two estimators, EE and GK. The average CRLB $\varepsilon$ for any network topology can be computed from the FIM, while the PEB can be illustrated by the 95% confidence interval, $C_i = 0.95$, whose mathematical formulation is shown below. The Fisher ellipse parameters of the $i$-th target $\theta_i$ are estimated from the covariance matrix $\Omega_{\theta_i}$, which combines the error variances $\sigma_{i:x}^2$ and $\sigma_{i:y}^2$ in the $x$ and $y$ dimensions, respectively, and the cross-term $\sigma_{i:xy}$, given as
$$\Omega_{\theta_i} = \begin{bmatrix} \sigma_{i:x}^2 & \sigma_{i:xy} \\ \sigma_{i:xy} & \sigma_{i:y}^2 \end{bmatrix}.$$
The directions of the scattering in the space for the vector $\theta_i$ are known to be directly proportional to the eigenvalues associated with $\Omega_{\theta_i}$, up to a factor of $\kappa_i$ [25,11,12]. In particular, the axis lengths of the ellipse that describes this scattering are $2\sqrt{\kappa_i \lambda_{i:1}}$ and $2\sqrt{\kappa_i \lambda_{i:2}}$, where
$$\lambda_{i:1} = \frac{1}{2}\left[\sigma_{i:x}^2 + \sigma_{i:y}^2 + \sqrt{(\sigma_{i:x}^2 - \sigma_{i:y}^2)^2 + 4\sigma_{i:xy}^2}\right], \quad \lambda_{i:2} = \frac{1}{2}\left[\sigma_{i:x}^2 + \sigma_{i:y}^2 - \sqrt{(\sigma_{i:x}^2 - \sigma_{i:y}^2)^2 + 4\sigma_{i:xy}^2}\right]. \quad (25)$$
If $\sigma_{i:y} > \sigma_{i:x}$, then the order of $\lambda_{i:1}$ and $\lambda_{i:2}$ in (25) is swapped.
The proportionality factor $\kappa_i$ is related to the confidence interval $C_i$ with which the target $\theta_i$ is enclosed in the ellipse, as
$$\kappa_i = -2\ln(1 - C_i).$$
It follows that the Fisher ellipse for the $i$-th target $\theta_i$, centred at $(p_{i:x}, p_{i:y})$, is described as in [25] by
$$\frac{[(x - p_{i:x})\cos\gamma_i + (y - p_{i:y})\sin\gamma_i]^2}{\kappa_i \lambda_{i:1}} + \frac{[(x - p_{i:x})\sin\gamma_i - (y - p_{i:y})\cos\gamma_i]^2}{\kappa_i \lambda_{i:2}} = 1,$$
where the rotation angle $\gamma_i$ describes the offset between the principal axis of the ellipse and the reference axis, and is defined as
$$\gamma_i = \frac{1}{2}\arctan\left(\frac{2\sigma_{i:xy}}{\sigma_{i:x}^2 - \sigma_{i:y}^2}\right).$$
To compare both estimators, a line representing the theoretical CRLB, i.e. the CRLB computed with perfect knowledge of the statistics of the propagation channel, has been added to the plots. The CRLB reconstructed with the EE is much closer to the theoretical one than that of the GK: the EE performs better for any number of samples. From the derived CRLBs, the minimum number of samples $P$ required to obtain accurate results is analysed. As shown in Figure 3, the non-parametric estimators converge to the true PDF given a sufficient sample size; therefore, the estimated CRLBs converge quickly to a stable value as $P$ increases.
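The Fisher ellipse construction used in this section (eigenvalues of the covariance matrix, the confidence factor $\kappa_i$, and the rotation angle $\gamma_i$) can be sketched as follows. The relation $\kappa = -2\ln(1 - C)$ is the standard result for a two-dimensional Gaussian error; the function name and the use of `atan2` to resolve the quadrant of the angle are illustrative choices rather than the article's implementation.

```python
import math


def fisher_ellipse(var_x, var_y, cov_xy, confidence=0.95):
    """Return (major_axis, minor_axis, angle) of the confidence ellipse.

    var_x, var_y are the error variances and cov_xy the cross-term of the
    2x2 covariance matrix; angle is that of the major axis, in radians.
    """
    kappa = -2.0 * math.log(1.0 - confidence)
    root = math.sqrt((var_x - var_y) ** 2 + 4.0 * cov_xy**2)
    lam1 = 0.5 * (var_x + var_y + root)            # larger eigenvalue
    lam2 = max(0.5 * (var_x + var_y - root), 0.0)  # guard against rounding
    # atan2 picks the correct quadrant, so no explicit swap of the
    # eigenvalues is needed when var_y > var_x.
    angle = 0.5 * math.atan2(2.0 * cov_xy, var_x - var_y)
    return 2.0 * math.sqrt(kappa * lam1), 2.0 * math.sqrt(kappa * lam2), angle


# Illustrative covariance (made-up values, in square metres).
major, minor, angle = fisher_ellipse(0.30, 0.10, 0.05)
area = math.pi * (major / 2.0) * (minor / 2.0)
```

The area computed in the last line is the kind of quantity that can be compared between a theoretical ellipse and a reconstructed one.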
Furthermore, the two estimators are represented by their respective Fisher ellipses (theoretical and reconstructed with the two methods) for $P = 50$ samples, as seen in Figure 5. The ellipses reconstructed with the EE estimator almost perfectly match the theoretical ones, differing only in the axis orientation. As the number of samples increases to $P = 250$, the Fisher ellipses of the two estimators have an axis orientation that matches, or almost matches, that of the theoretical PEB, with the EE method much closer than the GK method. To better capture the differences between the two estimators with respect to the theoretical PEB, Figure 6 depicts, as a function of the number of samples, the inner product $\mathrm{PEB}_\Delta$, where $\cdot$ denotes the inner product, $A$ is the area of the theoretical Fisher ellipse and $\hat{A}$ is the area of the ellipse reconstructed with the EE or GK method. From the discussion in this section, it can be seen that the Edgeworth expansion method performs far better than the Gaussian kernel method, which implies that the approximated Fisher ellipses of the Edgeworth expansion are much closer in size and orientation to the theoretical Fisher ellipses.

CONCLUSIONS
This article focuses on the error analysis of approximating and reconstructing the statistics of ranging measurements without a priori knowledge of the wireless channel. A popular non-parametric estimator, the Gaussian kernel, was first described and used to approximate and reconstruct the error distributions from samples, and the corresponding error bound was derived. Furthermore, an Edgeworth expansion method was proposed to reconstruct the error distribution statistics from samples of the ranging measurements by exploiting the efficiency and effectiveness of moment convergence. This approach was shown to be valid for Non-Line-of-Sight conditions, where it is impossible to estimate the statistics of the ranging errors a priori. The results showed that the Edgeworth expansion technique is far more efficient and accurate than the Gaussian kernel method, requiring a smaller sample size to reach a similar level of accuracy.