Fingerprint positioning of user devices in long term evolution cellular networks using the K nearest neighbour algorithm

Received Apr 25, 2020; Revised Jul 8, 2020; Accepted Aug 16, 2020

The rapid growth of wireless technologies and the need for public safety have led to increasing demand for location-based services. Terrestrial cellular networks can offer position estimates that meet the statutory safety requirements set by the Federal Communications Commission (FCC) for network-based positioning. In this study, a radio frequency pattern matching (RFPM) method is implemented and tested to determine a user's location. The RFPM method has been tested and validated in two different environments. The evaluations show good results, especially in the micro cell scenario, where the positioning error is 15 m at 67% and 31.78 m at 90%; in the macro cell scenario the error is 75.66 m at 67% and 141.4 m at 90%.


INTRODUCTION
Positioning has been considered an optional feature in the standardization, implementation, and exploitation of existing cellular networks. Nevertheless, the large cellular communication infrastructure deployed around the world can still be reused for localization, which adds value to network management and services [1]. In principle, location-based services have been driven by two main demands: commercial applications and emergency services. Commercial users require accurate and responsive location performance for purposes such as location-based advertising and maps. In emergency services, the most significant driver is the FCC's E911 mandate in the USA, which requires the location of emergency callers to be delivered within specified accuracy limits. To meet the FCC requirements, several positioning techniques have been developed [1,2].
The FCC positioning accuracy requirements are 50 m (67%) and 150 m (90%) for terminal-based systems, and 100 m (67%) and 300 m (90%) for network-based systems [1,2]. Satellite technology has been shown to meet these requirements, delivering very good location estimates in open environments. In urban and indoor environments, however, positioning performance can be very poor because satellite signals are blocked by buildings and suffer from multipath propagation. Positioning with satellite systems in these environments is often impossible [3,4]. Given the requirements for public safety, positioning technology is a vital matter to be tackled. Several systems for location determination in LTE networks have been developed to determine a user's position under specified environments. This underlines the need to search for the best available technique for determining the location of a user.
With the emergence of LTE, new efforts focus on enabling E911 and location-based services (LBS) on these 4G systems by offering a seamless transition between 2G/3G and LTE positioning services. Cellular networks can provide acceptable position estimates that satisfy the FCC requirements. In particular, with the emergence of the 3rd generation partnership project (3GPP) long term evolution (LTE), they can achieve good accuracy while providing excellent coverage, particularly in urban and indoor environments [4].
Several cellular network positioning techniques are considered in Release 9 of the 3GPP specifications, such as observed time difference of arrival (OTDOA) and enhanced cell ID (E-CID). According to the literature, e.g. [5,6], the highest accuracy is provided by the global positioning system (GPS), especially in line-of-sight and free-space environments. However, the performance of GPS is significantly reduced in dense urban and indoor environments.
Cell identification (CID) positioning, a network-based method, can also be employed to estimate the location of a mobile station, although with very limited accuracy. In the simplest case, the position of the mobile station is estimated to be the location of the base station [7]. Positioning performance can be enhanced by measuring specific network characteristics, a technique known as enhanced cell ID (E-CID). A cellular network can use the angle of arrival (AoA) of a signal from the mobile station to obtain directional information; the author of [8] reported that this improves the accuracy, especially in rural LTE areas. The drawback of this method is that it requires extra hardware and can therefore be expensive. The authors of [9] found in their experiments that the accuracy of Cell-ID is not satisfactory as a general solution.
In addition, the round trip time (RTT) can be used to estimate the distance between the mobile station and the base station, as argued in [10]; however, the accuracy of this method can be limited by path loss and shadow fading. Other positioning methods that can be used in LTE systems are the AoA and observed time difference of arrival (OTDOA) techniques. AoA determines the direction of propagation of the radio frequency wave and requires an antenna array on the receiving (network) side. This technique works particularly well under line-of-sight (LoS) conditions [11]. The literature reports many efforts on OTDOA positioning in LTE systems. In TDoA approaches, the differences between the times of flight of different radio signals are measured at the receiving base stations; this is used in, e.g., LoRa [12,13].
Amplitude-based techniques convert the received signal strength to a distance using a path loss (PL) model; this requires a PL model that is valid for the setting considered. With the network topology known, estimating the distances between a mobile user and a set of base stations reduces the localization to a triangulation or multilateration problem [14].
A novel OTDOA positioning scheme for heterogeneous LTE-Advanced systems is presented in [15]. This scheme avoids interference, which can greatly enhance the positioning accuracy. The study in [16] proposed a positioning technique based on relative time difference of arrival (RTDOA), in which measurements were made in a cluster of femto cells. Although it provides good accuracy, the main drawbacks of this technique are the need for time synchronization and for at least three base stations to be available.
Fingerprinting is currently a very active positioning topic; the technique does not require additional hardware or software in the network or the mobile station. The method was discussed in 3GPP meetings under the name radio frequency pattern matching (RFPM), for deployment in LTE Release 12 [17]. In [18,19], the authors discussed the principle of the fingerprinting positioning method in LTE using the Cramér-Rao lower bound (CRLB) and maximum likelihood algorithms, but their research was carried out only in an urban environment.
This paper applies the fingerprinting technique to find a user's position in LTE networks, following a theoretical approach. Firstly, the proposed method is implemented as a computer program: the code is written and evaluated in the MATLAB environment, using the KNN algorithm and the mathematical equations of the system model.
Secondly, the data for the proposed method are generated from path loss propagation and shadow fading effects on the transmitted signal. Finally, the proposed method is tested in two different scenarios, macro cell and micro cell environments, under different conditions. This article is arranged as follows: the motivation and related work are described in Section 1, the mathematical model of the system is derived in Section 2, the methodology is explained in Section 3, the simulation results are described and discussed in Section 4, and conclusions are given in Section 5.

System configuration
Creating the fingerprint database by the conventional method is a relatively simple process. The received signal (RS) measurements are taken by the receiver at each reference point and, after some processing, the data are stored in the database. Generally, the more reference points that are selected in the collection (training) phase, the better the estimation accuracy in the positioning phase, because the fingerprint database represents the area more accurately [20].
Computing the location of the mobile station during the positioning phase requires an algorithm. Several algorithms exist for estimating the user's position; in this work the K nearest neighbour (KNN) algorithm is used. The authors of [21] stated that the distance between the measured RSS vector $[s_1, s_2, \ldots, s_n]$ and an RSS vector in the database $[S_1, S_2, \ldots, S_n]$ can be computed. The generalized distance between the two vectors is given by (1):

$$L_q = \left( \sum_{i=1}^{n} |s_i - S_i|^q \right)^{1/q} \tag{1}$$

where q = 1 gives the Manhattan distance and q = 2 gives the Euclidean distance.
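The generalized distance between a measured RSS vector and a stored fingerprint can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' MATLAB code; the function name `generalized_distance` and the example RSS values are assumptions made for the illustration.

```python
def generalized_distance(measured, stored, q=2):
    """Generalized (L_q) distance between two RSS vectors of equal length.

    q=1 gives the Manhattan distance, q=2 the Euclidean distance.
    """
    if len(measured) != len(stored):
        raise ValueError("RSS vectors must have the same length")
    return sum(abs(s - S) ** q for s, S in zip(measured, stored)) ** (1.0 / q)


# Hypothetical RSS values in dBm from three base stations (not from the paper).
s = [-70.0, -85.0, -90.0]   # measured at the unknown position
S = [-73.0, -81.0, -90.0]   # stored fingerprint of one reference point

manhattan = generalized_distance(s, S, q=1)   # 3 + 4 + 0
euclidean = generalized_distance(s, S, q=2)   # sqrt(9 + 16)
```

For these made-up vectors the Manhattan distance is 7 dB and the Euclidean distance is 5 dB, so the choice of q changes how strongly a single large per-station deviation is penalized.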
The nearest neighbour algorithm selects the reference point whose stored fingerprint has the shortest signal distance to the measurement [20]. In the K nearest neighbours algorithm, the average of the coordinates of the K closest points is used to estimate the location of the mobile station. This algorithm provides a better estimate than the nearest neighbour algorithm because there is no reason to select only the nearest value and ignore other nearby values. Other algorithms, such as the smallest polygon [22] and neural networks [23], are either complicated or do not necessarily improve the accuracy of the location estimate.
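A minimal sketch of the K nearest neighbours estimate follows, assuming the database is a list of `((x, y), rss_vector)` pairs and that the Euclidean distance (q = 2) is used for matching. The function name `knn_locate`, the database layout and the example values are hypothetical and are not taken from the paper.

```python
import math


def knn_locate(measured, database, k=3):
    """Estimate a position as the mean coordinates of the k best-matching fingerprints.

    database: list of ((x, y), rss_vector) pairs, one per reference point.
    """
    def signal_distance(entry):
        _, stored = entry
        return math.sqrt(sum((s - S) ** 2 for s, S in zip(measured, stored)))

    nearest = sorted(database, key=signal_distance)[:k]
    x = sum(point[0] for point, _ in nearest) / len(nearest)
    y = sum(point[1] for point, _ in nearest) / len(nearest)
    return x, y


# Hypothetical fingerprints: coordinates in metres, RSS in dBm from two base stations.
database = [
    ((0.0, 0.0), [-60.0, -80.0]),
    ((10.0, 0.0), [-62.0, -78.0]),
    ((0.0, 10.0), [-64.0, -79.0]),
    ((50.0, 50.0), [-90.0, -60.0]),
]
estimate_k3 = knn_locate([-61.0, -79.0], database, k=3)
estimate_k1 = knn_locate([-61.0, -79.0], database, k=1)
```

With k = 1 the estimate snaps to a single reference point, whereas with k = 3 it is the centroid of the three closest fingerprints, which lets the estimate fall between grid points.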
To build the database, propagation models can be used to generate artificial fingerprints for the RF pattern matching method: the measurements are calculated from a chosen propagation model and collected into the fingerprint database. Fingerprints generated artificially in this way are called synthetic fingerprints [24]. In this work, the received signal (RS) map for the area under study is based on the COST 231 Walfisch-Ikegami model. This model adds some deterministic aspects to the empirical models in order to characterize the propagation environment more accurately [24].

Radio frequency pattern matching
RF pattern matching, or fingerprinting, is currently a very active topic in positioning, since it can be performed effectively in cellular networks; it has been discussed in 3GPP meetings under the name radio frequency pattern matching (RFPM) [17]. RFPM consists of two phases: a training phase and a positioning phase. In the training phase, sometimes referred to as the off-line phase, the aim is to create a fingerprint database. This requires reference points (RPs), which must first be chosen carefully. In general, the data obtained represent the received signal measured by the mobile station. With the mobile station located at a reference point, the received signal from every base station is measured. The characteristic features of these measurements are then evaluated and recorded in the database. This process is repeated at the next reference point, and so on until all reference points have been visited [25,26].
In the positioning phase, or on-line phase, the received signals are measured when the position of the mobile station is required. The measurements are then compared with the data in the server database using an appropriate matching algorithm. The outcome of this process is the estimated location of the mobile station [25,26]. Figure 1 illustrates the principle of the two phases.
Channel impairments such as interference and noise change unpredictably over time because of user movement. The relationship between the received signal and distance in wireless networks is determined by path loss and signal fading. Path loss can be described by an analytical model that estimates the received signal when a clear line-of-sight path is available between receiver and transmitter; it depends on the carrier frequency and the distance between transmitter and receiver. According to the Friis free-space equation, the received power is given by (2) [27]:

$$P_r = P_t G_t G_r \left( \frac{\lambda}{4 \pi d} \right)^2 \tag{2}$$

$$\lambda = \frac{c}{f} \tag{3}$$

where $P_r$ is the received power, $P_t$ is the transmission power, $G_t$ and $G_r$ are the antenna gains of the transmitter and receiver, $\lambda$ is the wavelength of the signal, $c$ is the speed of light, $f$ is the carrier frequency and $d$ is the distance between transmitter and receiver. The equation can also be written in dB units as follows:

$$P_r = P_t + G_t + G_r - L_{fs} \tag{4}$$

where $L_{fs}$ is given by:

$$L_{fs} = 32.45 + 20 \log_{10}(d) + 20 \log_{10}(f) \tag{5}$$

with $d$ in km and $f$ in MHz.

The Walfisch-Ikegami model is based on a combination of the Walfisch-Bertoni and Ikegami models and was further developed by the COST 231 project. The model is built on numerous analyses and site tests, and it is suitable for flat suburban and urban areas with uniform building heights, densely located buildings and large populations. The path loss of the Walfisch-Ikegami model is given by:

$$L_{WI} = 42.64 + 26 \log_{10}(d) + 20 \log_{10}(f) \tag{6}$$

with $d$ in km and $f$ in MHz. The model is applicable for carrier frequencies 800 MHz ≤ fc ≤ 2000 MHz, eNodeB heights 4 m ≤ hb ≤ 50 m, UE heights 1 m ≤ hm ≤ 3 m and coverage ranges 20 m ≤ R ≤ 5 km [28]. The predicted received signal between transmitter and receiver, including the shadowing term ψ, can be calculated by [29]:

$$P_r = P_t + G_t + G_r - 10 \log_{10}(L) + \psi \tag{7}$$

where $L$ is the path loss expressed as a linear power ratio, so that $10 \log_{10}(L)$ is the model path loss in dB, and ψ is the shadowing term in dB.

Figure 1. Two phases of RFPM positioning
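The path loss formulas and the shadowed link budget can be combined into a small synthetic-fingerprint helper. The sketch below uses the constants 32.45 and 42.64 quoted in the text, with distance in km and frequency in MHz. The function names, the receiver gain of 0 dBi and the shadowing standard deviation of 8 dB are assumptions made for illustration; log-normal shadowing is modelled as a zero-mean Gaussian term in dB.

```python
import math
import random


def free_space_loss_db(d_km, f_mhz):
    """Free-space path loss in dB (distance in km, frequency in MHz)."""
    return 32.45 + 20 * math.log10(d_km) + 20 * math.log10(f_mhz)


def walfisch_ikegami_loss_db(d_km, f_mhz):
    """Walfisch-Ikegami path loss in dB, using the constants quoted in the text."""
    return 42.64 + 26 * math.log10(d_km) + 20 * math.log10(f_mhz)


def received_power_dbm(pt_dbm, gt_dbi, gr_dbi, loss_db, sigma_db=0.0, rng=None):
    """Link budget in dB plus an optional Gaussian shadowing term (in dB)."""
    rng = rng or random.Random()
    shadow_db = rng.gauss(0.0, sigma_db) if sigma_db > 0 else 0.0
    return pt_dbm + gt_dbi + gr_dbi - loss_db + shadow_db


# Example: 40 dBm transmit power, 15 dBi antenna gain, 2000 MHz, UE at 100 m.
# Receiver gain (0 dBi) and sigma (8 dB) are assumed values, not from the paper.
loss = walfisch_ikegami_loss_db(0.1, 2000.0)
rss_mean = received_power_dbm(40.0, 15.0, 0.0, loss)
rss_shadowed = received_power_dbm(40.0, 15.0, 0.0, loss,
                                  sigma_db=8.0, rng=random.Random(1))
```

Evaluating `received_power_dbm` once per base station at every reference point would yield one synthetic fingerprint vector per point; passing a seeded `random.Random` keeps the shadowing draws reproducible between runs.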

METHODOLOGY
In this section, the test environment for estimating the mobile station position is based on the MATLAB platform. Several scenarios are examined under different environmental conditions in order to obtain acceptable positioning results. The results are based on path loss propagation and the log-normal shadow fading effect. For each scenario, the true position of the mobile user $(x, y)$ is compared with the estimated position $(\hat{x}, \hat{y})$, and the measurement error is defined by (8):

$$E_i = \sqrt{(x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2} \tag{8}$$

where E represents the error in the measurements and the subscript i denotes quantities related to the ith measurement. These statistics are computed using Microsoft Excel. A second statistic, the angular error of the estimate, is computed according to (9), where $(x_b, y_b)$ is the location of the serving eNB and $(x, y)$ is the location of the UE. The cumulative distribution function (CDF) of the error is computed in order to display the 67%, 90%, 95% and 99% percentile values and to see to what extent the approach satisfies the FCC requirements.
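The error statistics can be sketched as follows, assuming the positioning error is the Euclidean distance between the true and estimated positions and that percentiles are read from the empirical CDF with the nearest-rank rule. The paper computes these values in Excel; the function names and sample coordinates here are hypothetical.

```python
import math


def positioning_errors(true_points, estimated_points):
    """Euclidean error in metres for each (true, estimated) position pair."""
    return [math.hypot(x - xe, y - ye)
            for (x, y), (xe, ye) in zip(true_points, estimated_points)]


def error_percentile(errors, p):
    """Smallest error e such that at least p percent of the samples are <= e."""
    ordered = sorted(errors)
    rank = max(1, math.ceil(p / 100.0 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical positions in metres (not simulation output from the paper).
true_points = [(0.0, 0.0)] * 4
estimated_points = [(3.0, 4.0), (6.0, 8.0), (0.0, 10.0), (30.0, 40.0)]

errors = positioning_errors(true_points, estimated_points)
e67 = error_percentile(errors, 67)
e90 = error_percentile(errors, 90)
```

With only four samples the percentiles are coarse (here the 67% value is 10 m and the 90% value is 50 m); a real evaluation would use many UE positions so that the 95% and 99% values are meaningful.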

Macro cell environment
This scenario represents an environment of seven macro cells: a one-tier system around the serving eNodeB, which is at the centre. Each cell is assumed to be hexagonal. The inter-site distance (ISD) is assumed to be 1000 m, with a transmission power of 46 dBm and a base station antenna gain of 18 dBi [29]. The parameter assumptions are based on 3GPP LTE standards [30]. Table 1 provides the simulation parameters in detail. The simulation estimates the location of the mobile station using the K nearest neighbour (KNN) algorithm, whose principle was discussed previously. When the measured RSS from the UE is collected, the simulation computes the difference between the measured RSS and each RSS stored in the database. If an exact match occurs, the program returns the reference point (x, y) of that entry as the estimated UE location. If no exact match occurs, which is the more common case in practice, the simulation finds the three nearest neighbours, that is, the best-matching RSS entries, and returns the mean of their coordinates as the estimated location of the UE.
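The seven-site, one-tier hexagonal layout can be generated from the inter-site distance alone, since in a regular hexagonal grid the six first-tier neighbours lie at the ISD from the centre site, spaced 60 degrees apart. This is a sketch under that geometric assumption; the function name `one_tier_sites` and the choice of placing the first neighbour on the x-axis are illustrative and not taken from the paper.

```python
import math


def one_tier_sites(isd_m):
    """Return the centre site and its six first-tier neighbours as (x, y) in metres."""
    sites = [(0.0, 0.0)]  # serving eNodeB at the centre
    for k in range(6):
        angle = math.radians(60.0 * k)
        sites.append((isd_m * math.cos(angle), isd_m * math.sin(angle)))
    return sites


macro_sites = one_tier_sites(1000.0)   # ISD of the macro cell scenario
```

Calling `one_tier_sites(250.0)` gives the same layout scaled to the micro cell ISD, which matches the statement that both scenarios share the same cell sites.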

Micro cell environment
The micro cell scenario uses the same cell sites as the previous scenario. However, the coverage area of the cells is smaller than in the macro cell case, which provides another test of the fingerprinting method. The simulation parameters and cell layout assume slow fading (shadowing). The parameter assumptions are based on 3GPP LTE standards [30]. Table 2 lists the parameters of the proposed scenario.
In this scenario, the transmission power of the eNodeB is set to 40 dBm, since a micro cell requires less power to avoid interference. The transmitter antenna gain is assumed to be 15 dBi and the inter-site distance (ISD) is 250 m, with a carrier frequency of 2000 MHz. The shadow fading follows a log-normal distribution. As before, the simulation estimates the location of the mobile station using the K nearest neighbour (KNN) algorithm, following the same procedure as in the macro cell scenario.

RESULTS AND DISCUSSIONS
The results show that, in the macro cell scenario, the mean estimation error is 65.54 m and the standard deviation is 43.73 m. These values indicate that the estimation errors in this scenario are acceptable and can meet the FCC conditions. Figure 2 shows the CDF of the positioning error: the cumulative probability increases with the error distance. This means that a mobile user located near the base station can be positioned more accurately. Beyond a certain value the CDF remains constant, because the error has reached its maximum value at that point. In this scenario the positioning error at 67% satisfies the FCC accuracy requirement, and the errors at 90%, 95% and 99% also meet the FCC requirements.

Turning to the micro cell scenario, our approach gives better results than the previous scenario with respect to the FCC positioning accuracy requirements: the mean estimation error is 16.75 m and the standard deviation is 11.15 m, as shown in Figure 3.

For comparison with the literature, the authors of [18] reported that, in an urban scenario, the positioning error was 38.4 m at 67%, and 85 m and 161.4 m at 95% and 99% respectively. The researchers in [19], investigating the CRLB of the RF pattern matching method in an LTE system for an urban scenario, reported 49.6 m at 67% and 93.4 m at 95%. Compared with these results, our micro cell results improve the accuracy of UE positioning.
In general, the simulation results obtained with the KNN algorithm show that the micro cell scenario provides better positioning accuracy than the macro cell scenario. The behaviour of the cumulative distribution function illustrated in Figure 4 shows that, as the distance between the mobile user and the serving base station decreases, the probability of error also decreases; thus, better accuracy is obtained.
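As a quick arithmetic check, the reported 67% and 90% positioning errors can be compared with the FCC limits quoted in the introduction (50 m and 150 m for terminal-based systems, 100 m and 300 m for network-based systems). The dictionary layout and the function name `meets_fcc` below are illustrative assumptions; the numbers themselves are those stated in this paper.

```python
# FCC accuracy limits in metres, keyed by percentile, as quoted in the introduction.
FCC_LIMITS_M = {
    "terminal": {67: 50.0, 90: 150.0},
    "network": {67: 100.0, 90: 300.0},
}

# Positioning errors in metres reported for the two simulated scenarios.
REPORTED_M = {
    "macro": {67: 75.66, 90: 141.4},
    "micro": {67: 15.0, 90: 31.78},
}


def meets_fcc(errors_by_percentile, mode="network"):
    """True if the error at every limited percentile is within the FCC limit."""
    limits = FCC_LIMITS_M[mode]
    return all(errors_by_percentile[p] <= limits[p] for p in limits)


results = {
    scenario: {mode: meets_fcc(errors, mode) for mode in FCC_LIMITS_M}
    for scenario, errors in REPORTED_M.items()
}
```

With the reported figures, both scenarios fall within the network-based limits, while only the micro cell figures also fall within the stricter terminal-based limits.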

CONCLUSION
In this work, a radio frequency pattern matching (RFPM) method has been proposed to determine a user's position in LTE systems using the KNN algorithm. The study adopted a theoretical approach and implemented the method as a computer program. The fingerprint database was generated by the program from path loss propagation and shadow fading effects on the signal. The proposed method was tested and evaluated in two scenarios under different conditions. The simulation results showed that the accuracy was improved in both scenarios and met the FCC requirements. The RFPM method showed very good performance in both scenarios, with positioning errors of 15 m at 67% and 31.78 m at 90% for the micro cell scenario, and 75.66 m at 67% and 141.4 m at 90% for the macro cell scenario.