Amateur radio sensing technique using a combination of energy detection and waveform classification

ABSTRACT


INTRODUCTION
The radio spectrum is the radio frequency (RF) part of the electromagnetic spectrum and is considered a limited resource. With the advancement of communication technology, government agencies must supervise the management of frequency bands according to rules that avoid mutual interference. Therefore, monitoring spectrum usage and recording usage statistics are essential for the development, improvement and issuance of regulations under actual use conditions, particularly regarding the available frequencies of public amateur radio. The technology that can be used to support this activity is cognitive radio (CR), which has been used extensively in solving the problem of frequency congestion, as demonstrated in [1], [2].
Due to the increasing demand for radio frequency communication, it is very challenging to exploit these limited or underutilized spectral resources by using CR technology, as presented by [3]. One of the essential elements of CR theory is the ability to measure, understand, determine and be informed of the parameters related to radio channel properties, as shown by [4], [5]. The main features of CR are spectrum sensing, spectrum decision, spectrum sharing and spectrum mobility, as shown by [6], [7]. Spectrum sensing is the task of obtaining knowledge about the spectrum usage and the presence of users in a geographical area. As demonstrated by [8], [9], the basic spectrum sensing techniques are energy detection (ED), matched filter detection, cyclostationary detection, and certain other detection techniques, each of which has operational specifications, benefits and limitations. ED is a successful and uncomplicated technique that is particularly suited to a random signal, and it will be considered in this paper.

ISSN: 2088-8708, Int J Elec & Comp Eng, Vol. 12, No. 1, February 2022: 399-410
ED is one of the simplest detection methods because the CR receiver does not require any prior information about the received signal. Its purpose is to process the received samples to estimate the energy level in the channel. The authors of [10] proposed a method that applies ED after optimally combining the signal samples received in space and time based on the principle of maximizing the signal-to-noise ratio (SNR). The threshold is the critical parameter in the classical energy detector. It must be optimized for each detection technique to improve its performance, as demonstrated by [11]-[13]. In a wide-band spectrum sensing scenario, a subband ED method can perform effectively under noise uncertainty and frequency-selective channels, and it can be implemented with filter bank spectrum sensing, as shown by [14], [15], respectively. However, the fundamental principle of ED is to compare the signal energy to a sensing threshold in a given bandwidth within a specific sensing period, as demonstrated by [16].
Many researchers have focused on simulating and making real-time measurements for a wide range of environments and conditions. Koley et al. [17] and Varma and Mitra [18] used an NI-USRP, which interfaced with a system through LabVIEW software to act as an RF transceiver. A wireless open-access research platform (WARP) board was used for real-time ED, as demonstrated by [19], [20]. Moreover, the RFeye sensing node was used to record signals for radio spectrum monitoring purposes, as shown by [21]. Another interesting issue, as presented by [22], is the case in which the transmitter switches from active to inactive at random time intervals. This paper uses a ZedBoard combined with the Analog Devices AD-FMCOMMS3 module as the CR receiver in the experimental setup. The modules are controlled, and their data are processed, with a program developed in MATLAB.
It is now widely accepted that artificial intelligence technology performs essential functions in every field; for example, there is a machine learning approach to ranging error mitigation for localization algorithms, as shown by [23]. Numerous machine learning techniques, including both supervised and unsupervised algorithms, have been applied in spectrum sensing, as demonstrated by [24]-[27]. In addition, detection and classification based on waveform characteristics have been investigated in numerous areas, such as seismic signals, electrocardiogram signals and multiplexing signals, as shown by [28]-[30]. The combination of machine learning and waveform characteristic analysis can be used to design novel models that operate more efficiently for spectrum sensing purposes.
In actual use, a particular frequency spectrum has diverse characteristics and applications. The Office of National Broadcasting and Telecommunications Commission, Thailand, has determined the control of the frequency band in the National Table of Frequency Allocation, as shown by [31], by specifying the use of the frequency range 134-174 MHz for amateur public radio. The number of amateur radio users in Thailand is continuously increasing. However, there is still a lack of statistics on usage, including the disturbance of the frequency spectrum in the amateur radio band, which is very important for the agencies responsible for governing the allocation of spectrum resources.
Motivated by the above challenges, this paper proposes an energy detection and waveform feature classification (EDWC) algorithm for amateur public radio based on ED techniques and waveform characteristics that uses machine learning algorithms. The only prior information required is the bandwidth of each channel $B$. The proposed EDWC algorithm consists of two processes: ED and waveform classification. The waveform classification process includes two steps: i) the training phase and ii) the identification of clusters as voice or noise signals. To the best of the author's knowledge, detection and machine learning techniques have not been adopted for spectrum sensing in the amateur frequency band in the existing literature. The main contributions of this paper are summarized below.
− In contrast to the existing methods, this paper introduces a detection and classification framework that combines the performance of ED and demodulated waveform classification for test statistic design and utilizes a threshold and waveform feature-based mechanism for real-time detection.
The rest of this paper is organized as follows: the system model is presented in section 2. The EDWC algorithm framework is proposed in section 3. The experimental results and discussion are presented in section 4. Finally, conclusions are drawn in section 5.

SYSTEM MODEL
The problem of spectrum sensing is to determine whether a particular part of the spectrum is accessible or not. Therefore, we can express the spectrum sensing problem as a binary hypothesis testing problem at the discrete-time instant $n$:

$$H_0: \; y(n) = w(n) \tag{1}$$

$$H_1: \; y(n) = s(n) + w(n) \tag{2}$$

where hypotheses $H_0$ and $H_1$ indicate the absence and presence of the primary signal, respectively, $y(n)$ refers to the signal received at the location of the CR system, $w(n)$ is additive complex white Gaussian noise with zero mean and $s(n)$ represents a signal transmitted by the primary node.

Energy detection
The energy detector evaluates the received energy under the above binary hypotheses. Let $y(n)$ be the $n$-th ($n = 1, 2, \ldots, N$) sample of the received signal. All the samples are placed into the vector $\mathbf{y} = [y(1), y(2), \ldots, y(N)]^T$. Typically, the decision statistic $T(\mathbf{y})$ based on $N$ received samples can be given by (3):

$$T(\mathbf{y}) = \frac{1}{N}\sum_{n=1}^{N} |y(n)|^2 \; \underset{H_0}{\overset{H_1}{\gtrless}} \; \lambda \tag{3}$$

where $\lambda$ is a predefined decision threshold. The reliability of the decision rule in (3) can be characterized by the probability of detection $P_d$ and the probability of false alarm $P_{fa}$. The former is the probability of detecting the primary signal when it is present in the frequency band and can be formulated mathematically as (4):

$$P_d = \Pr\{T(\mathbf{y}) > \lambda \mid H_1\} \tag{4}$$
The false-alarm probability represents the incorrect decision that the primary signal $s(n)$ is present in the frequency band when it is actually not, and it may be written as (5):

$$P_{fa} = \Pr\{T(\mathbf{y}) > \lambda \mid H_0\} \tag{5}$$
The decision threshold is the crucial parameter in (3) and must be optimized for each detection technique to enhance its performance. In general, the decision threshold is chosen to make $P_d$ as large and $P_{fa}$ as small as possible. The threshold is commonly set based on a constant false-alarm probability as in (6), where $Q$ is the standard Gaussian complementary cumulative distribution function; note that the decision threshold must be adjusted based on the variance of the additive noise.
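The detector described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's implementation: it assumes the normalized statistic $T(\mathbf{y}) = \frac{1}{N}\sum_n |y(n)|^2$, circularly symmetric complex Gaussian noise with known variance $\sigma_w^2$, and the common large-$N$ Gaussian approximation $\lambda = \sigma_w^2\,(1 + Q^{-1}(P_{fa})/\sqrt{N})$ for the constant-false-alarm threshold. All function names and the synthetic test signal are hypothetical.

```python
import cmath
import math
import random
from statistics import NormalDist

def energy_statistic(y):
    """Normalized energy T(y) = (1/N) * sum |y(n)|^2 (assumed form)."""
    return sum(abs(v) ** 2 for v in y) / len(y)

def cfar_threshold(noise_var, n_samples, p_fa):
    """Threshold for a target false-alarm probability.

    Assumes complex Gaussian noise and the large-N Gaussian approximation.
    """
    q_inv = NormalDist().inv_cdf(1.0 - p_fa)  # Q^{-1}(p_fa)
    return noise_var * (1.0 + q_inv / math.sqrt(n_samples))

def complex_noise(n, var, rng):
    """Zero-mean complex Gaussian noise with total variance `var`."""
    s = math.sqrt(var / 2.0)
    return [complex(rng.gauss(0, s), rng.gauss(0, s)) for _ in range(n)]

def decide(y, threshold):
    """Return 'H1' if the energy exceeds the threshold, else 'H0'."""
    return "H1" if energy_statistic(y) > threshold else "H0"

rng = random.Random(0)
N, noise_var, p_fa = 2000, 1.0, 0.01
lam = cfar_threshold(noise_var, N, p_fa)

# Synthetic test case: a unit-power carrier buried in unit-variance noise (0 dB SNR).
tone = [cmath.exp(2j * math.pi * 0.05 * n) for n in range(N)]
received = [s + w for s, w in zip(tone, complex_noise(N, noise_var, rng))]
print(lam, decide(received, lam))
```

With these assumed values the threshold sits only about 5% above the noise variance, which illustrates why the text stresses that the threshold must track the actual noise variance: a small error in $\sigma_w^2$ shifts the false-alarm rate considerably.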

Machine learning
Machine learning algorithms learn a target function $f$ that best maps input variables $X$ to an output variable $Y$. This objective is expressed for a machine learning algorithm as (7):

$$Y = f(X) \tag{7}$$

where $X$ is an $m \times p$ matrix, $m$ is the sample size and $p$ is the number of features for each observation. Each pair $(X, Y)$ is called a training sample because it is used to teach the learning algorithm how to obtain the predictor $f$. There are two classical data models that depend on the prediction type. If the outcome variable $Y$ is quantitative, the learning problem is a regression problem; if the output variable $Y$ is categorical, it is a classification problem.
A classification problem is a kind of supervised machine learning task in which an algorithm learns to classify new observations from labeled examples of the output variable. The classification efficiency of machine learning models depends greatly on the selection of the dataset representation or features used for training. In this paper, we use the classification tree (CTR), discriminant analysis (DCA), naive Bayes classifier (NBC), k-nearest neighbors (KNN), and support vector machine (SVM) algorithms for training and classifying datasets.

Demodulated waveform characteristics
In this paper, we focus on the signals of amateur radio communication, which are based on frequency modulation (FM). The receiver's demodulated signal is a signal in the audible frequency band, i.e., a voice signal. The demodulated waveform characteristics vary depending on the nature of the speech or voice. The key variables used to characterize these signals are descriptive statistics and spectral measurements.
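To make the notion of a demodulated waveform concrete, the following Python sketch recovers the audio message from complex baseband FM samples with a phase-difference discriminator. This is a common textbook technique and is shown here as an assumption; the paper does not state which demodulator its receiver chain uses. The sample rate, tone frequency and deviation are illustrative values only.

```python
import cmath
import math

def fm_demodulate(iq):
    """Phase-difference discriminator: angle of x[n] * conj(x[n-1])."""
    return [cmath.phase(iq[n] * iq[n - 1].conjugate()) for n in range(1, len(iq))]

# Synthetic FM signal (assumed values): 1 kHz tone, 2.5 kHz peak deviation, 48 kHz sampling.
fs, f_tone, f_dev, n_samples = 48_000, 1_000, 2_500, 480
phase, iq = 0.0, []
for n in range(n_samples):
    message = math.sin(2 * math.pi * f_tone * n / fs)
    phase += 2 * math.pi * f_dev * message / fs  # integrate the message into phase
    iq.append(cmath.exp(1j * phase))

audio = fm_demodulate(iq)
# Rescale from radians per sample back to the original message amplitude.
recovered = [a * fs / (2 * math.pi * f_dev) for a in audio]
```

The list `recovered` is the kind of audio-band waveform from which the descriptive statistics and spectral measurements are then computed.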

Descriptive statistics
Descriptive statistics are used to represent the basic features of a signal. They summarize the signal sample with measures such as the maximum element of an array (max), minimum element of an array (min), average or mean value of an array (mean), median value of an array (med), maximum-to-minimum difference (p2p), root-mean-square (RMS) level (rms), peak-magnitude-to-RMS ratio (p2rms), root-sum-of-squares level (rssq), standard deviation (std), and variance (var).
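The ten measures listed above can be computed as in the following Python sketch. The function name is hypothetical, and the use of the sample (n - 1) standard deviation and variance is an assumption; the paper does not state which normalization it uses.

```python
import math
import statistics

def descriptive_features(x):
    """Return the ten descriptive statistics of a sample sequence x."""
    rms = math.sqrt(sum(v * v for v in x) / len(x))
    peak = max(abs(v) for v in x)
    return {
        "max": max(x),
        "min": min(x),
        "mean": statistics.fmean(x),
        "med": statistics.median(x),
        "p2p": max(x) - min(x),
        "rms": rms,
        "p2rms": peak / rms,
        "rssq": math.sqrt(sum(v * v for v in x)),
        "std": statistics.stdev(x),     # sample standard deviation (n - 1), assumed
        "var": statistics.variance(x),  # sample variance (n - 1), assumed
    }

features = descriptive_features([1.0, -2.0, 3.0, -4.0])
```

Each call produces one part of a feature row; stacking the rows for many signal frames gives the columns of the input matrix used for training.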

Spectral measurements
Spectral measurements represent the properties of a signal as a function of frequency. Each frequency component of the input signal is displayed as a signal level in the frequency band of interest and can be summarized by, e.g., the mean frequency (meaf) and median frequency (medf). This paper uses both descriptive statistics and spectral measurement parameters as the classification data features. In the additional content concerning the model training, we demonstrate the feasibility and contribution of the classification data features to the waveform characteristic classification.
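As a rough illustration of the two spectral features, the sketch below estimates the mean and median frequency from a one-sided power spectrum. It assumes the usual definitions (power-weighted average frequency, and the frequency at which the cumulative power reaches half of the total) and uses a direct DFT so that it needs only the standard library; the function names and the two-tone test signal are hypothetical.

```python
import cmath
import math

def power_spectrum(x, fs):
    """One-sided power spectrum via a direct DFT (O(N^2), fine for short frames)."""
    n = len(x)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        xk = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k * fs / n)
        power.append(abs(xk) ** 2)
    return freqs, power

def mean_frequency(freqs, power):
    """Power-weighted average frequency."""
    return sum(f * p for f, p in zip(freqs, power)) / sum(power)

def median_frequency(freqs, power):
    """Lowest frequency at which the cumulative power reaches half the total."""
    half, running = sum(power) / 2.0, 0.0
    for f, p in zip(freqs, power):
        running += p
        if running >= half:
            return f
    return freqs[-1]

fs = 8000
x = [math.sin(2 * math.pi * 500 * t / fs) + 0.5 * math.sin(2 * math.pi * 1500 * t / fs)
     for t in range(160)]
freqs, power = power_spectrum(x, fs)
meaf = mean_frequency(freqs, power)
medf = median_frequency(freqs, power)
```

For longer frames an FFT routine would replace the direct DFT; the feature definitions stay the same.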

PROPOSED EDWC ALGORITHM
The processing pipeline of the proposed EDWC algorithm framework is shown in Figure 1. The pipeline consists of data acquisition, data preprocessing, model development, and classification and decision steps.

Data acquisition
In the present work, the performance of the proposed EDWC algorithm is validated using a combination of the Avnet ZedBoard with the Analog Devices AD-FMCOMMS3-EBZ FMC module. Table 1 presents the hardware specifications in a defined range of RF spectra. The proposed algorithms are implemented with MATLAB R2019 on a 64-bit computer with a Core i5 processor and 4 GB RAM. Figure 2 shows the experimental setup, where the FMCOMMS3 and ZedBoard interface with the system through MATLAB software. The AOR DAG735G antenna is connected to the Rx port of the FMCOMMS3 board and can cover a frequency range of 75 MHz to 3 GHz. The receiving antenna is located at 13.767756ºN, 100.530569ºE, at a height of approximately 20 meters above the ground. For application purposes and for planning the use of the public spectrum, we implemented the developed framework to maintain a one-week cycle usage statistic for FM amateur radio. The available frequency bands for FM amateur radio according to [31] are divided into four bands, Band 1 to Band 4.

Data preprocessing
The analogue RF signals at the specified frequency range are converted to the intermediate frequency (IF) and stored for classification processing. The potential predictor variables used in this study are the descriptive statistics and spectral measurements of the FM demodulated signal of each channel, as described in section 2.3. Figure 4

Model development
The purpose of machine learning is to develop a model that makes classifications based on input data or features. A supervised learning algorithm takes a known set of input data and known corresponding outputs and trains a model to generate reasonable classifications in response to new data, as described in Algorithm 1. The learning process begins with an input data matrix $X$. Each row of $X$ represents one observation or measurement. Each column of $X$ denotes one feature or predictor. After model fitting, we obtain several models depending on the algorithms. These models are used to classify the output. In this case, we have two categories: voice or speech waveforms and noise waveforms.
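The supervised workflow described above (rows of $X$ as observations, columns as features, known labels as outputs) can be illustrated with a small k-nearest-neighbors classifier written in plain Python. This is a sketch of one of the five classifier families named in the paper, not the paper's Algorithm 1; the feature rows, their values and the choice of k = 3 are invented for illustration.

```python
import math
from collections import Counter

def knn_fit(X, y):
    """k-NN has no real training step: the 'model' is the stored training set."""
    return {"X": [list(row) for row in X], "y": list(y)}

def knn_predict(model, row, k=3):
    """Label a new observation by majority vote among its k nearest neighbors."""
    dists = sorted(
        (math.dist(row, train_row), label)
        for train_row, label in zip(model["X"], model["y"])
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical feature rows [rms, p2rms, meaf in kHz]; values are made up.
X = [[0.30, 3.1, 0.9], [0.28, 3.4, 1.1], [0.33, 2.9, 1.0],
     [0.05, 4.6, 2.4], [0.06, 4.4, 2.6], [0.04, 4.8, 2.5]]
y = ["Voice", "Voice", "Voice", "Noise", "Noise", "Noise"]

model = knn_fit(X, y)
print(knn_predict(model, [0.29, 3.2, 1.0]))  # Voice
print(knn_predict(model, [0.05, 4.5, 2.5]))  # Noise
```

In practice the features would be scaled before computing distances, since raw features with large numeric ranges would otherwise dominate the vote.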

Classification and decision
A measure of the energy level indicates whether or not a signal is transmitted in a given frequency band. The application of classic ED techniques can provide only the $H_0$ or $H_1$ status, as presented in (1) and (2). However, in practical amateur radio use, noise is also sometimes transmitted: a carrier is sent out, but there is no audio or speech signal, for example, when a user presses and holds the transmit key. The proposed method of analysis therefore further classifies the form and nature of the measured signal. The developed EDWC algorithm will be beneficial in further applications for security agencies. The process of classifying and making decisions combines the capabilities of ED and the analysis of voice signals using machine learning algorithms, as described in Algorithm 2. According to the preprogrammed processing steps, the developed board captures the RF signal in real time. Then, it filters the wideband signal into subbands according to the respective channel and bandwidth. Next, all descriptive statistics and spectral measurements are calculated to prepare the input row of $X$. As mentioned above, the energy detection method only indicates whether or not there is a signal in the observed frequency channel. Once some signal power is detected, the analysis process classifies it as a speech signal or noise. The decision output falls into one of three groups: C0, C1, and C2.
The classification models process the input data and classify the waveform features into two groups: (WC = Voice) and (WC = Noise). The ED module compares the energy level with the predefined threshold and gives the comparison result: $H_0$ or $H_1$. In the decision step, we define the decision output based on waveform classification and ED as follows:
− C0 when ($T(\mathbf{y}) < \lambda$, $H_0$) and (WC = Noise): In this case, the signal level is weaker than the reference level, and the resulting waveform characteristics are generally similar to those of a noise signal.
− C1 when ($T(\mathbf{y}) \gtrless \lambda$, $H_0$ or $H_1$) and (WC = Voice): Suppose the measured energy level is smaller than the specified threshold level, but the waveform characteristics are similar to voice signals. In this case, the decision algorithm will classify the detected signal into the voice group. There is a possibility that the transmitter is at a great distance, causing the signal intensity to decrease. However, the waveform characteristics indicate that it may be a voice signal employed for real communication.
− C2 when ($T(\mathbf{y}) > \lambda$, $H_1$) and (WC = Noise): In public amateur radio use, there may be accidental or intentional interference by the user. Alternatively, the user may transmit a carrier wave signal without modulating it with a speech signal. In this research, the decision-making model was designed to take this actual situation into account. In other words, the signal level may be greater than the threshold due to the transmitted carrier frequency, but the waveform does not have the characteristics of a voice signal as defined in the machine learning model.
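The three decision cases can be expressed as a short function. This is a sketch of the rule as it reads from the cases above, not the paper's Algorithm 2; the function name is hypothetical, and treating an energy exactly equal to the threshold as "below" is an assumption.

```python
def edwc_decide(energy, threshold, waveform_class):
    """Combine the ED result with the waveform class into C0, C1 or C2."""
    if waveform_class == "Voice":
        return "C1"  # voice-like waveform, regardless of the energy level
    if energy > threshold:
        return "C2"  # strong signal without voice features (e.g. bare carrier)
    return "C0"      # weak signal that looks like noise

# Example calls with an assumed threshold of 1.05.
examples = [
    edwc_decide(0.98, 1.05, "Noise"),  # idle channel
    edwc_decide(0.98, 1.05, "Voice"),  # distant station, weak but voice-like
    edwc_decide(2.10, 1.05, "Voice"),  # normal voice transmission
    edwc_decide(2.10, 1.05, "Noise"),  # carrier without speech
]
print(examples)  # ['C0', 'C1', 'C1', 'C2']
```

Checking the waveform class first reflects the design choice in the text: a voice-like waveform is counted as real communication even when its energy falls below the threshold.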

RESULTS AND DISCUSSION
In this section, we conduct extensive simulations to verify the performance of the proposed EDWC algorithm. In particular, we evaluate the training performance of the classification scheme in section 4.1. Then, we demonstrate the testing performance and the detection probability of the different algorithms in section 4.2. Finally, we assess the performance of real-time observation applications using real amateur public radio in section 4.3.

Training dataset

Correlation coefficient of features
From Figure 5, we find a significant correlation between individual waveform characteristics. Most of the correlation coefficients of the selected features are higher than 0.3; i.e., the features are clearly correlated. Therefore, using the waveform properties as variables in machine learning processing can lead to reliable and practical results.
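A heat map such as the one in Figure 5 is built from pairwise correlation coefficients between feature columns. The following sketch assumes the Pearson coefficient (the paper does not name the coefficient it uses) and works on invented feature values; the function names are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length feature columns."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)

def correlation_matrix(features):
    """features: dict name -> column; returns a dict of dicts of coefficients."""
    names = list(features)
    return {r: {c: pearson(features[r], features[c]) for c in names} for r in names}

# Hypothetical feature columns (values invented for illustration).
features = {
    "rms": [0.30, 0.28, 0.05, 0.06],
    "p2p": [1.9, 1.8, 0.4, 0.5],
    "meaf": [0.9, 1.1, 2.4, 2.6],
}
corr = correlation_matrix(features)
```

Each entry of `corr` corresponds to one cell of the heat map; entries near +1 or -1 indicate strongly related features.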

Training duration of different algorithms
The average training durations for the different classifiers according to the size of the training feature vectors are displayed in Table 2. The nearest neighbor algorithm displays a comparatively high training duration (5.0926 seconds for 30000 samples) among all the machine learning algorithms. The algorithm that used the least time to train the dataset in this experiment was discriminant analysis, with 0.3026 seconds for 30000 samples.

Table 3 presents the time needed for classification of the waveform characteristics for different classifiers based on 30000 test samples. The different numbers of samples used in the estimation process are presented in the "number of classification samples" column, from 6000 to 30000 samples. In classifying the signal waveform, the proposed EDWC algorithm with decision trees obtains the shortest classification time (0.0125 seconds for 30000 samples), followed by the naive Bayes algorithm (0.0168 seconds for 30000 samples) and discriminant analysis (0.0196 seconds for 30000 samples). They also have comparable accuracy rates. Table 3 shows that the proposed EDWC algorithm using an SVM obtains the highest accuracy of 83.6685%; the other algorithms also perform relatively well, at approximately 83.6%.

Detection probability of different algorithms
The receiver operating characteristic (ROC) curve is a metric adopted to examine the properties of classifiers. Figure 6 compares the performance of the individual proposed EDWC schemes in terms of their ROC curves. The true positive rate (TPR), on the y-axis, is the proportion of actual positive samples that are correctly predicted as positive. The x-axis represents the false positive rate (FPR), which is the proportion of actual negative samples that are incorrectly predicted as positive. From the comparison of the curves, we can see that the KNN classifier has the highest prediction efficiency, followed by the CTR and NBC classification algorithms. However, the difference is not very great. These results indicate that combining descriptive statistics and spectral measurements in model development can have a significant effect on waveform classification.
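The construction of an ROC curve from classifier scores can be sketched as follows. The scores and labels are invented, the label 1 is assumed to mean "voice", and the function names are hypothetical; the sketch only illustrates how the TPR and FPR axes are obtained by sweeping a decision threshold.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs obtained by sweeping the decision threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= thr and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= thr and l == 0)
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical classifier scores and true labels (1 = voice, 0 = noise).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
curve = roc_points(scores, labels)
area = auc(curve)
```

A curve closer to the top-left corner, or equivalently a larger area under the curve, indicates a better classifier, which is the basis for comparing the schemes in Figure 6.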

Real-time observation
In the real application experiment, the experimental setup was put in place and captured the RF signals of public amateur radio for a week (11-17 October 2020) in a particular band. Figure 7 shows how the usage of the signals varies in the observation period. These observations are also an essential part of determining the threshold level and the level of noise that occurs in each frequency range. From the comparison of the graphs, we can see that the frequency range of band one is used the most, and the least active frequency range is band four.

Counting and decision making
Table 4 shows the results obtained from the experiments to process the actual public amateur radio signal with the developed EDWC algorithm. The results are divided into four main groups according to the frequency range of the detected signal and the machine learning algorithm used to classify the waveform characteristics. In addition, the display is divided into five groups: $H_0$, $H_1$, C0, C1, and C2. In the case of $H_0$ and $H_1$, we focus primarily on the level of energy, and we can see that the signal levels were placed in groups of 26388, 34991, 31614, and 7135 records in band 1, band 2, band 3, and band 4, respectively. In frequency band 1, for example, the signals that are higher than the threshold level and are classified as voice waveforms are presented in column C1. With discriminant analysis, naive Bayes, and k-nearest neighbors, the numbers of signals in this group are approximately the same. However, the analysis using decision trees and support vector machines produced very different results. The results are similar across all four frequency bands. The overall results show that the proposed EDWC algorithm can be used in practical applications, especially in measuring the usage rate of each frequency band, including the number of times in which a signal is emitted with disturbance, as in case C2.
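The counting step behind a table such as Table 4 amounts to tallying decisions per band. The sketch below is illustrative only: the record format, the function name and the definition of the usage rate as the share of C1 decisions are assumptions, and the decision log is invented rather than taken from the measurements.

```python
from collections import Counter, defaultdict

def summarize(records):
    """Count decisions per band and report the share of voice traffic (C1)."""
    counts = defaultdict(Counter)
    for band, decision in records:
        counts[band][decision] += 1
    summary = {}
    for band, c in counts.items():
        total = sum(c.values())
        summary[band] = {
            "C0": c["C0"],
            "C1": c["C1"],
            "C2": c["C2"],
            "voice_rate": c["C1"] / total,  # assumed definition of usage rate
        }
    return summary

# Hypothetical decision log; the entries are invented for illustration only.
records = [("band1", "C1"), ("band1", "C0"), ("band1", "C2"), ("band1", "C1"),
           ("band4", "C0"), ("band4", "C0")]
summary = summarize(records)
```

Keeping C2 as a separate count makes it possible to report disturbance events alongside ordinary voice usage for each band.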

CONCLUSION
In this paper, we propose a novel energy detection and waveform feature classification (EDWC) algorithm to allow the detection of speech signals in public frequency bands based on energy detection techniques and supervised machine learning workflows. To further promote distributed decision making, we develop a waveform decision scheme for classifying voice signals and noise signals after the demodulation process by applying descriptive statistics and spectral measurements. We use supervised classifiers such as decision trees, discriminant analysis, naive Bayes, k-nearest neighbors, and support vector machines. The received energy level and demodulated waveform characteristics are considered as a feature vector for classifying the input signal. We evaluate the performance of the proposed EDWC algorithm in terms of the average training duration, classification time, and receiver operating characteristic curves. Simulation and experimental results using real FM broadcast radio signals demonstrate that the application of waveform properties as predictor parameters in machine learning algorithms improves the capability of waveform classification. Meanwhile, the EDWC schemes using discriminant analysis, a naive Bayes classifier, and k-nearest neighbors deliver similar decision outcomes in real-time public RF signal detection and classification. Our proposed EDWC framework can work efficiently and can also distinguish and classify signals. It shows the actual usage rate of each frequency band as well as the number of times a signal is generated with disturbance, which makes it an indispensable tool for governments and related agencies that analyze data and monitor public spectrum usage.

Figure 1. Processing pipeline of the energy detection and waveform feature classification (EDWC) algorithm

Figure 2. Experimental setup

The four frequency bands for FM amateur radio according to [31] are as follows: Band 1 between 144.5125 MHz and 144.9875 MHz, Band 2 between 145.1375 MHz and 145.5375 MHz, Band 3 between 146.2875 MHz and 146.6000 MHz, and Band 4 between 146.8125 MHz and 147.0000 MHz. Each channel has a bandwidth of 12.5 kHz.

Figure 3(a) and (b) illustrate examples of the instantaneous spectrum and the spectrogram of the real FM radio signal, respectively, versus frequency. As shown in Figure 3(a), the spectrum of the RF signal varies depending on the modulated voice signal. Figure 3(b) presents the spectrogram of the same RF signal with a time history of 100 ms.

Figure 3. Example from the RF signal dataset: (a) spectrum of the RF signal, (b) spectrogram of the RF signal

Amateur radio sensing technique using a combination of energy … (Narathep Phruksahiran)

Figure 5. Heat map of the interrelated features

Figure 6. Receiver operating characteristic curve of the proposed classifiers

Figure 7 presents the comparison plots of the energy level of each frequency band. The x-axis indicates the number of samples, and the y-axis represents the size of the normalized upper envelope of each signal sample. Each frequency band has a different energy level for each captured RF signal over time.

Figure 7. Normalized upper envelope of each signal sample

In frequency band 1, with discriminant analysis, naive Bayes, and k-nearest neighbors, the numbers of signals in group C1 are approximately the same: 3868, 4171, and 2821, respectively.

Table 1. Hardware specifications

Table 2. Average training duration for different machine learning algorithms [seconds]

Table 3. Accuracy and average classification time for different machine learning algorithms [seconds]

Table 4. Counting and decision making for real-time observations