Bio-signals compression using auto-encoder

Received Jul 18, 2019. Revised Jun 28, 2020. Accepted Aug 6, 2020.

Recent developments in wearable devices permit a non-invasive and inexpensive way to gather medical data such as bio-signals (ECG, respiration, blood pressure, etc.). Collecting and analyzing such biomarkers can provide anticipatory healthcare through customized medical applications. Because wearable devices are constrained by size, computing resources, and battery capacity, a novel algorithm is needed to manage the memory and energy of the device robustly. Rapid progress in the field has produced numerous auto-encoders that efficiently extract features from the time and frequency domains; their core idea is to train a hidden layer to reconstruct data similar to the input. Previous works required all features to accomplish compression, whereas the proposed framework, bio-signals compression using auto-encoder (BCAE), performs the task by taking only the important features and compressing them. This reduces power consumption at the source and hence increases battery life. Performance is compared on three parameters: compression ratio, reconstruction error, and power consumption. The proposed work outperforms the SURF method.


INTRODUCTION
IoT technology lets devices sense their physical surroundings and seamlessly combine the collected information into refined applications that enable significant enhancements to a person's activities. The main aim of this paper is sensing human behavior [1] via wearable IoT devices such as chest straps, smart watches, and wristbands, which can be used to assess the health and fitness of their users [2]. Wearable devices can be combined into body sensor networks that are wirelessly connected to deliver up-to-date medical reports through the internet, thereby enabling anticipation, early examination, and special care. However, as they must be small and lightweight, they are also limited in resources such as power, transmission capability, and memory.
In this paper, we implement a new data-processing solution for the long-term monitoring of ECG signals. These medical records are basically simple to measure yet exceptionally valuable for the purpose described above. We assume the acquisition of such signals via small wearable devices [3] and are concerned with extending the battery life of those devices through lossy signal compression. We consider conditions where the ECG signals must be transmitted wirelessly to a specific access point, so that they can be stored on a server and used later [8, 9]. Each of these approaches works under certain circumstances and can extract information unobtainable by the others; this, together with the impossibility of automatically determining the best method in advance (i.e., without detailed knowledge of the respective system), motivates the basic idea of the approach presented in this paper. In [10], a novel merit for feature selection based on rough set theory (RST) was introduced; motivated by the correlation-based merit and known as the rough-set-based merit, it is implemented in a rough-set immediate-reducts algorithm to choose an important subset of features. In [11], RST was applied to data analysis in the pattern recognition, machine learning, and data mining areas. Rough sets are a strong tool for identifying the relation between classes and attributes in a dataset; hence RST can serve as a feature-selection (FS) step that identifies a reduced feature set for robust and fast classification. That work focuses on FS in DNA microarray (gene expression) datasets [12, 13]. Microarray datasets (MDs) have real-valued features like other real-world datasets and are a strong example of high dimensionality: they contain many features but few samples. Moreover, the class distribution in binary MDs is unequal, so they are class-imbalanced datasets.

Performing the auto-encoder operation on the data selected from the complete dataset
In [14], a semi-supervised stacked label-consistent auto-encoder was introduced for the reconstruction and analysis of bio-signals. Earlier research treated reconstruction and classification as two different problems. For tele-monitoring, bio-signal reconstruction methodologies depend heavily on compressed sensing (CS); these are designed approaches in which the reconstruction is formulated under some assumption about the signal. CS-based solutions project a portion of the gathered signal (such as one second) onto a random matrix (sparse binary, Gaussian, or binomial) so that the projected data is smaller than the number of samples taken in that second [15, 16]. As CS needs only a matrix-vector product, its computational cost is low, and several studies implement computation- and energy-efficient hardware for it [17-21]. The compressed data is sent wirelessly to a base station, where CS-based reconstruction is performed for later monitoring and examination [22, 23]. There are many general CS reconstruction methodologies; [17, 24, 25] use variants of sparse Bayesian learning (SBL) [26-28]. Many papers use typical CS recovery methods, and in some operations the signals are reconstructed with the help of intra- and inter-channel correlations [29, 30].
Another line of work introduces an adversarial graph auto-encoder framework for graph data. The framework translates the topological structure and node content of a graph into a compressed representation, on which a decoder is trained to reconstruct the graph structure [31, 32]. Graphs are important tools for recognizing and building complex relations between data. In graph applications such as protein-protein interaction networks, citation networks, and social media networks, analyzing graph data is essential for data mining operations including node classification, link prediction, and node clustering [32, 33]. However, low parallelizability, high computational complexity, and the inapplicability of standard machine-learning (ML) techniques to graph data make graph analytics difficult. Algorithms based on matrix factorization, for example HOPE, GraRep, and M-NMF [15, 29], pre-process the graph structure into an adjacency matrix and obtain the embedding by decomposing that matrix. Recently it has been shown that several probabilistic algorithms are equivalent to matrix-factorization methods. Deep-learning methods, specifically auto-encoder-based techniques, are also widely researched for graph embedding; the MGAE algorithm uses a marginalized single-layer auto-encoder to learn a clustering representation [34, 35].
A graph-regularized auto-encoder was proposed by Yu et al., aiming to use the graph to guide the encoding and decoding operations. However, it is still hard to learn in the presence of many task-irrelevant patterns in the data, and existing auto-encoder variants still do not split the hidden units into two separate parts, one task-relevant and the other task-irrelevant.

PROPOSED WORK
In this section we explain the aim of our algorithm and how it is performed, followed by the combined model of selecting the relevant data and applying an auto-encoder to the selected hidden-layer data to compress it. We then study the two algorithms on which our proposed algorithm is based, and explain our architecture in detail. The main concept of an auto-encoder is to map the data through a non-linear encoder into the hidden layer, whose units are then used as the new data representation:

h = σ(Wx + b),  x̂ = σ(W'h + b')

Here h ∈ R^d' is the arrangement of the hidden units and x̂ ∈ R^d is the reconstruction of the given input x ∈ R^d. The parameter set consists of the weight matrices W ∈ R^(d'×d) and W' ∈ R^(d×d') and the offset vectors b ∈ R^d' and b' ∈ R^d, whose sizes are d' and d respectively; σ is a non-linear activation function. An auto-encoder is simply a neural network with a single hidden layer whose input and target data coincide:

min over W, W', b, b' of (1/n) Σ_{i=1}^{n} ‖x̂_i − x_i‖²

Here n is the number of data samples, x̂_i is the result of rebuilding x_i, and x_i is the target data to which the rebuilt data is compared. A good arrangement of the hidden-layer data is gained by training the network to rebuild the data.
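As an illustration, the one-hidden-layer auto-encoder just described can be sketched in numpy. This is a minimal sketch, not the authors' implementation; the sigmoid activation, initialization scale, and dimensions are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class AutoEncoder:
    """One-hidden-layer auto-encoder: h = sigmoid(Wx + b), x_hat = sigmoid(W'h + b')."""

    def __init__(self, d_in, d_hidden, seed=0):
        rng = np.random.default_rng(seed)
        self.W  = rng.normal(0, 0.1, (d_hidden, d_in))   # encoder weights W
        self.b  = np.zeros(d_hidden)                     # encoder offset b
        self.Wp = rng.normal(0, 0.1, (d_in, d_hidden))   # decoder weights W'
        self.bp = np.zeros(d_in)                         # decoder offset b'

    def encode(self, x):
        """Map the input into the hidden representation h."""
        return sigmoid(self.W @ x + self.b)

    def decode(self, h):
        """Rebuild x_hat from the hidden representation."""
        return sigmoid(self.Wp @ h + self.bp)

    def reconstruct(self, x):
        return self.decode(self.encode(x))
```

The hidden dimension d' < d is what yields compression: transmitting h instead of x sends d' values per sample.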
As noted above, the high-level hidden units all help extract essential information from the input when rebuilding the data or arranging the selected features. However, their importance is not the same as their importance to our classification task. For instance, some units encode the background of an object image, which is not required; such data is irrelevant, or less important, for the newly trained features. Meanwhile, the older unsupervised method has only a limited capacity to marginally reconstruct the input distribution for supervised aims. Some existing works impose label information on the hidden units via a soft-max layer (Socher et al., 2011). Taking into account the recent treatment of task-irrelevant units, it is likewise not appropriate to declare all the hidden units useless.
So we take two steps: 1) select the important data out of the irrelevant data with a feature-selection approach, and 2) combine and transmit only the selected important data through the auto-encoder. With the above study, we implement our combined selection of important data and compression of the selected data using an auto-encoder in one integrated structure.

Bio-signals compression using auto-encoder (BCAE)
In this part, we present our algorithm by combining feature selection (to select the important data) and an auto-encoder (to compress it) within a single algorithm. Explicitly, we apply the feature-selection approach to the hidden-layer units. Assume P ∈ R^(d×n) is the set of training elements, where d is the size of the visual descriptor and n is the number of data samples.
We write h(x) = σ(Wx + b) for the hidden representation and x̂ = σ(W'h(x) + b') for the reconstruction, where x and x̂ are the per-iteration column reproductions of P and P̂ respectively. The standardized form of important-data selection is written C(Q, h(P)), where Q is the trained feature-selection matrix acting on the hidden units h(P). Explicitly, the i-th column vector of Q, q_i ∈ {0, 1}^m, is an indicator vector choosing the i-th unit of the selected feature subset, where m denotes the number of hidden units. The feature-selection method can then be stated as: given the real feature set h(P), look for the matrix Q that chooses the new features, collected into the set Y = Q'h(P), and optimizes the relevant criterion C(Q, h(P)). In general, selection of the necessary data can take place in three ways: unsupervised, supervised, and semi-supervised. Supervised methods often admit irrelevant information while selecting the necessary features; an example criterion is the Fisher score (Han, Li, and Gu, 2012). For unsupervised selection, Niyogi, He, and Cai introduced the Laplacian score in 2005; unsupervised methods assemble only the relevant and needed data while selecting the main features. To handle various real-world cases, we simplify C(Q, h(P)) in the simplest way: we introduce a general regularizer for feature selection, which offers a basic method of adapting to various cases by choosing its components in various ways. Our proposed system follows the supervised Fisher-score approach in order to collect the information in a supervised manner.
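The selection matrix Q described above, whose columns are 0/1 indicator vectors picking hidden units, can be sketched as follows. This is an illustrative helper under the assumption that higher scores mean more relevant units; the function name is ours, not the paper's.

```python
import numpy as np

def selection_matrix(scores, q):
    """Build Q in {0,1}^(m x q): the j-th column is the indicator vector of
    the j-th best-scoring hidden unit, so Y = Q.T @ H keeps only the q most
    relevant of the m hidden units."""
    idx = np.argsort(scores)[::-1][:q]   # indices of the top-q scores, descending
    m = len(scores)
    Q = np.zeros((m, q))
    Q[idx, np.arange(q)] = 1.0           # one 1 per column, at the chosen unit
    return Q
```

Applying `Q.T` to a hidden-activation matrix `H` (units in rows) then yields the reduced representation that is actually transmitted.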
For the Fisher score, two undirected graphs G_b and G_w are built from the provided data (here we use the original input data P so that the geometrical structure is preserved while selecting features), which respectively capture the between-class and within-class affinity relationships (Yan et al., 2007). In the same manner, two weight matrices S_b and S_w are generated to characterize the two graphs. Thus we obtain the Laplacian matrices L = D − S, where D is the diagonal degree matrix of S, and likewise L_b and L_w. The feature-selection matrix Q that generates the feature subset with the best criterion score is obtained by solving the trace-ratio optimization problem

max over Q of tr(Q'h(P)L_b h(P)'Q) / tr(Q'h(P)L_w h(P)'Q)

Unfortunately, this trace-ratio problem cannot be solved directly, because it has no closed-form solution. Therefore, instead of dealing with the trace-ratio problem straightforwardly, many works convert it to an equivalent trace-difference problem to obtain a globally optimal solution (Nie et al., 2008). Let λ denote the subset-level score C(Q, h(P)) in (5). We can then express the objective as a function of λ, holding the other quantities constant:

f(λ) = max over Q of tr(Q'(A − λB)Q)

where A = h(P)L_b h(P)' and B = h(P)L_w h(P)'. Searching for the global optimum is thereby transformed into finding the root of f(λ) = 0, which is the trace-difference problem; note that f(λ) is a monotonic function of λ. By embedding the trace-difference optimization in (8) into the auto-encoder's hidden-layer updating, we formulate our final objective (9) as the auto-encoder reconstruction loss plus the feature-selection term, balanced by a parameter γ, where λ* denotes the trace-ratio score obtained after optimizing Q in the current iteration.
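The paper adopts the supervised Fisher-score criterion. A common per-feature form of the Fisher score (ratio of between-class to within-class scatter) can be sketched as below; this is the standard textbook formulation, not necessarily the exact graph-based variant used here, and the function name is ours.

```python
import numpy as np

def fisher_scores(X, y):
    """Per-feature Fisher score for samples X (rows) with labels y:
    between-class scatter divided by within-class scatter, per column."""
    mu = X.mean(axis=0)                      # overall mean of each feature
    num = np.zeros(X.shape[1])               # between-class scatter
    den = np.zeros(X.shape[1])               # within-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        nc = len(Xc)
        num += nc * (Xc.mean(axis=0) - mu) ** 2
        den += nc * Xc.var(axis=0)
    return num / np.maximum(den, 1e-12)      # guard against zero variance
```

Features with higher scores separate the classes better and are the ones the selection matrix Q should keep.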

Optimization
Formula (9) is difficult to solve directly because of the complexity and non-linearity of the encoder and decoder; therefore an alternating scheme is used to update the auto-encoder parameters W, W', b, b' and the feature-selection variables Q and λ in turn. Explicitly, we solve the optimization as two sub-problems: first, learning the feature-selection score, and second, optimizing the regularized auto-encoder.

Learning the feature-selection score
When the auto-encoder's parameters are held constant, we can optimize the feature-selection score λ and the selection matrix Q by the classical trace-ratio method. Explicitly, we use the trace-difference equation: let Q_k be the outcome of the k-th optimization iteration; then λ_{k+1} is evaluated by

λ_{k+1} = tr(Q_k'A Q_k) / tr(Q_k'B Q_k)

Thus we can get f(λ_{k+1}) = max over Q of tr(Q'(A − λ_{k+1}B)Q), where Q_{k+1} may be evaluated efficiently from the score rank of every single feature. The root of f(λ) = 0, and hence the optimal solution of (6), can be obtained by repeating this process. Note that λ is updated as the globally optimal score of the feature-selection criterion and acts as a parameter in the succeeding auto-encoder iteration. Algorithm 1 summarizes the optimization.

Algorithm 1: Optimization of the trace-ratio problem.
Input: trained hidden-layer features h(P), relevant data selected from the given input, and matrices A and B.
Initialize: Q ← I, where I is the identity matrix; compute λ with Equation (11); ε = 10^-9; k = 0; K = 10^3.
While k ≤ K and not converged do
  1. Calculate the score of every single feature in the k-th iteration with Equation (11), setting q_i = [0, …, 0, 1, 0, …, 0]'.
  2. Sort the features by score in increasing order.
  3. Choose the highest-scoring features to renew Q_{k+1} ∈ R^(m×q).
  4. Evaluate λ_{k+1} with Equation (11).
  5. Check the convergence constraint: |λ_{k+1} − λ_k| < ε.
End
Output: Q, the feature-selection matrix, and λ, the globally optimal score.
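Algorithm 1 can be sketched as a short fixed-point loop. This is a minimal sketch in the spirit of the trace-ratio iteration of Nie et al., under the assumption that Q only selects features, so just the diagonals of the scatter matrices A and B matter; names and defaults are ours.

```python
import numpy as np

def trace_ratio(A, B, q, n_iter=1000, eps=1e-9):
    """Iteratively maximize tr(Q'AQ)/tr(Q'BQ) over 0/1 selection matrices Q
    choosing q features, via the trace-difference problem
    f(lam) = max_Q tr(Q'(A - lam*B)Q). Returns the chosen feature indices
    and the converged ratio lam."""
    a, b = np.diag(A), np.diag(B)            # per-feature scatter terms
    lam = 0.0
    idx = np.arange(q)
    for _ in range(n_iter):
        # score every feature at the current lam, keep the top q
        idx = np.argsort(a - lam * b)[::-1][:q]
        new_lam = a[idx].sum() / b[idx].sum()   # updated trace ratio
        if abs(new_lam - lam) < eps:            # convergence constraint
            lam = new_lam
            break
        lam = new_lam
    return idx, lam
```

Each pass ranks the features by their trace-difference score and re-evaluates λ, exactly the loop structure of Algorithm 1.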

Optimization of the regularized auto-encoder
When Q and λ are held constant, we use a stochastic sub-gradient descent technique to obtain W, W', b, b'. The gradients of the objective function in (9) with respect to the decoding parameters are calculated, and then W, W', b, b' are renewed by gradient descent:

W ← W − η ∂L/∂W,  b ← b − η ∂L/∂b,  W' ← W' − η ∂L/∂W',  b' ← b' − η ∂L/∂b'

Here η denotes the learning rate; the offset gradients ∂L/∂b and ∂L/∂b' are the column means of the corresponding error terms at the hidden and output layers respectively.
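One gradient step of this sub-problem can be sketched as below, for the plain reconstruction loss only (the feature-selection regularizer of (9) is omitted for brevity). The sigmoid activation, batch layout, and learning rate are assumptions of this sketch.

```python
import numpy as np

def sgd_step(W, b, Wp, bp, X, eta=0.5):
    """One gradient-descent step on the mean squared reconstruction error of
    a sigmoid auto-encoder. X holds a mini-batch with samples in columns;
    returns the updated parameters (W, b, Wp, bp)."""
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    H = sig(W @ X + b[:, None])            # hidden activations
    Xhat = sig(Wp @ H + bp[:, None])       # reconstruction
    n = X.shape[1]
    # error term at the output layer (dL/d pre-activation)
    d_out = (Xhat - X) * Xhat * (1 - Xhat)
    grad_Wp = d_out @ H.T / n
    grad_bp = d_out.mean(axis=1)           # column mean, as in the text
    # back-propagated error term at the hidden layer
    d_hid = (Wp.T @ d_out) * H * (1 - H)
    grad_W = d_hid @ X.T / n
    grad_b = d_hid.mean(axis=1)
    return (W - eta * grad_W, b - eta * grad_b,
            Wp - eta * grad_Wp, bp - eta * grad_bp)
```

Repeating this step drives the reconstruction loss down; in the full algorithm the gradient would also carry the λ-weighted selection term.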
The two sub-problems above are updated alternately until convergence; Algorithm 2 explains the procedure in detail.

Algorithm 2: Solving the problem in (9).
Input: parameter γ, learning data P, layer size m, number of selected features q < m.
Initialize: K = 50, k = 0, ε = 10^-7.
While k ≤ K and not converged do
  1. Keep the other variables constant and renew Q and λ with the help of Equation (10).
  2. Keep Q and λ constant and renew W, W', b, b' with the help of Equation (13).
  3. Check the convergence constraint: ‖λ_{k+1} − λ_k‖∞ < ε.
End
Output: W, W', b, b', λ. The new feature representation Y = Q'σ(WP + b) can be used as the input to the next stage of our algorithm, forming a stacked framework.

EXPERIMENTAL RESULT AND ANALYSIS
We present quantitative outcomes to estimate the efficiency of modified auto-encoders in biomedical signal compression. The MIMIC-II database [21] is available in two forms. In the first, interested researchers can obtain a flat-file text version of the medical database together with the associated database schema, which permits recreating the database with their technique of choice. In the second, researchers can access the database via password-secured web services; users must familiarize themselves with the database layout in order to program database queries using SQL, and query output can be exported to comma-separated files to be analyzed offline using statistical or other software. Processing and accessing data from MIMIC-II is very complex, so it is highly recommended that studies based on the MIMIC-II database be conducted as collaborative efforts that include statistical, relational-database, and clinical expertise. In the end, we used the PhysioNet MIMIC II database [21], which contains ECG traces. Additionally, we recorded our own ECG traces with the Zephyr BioHarness 3 heart-rate monitor. We present chosen outcomes obtained from a total of 16985 such sequences from 3 different PhysioNet patients, plus 1400 of our own ECG measurements; the test set was picked to cover an important range of situations. Figure 1 shows the actual signal and the reconstructed ECG waveform for trace 100. As the graph shows, the actual signal was reconstructed successfully by the proposed compression model; the modified auto-encoder (MAE) reconstructed all R peaks of the actual signal, with some amplitude difference.
17 fragments of trace 100 were used in the test phase. Figure 2 shows the actual signal and the reconstructed ECG waveform for trace 100; as the graph shows, the actual signal was reconstructed successfully at the receiver by the proposed compression model. The MAE reconstructed the actual signal with little difference, and as the number of hidden nodes increases, reconstruction accuracy also increases. Figure 3 shows the actual signal and the reconstructed ECG waveform for trace 100; the reconstructed signal visibly approaches the actual signal, and with more hidden nodes the reconstruction accuracy improves compared with 2 and 4 hidden nodes. Figure 4 shows the improvement in the reconstructed ECG signal at the receiver relative to the actual signal, using 24 hidden nodes for training and reconstruction; the reconstructed signal has negligible amplitude difference with respect to the original. Figure 5 shows the actual signal and the reconstructed ECG waveform for trace 100; the actual signal was successfully reconstructed by the proposed compression model with 48 hidden nodes, with zero amplitude difference. Figure 6 plots root-mean-square error against energy consumption (joules per bit) for our proposed model and SURF [24]; our modified auto-encoder achieves lower RMSE than the SURF [24] approach. Figure 7 plots RMSE (%) against compression efficiency, where our model performs significantly better than the SURF [24] approach. Figure 8 plots RMSE against various numbers of hidden nodes, where our model again performs significantly better than the SURF [24] approach.
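The comparisons above use RMSE, compression ratio (CR), and PRD. Minimal implementations of these standard metrics might look like the sketch below; function names are ours.

```python
import numpy as np

def rmse(x, x_rec):
    """Root-mean-square error between original and reconstructed signal."""
    return np.sqrt(np.mean((x - x_rec) ** 2))

def prd(x, x_rec):
    """Percentage RMS difference, the usual ECG-compression distortion measure."""
    return 100.0 * np.sqrt(np.sum((x - x_rec) ** 2) / np.sum(x ** 2))

def compression_ratio(n_original, n_compressed):
    """CR = original size / compressed size, e.g. 2000 samples reduced to a
    62-dimensional code gives CR of about 32.26."""
    return n_original / n_compressed
```

Lower RMSE/PRD at a given CR means the compressor preserves the waveform better, which is the trade-off Figures 6-8 examine.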

CONCLUSION
In this research work, we have presented the bio-signal compression using auto-encoders (BCAE) algorithm for IoT wearable fitness monitors. We proposed an ECG-based biometric system in which lower-dimensional non-linear representations of the heartbeat templates are learned through a deep modified auto-encoder. This study also presents an efficient ECG signal compression method for the biomedical field, i.e., e-health applications, Holter systems, and telemetry. The proposed compression method integrates the MAE architecture, which has recently become very popular in the machine-learning (ML) field; it is therefore possible to extract information about the actual input signal in the deep layers, from lower-level to higher-level features. In this study a deep MAE structure is proposed: 2000×1 ECG signals were reconstructed successfully from a representation of only 62×1 dimensions. A comprehensive evaluation was performed on 4800 fragments of 2000 samples from the 48 patients in the MIT-BIH arrhythmia dataset. With the proposed model, significant compression outcomes were obtained, with a CR of 32.25 and an average PRD of 2.73%. The proposed methodology also has a structure that can be used to transfer biomedical information to remote locations securely.