Efficient systematic turbo polar decoding based on optimized scaling factor and early termination mechanism

In this paper, an efficient early termination (ET) mechanism for the systematic turbo-polar code (STPC), based on an optimal estimation of the scaling factor (SF), is proposed. The gradient of the regression line that best fits the distance between the a priori and extrinsic information is used to estimate the SF. Multiplying the extrinsic information by the proposed SF is effective in resolving the correlation between the intrinsic and extrinsic reliability information exchanged between the two parallel concatenated soft-cancellation (SCAN) decoders. It is shown that the SF improves the conventional STPC by about 0.3 dB with an interleaver length of 64 bits, and by about 1 dB over the systematic polar code (SPC), at a bit error rate (BER) of $10^{-5}$. A new stopping criterion is proposed, based mainly on the estimated value of the SF at the second component decoder and on the decoded frozen bits at each decoding iteration. It is shown that the proposed ET halves the average number of iterations (ANI) without adding considerable complexity. Moreover, the modified codes present BER results comparable to those of codes that use a fixed number of iterations.


INTRODUCTION
Recently, polar codes (PCs) [1] have attracted a lot of attention in the source and channel coding research field [2], mainly due to their ability to achieve channel capacity over the binary discrete memoryless channel with low encoding and decoding complexity. However, acceptable performance of a PC can only be realized with a large code length [1], [3]-[11]. The main goal of this study is to enhance the bit error rate (BER) performance of PCs with finite code length. The degradation in BER performance of short-length PCs is mainly due to error propagation in successive cancellation (SC) decoding [1] and the low minimum distance of the PC [12].
Turbo decoding [13] is one of the well-known techniques used to treat the performance deficiency caused by finite code length. A scheme similar to [13], comprising a parallel concatenation of two systematic polar codes (SPCs) [14] and called the systematic turbo-polar code (STPC), was proposed in [12], [15]-[17]. These studies demonstrated enhanced BER performance compared to the original SPC. In addition to SC, other decoding algorithms have been suggested for PC and SPC, including successive cancellation list (SCL) [5], [8], successive cancellation stack (SCS) [18], soft successive cancellation list (SSCL) [17], belief propagation (BP) [19], and soft cancellation (SCAN) [9]. However, because the others produce hard-decision outputs, only the soft-in-soft-out (SISO) algorithms SSCL, BP, and SCAN can be applied to the iterative STPC decoding structure. Unfortunately, the unreliable a-posteriori information and the correlation between the intrinsic and extrinsic sequences exchanged between the two constituent decoders degrade the BER performance [15], [16], [20], [21]. One of the most efficient remedies is to multiply the extrinsic information by an SF to reduce its optimistic values. SF optimization has been discussed in the literature for other iterative codes such as turbo codes (TCs) and low-density parity-check (LDPC) codes [22]-[25]. Several previous works have discussed SF optimization for the STPC. In [16], a concatenation of a recursive systematic convolutional (RSC) code and an SPC is proposed; the BP decoder is used as the SISO decoding algorithm for the SPC and the BCJR algorithm for the RSC code. The SF takes values between 0 and 1, set small at the beginning and increased with the number of iterations. As a stopping criterion, the match between the transmitted and decoded frozen bits is tested at each iteration.
The minimum weighted mean square error criterion is proposed in [20], [21] to optimize the SFs of STPCs that employ SCAN/BP component decoders. Genie-aided decoders are used as an ideal reference for practical decoding. The SFs are estimated offline for each half iteration and for a certain range of signal-to-noise ratios, which requires accurate channel estimation, a task that is not always tractable. The major contributions of this paper are summarized as follows: i) two SFs, one for each constituent SCAN decoder in the iterative STPC scheme, are proposed; and ii) an efficient early termination mechanism is introduced based on the values of the estimated SFs.
The paper is organized as follows: section 2 reviews the encoding and iterative decoding of STPC. The algorithm of optimized SF for the STPC is proposed in section 3. In section 4, two schemes of ET mechanisms are introduced. The numerical results of fixed and early terminated iterative turbo-polar decoding are presented in section 5. Section 6 concludes this paper.

PRELIMINARIES

Systematic turbo-polar encoding
In this work, the STPC consists of a parallel concatenation of two systematic polar encoders [15] joined by an interleaver. The encoding structure of the STPC is shown in Figure 1. Let $u = [u_1, u_2, \ldots, u_K]$ be the information sequence of length $K$, where $u_i \in \{0,1\}$. Let $N$ denote the code length of the component polar code, let the subset $\bar{\mathcal{F}} \subset \{1, \ldots, N\}$ be the $K$-element set of free (information) bit indexes, and let $\mathcal{F} = \{1, \ldots, N\} \setminus \bar{\mathcal{F}}$ be the frozen set. The selection of the information set $\bar{\mathcal{F}}$ is based on the Bhattacharyya bound approximation [1]. As with any linear code, the codeword produced by encoder $e$ ($e = 1, 2$) is a combination of linearly independent bases forming the rows of the generator matrix $G_N$:

$x_e = u_e G_N \qquad (1)$
Here $B_N$ is the bit-reversal permutation matrix, $F_2 = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}$, $F_2^{\otimes n}$ denotes the $n$-th Kronecker power, $N = 2^n$, and $G_N = B_N F_2^{\otimes n}$. For the first component encoder, the vector $u_1$ can be split into free bits ($u_{1,\bar{\mathcal{F}}}$) and frozen bits ($u_{1,\mathcal{F}}$). The coded vector $x_1$ consists of systematic bits ($x_{1,\mathcal{B}}$) and parity bits ($x_{1,\bar{\mathcal{B}}}$), where $\mathcal{B}$ is the set of indexes corresponding to the systematic bits. Usually, $\mathcal{B}$ and $\bar{\mathcal{B}}$ are identical to $\bar{\mathcal{F}}$ and $\mathcal{F}$, respectively. Accordingly, (1) can be rewritten as:

$x_{1,\mathcal{B}} = u_{1,\bar{\mathcal{F}}} G_{\bar{\mathcal{F}}\mathcal{B}} + u_{1,\mathcal{F}} G_{\mathcal{F}\mathcal{B}} \qquad (2)$

$x_{1,\bar{\mathcal{B}}} = u_{1,\bar{\mathcal{F}}} G_{\bar{\mathcal{F}}\bar{\mathcal{B}}} + u_{1,\mathcal{F}} G_{\mathcal{F}\bar{\mathcal{B}}} \qquad (3)$

where $G_{\bar{\mathcal{F}}\mathcal{B}}$ refers to the submatrix of $G_N$ with row indexes in $\bar{\mathcal{F}}$ and column indexes in $\mathcal{B}$; the other submatrices are defined similarly. For a systematic code, $x_{1,\mathcal{B}} = u$ and $u_{1,\mathcal{F}}$ is set to the zero vector. Using this assumption in (2) and (3), the parity bits can be calculated as:

$x_{1,\bar{\mathcal{B}}} = u\, (G_{\bar{\mathcal{F}}\mathcal{B}})^{-1} G_{\bar{\mathcal{F}}\bar{\mathcal{B}}} \qquad (4)$
It is easy to prove that $(G_{\bar{\mathcal{F}}\bar{\mathcal{F}}})^{-1} = G_{\bar{\mathcal{F}}\bar{\mathcal{F}}}$. The complete codeword of the SPC is given by $x_1 = \{x_{1,\mathcal{B}}, x_{1,\bar{\mathcal{B}}}\}$. The input sequence of the second encoder is the interleaved version of the information vector, denoted by $\pi(u)$. The overall codeword of the STPC at the multiplexer output is $c = \{x_{1,\mathcal{B}}, x_{1,\bar{\mathcal{B}}}, x_{2,\bar{\mathcal{B}}}\}$. The total rate of the concatenated code is $R = K/N_c$, where the codeword length is $N_c = 2N - K$. The modulated symbols corresponding to the coded bits are $s = 1 - 2c$, where $s \in \{-1, +1\}$, assuming binary phase-shift keying modulation. The noisy received sequence at the channel output is given by

$y = s + w \qquad (5)$

where $w$ is i.i.d. Gaussian noise with zero mean and variance $\sigma^2 = N_0/2$. Figure 2 depicts the iterative decoding structure of the STPC [15]. The principle of decoding is similar to conventional parallel concatenated convolutional decoding. The decoding procedure follows the steps below.
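The systematic encoding of (1)-(4) can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' code: the function names and the $N = 8$ free set are chosen here for the example, and the bit-reversal permutation $B_N$ is omitted since it does not affect the systematic property.

```python
import numpy as np

def polar_generator(n):
    """G_N as the n-th Kronecker power of F_2 (bit-reversal B_N omitted)."""
    F2 = np.array([[1, 0], [1, 1]], dtype=np.uint8)
    G = F2
    for _ in range(n - 1):
        G = np.kron(G, F2)
    return G

def systematic_encode(u, free_set, G):
    """Systematic polar encoding: place u on the systematic positions
    and solve for the parity bits over GF(2), as in Eq. (4)."""
    N = G.shape[0]
    A = sorted(free_set)                      # free / systematic indexes
    B = [i for i in range(N) if i not in A]   # frozen / parity indexes
    G_AA = G[np.ix_(A, A)]
    # For polar information sets G_AA is self-inverse: (G_AA)^{-1} = G_AA
    assert np.array_equal((G_AA @ G_AA) % 2, np.eye(len(A), dtype=np.uint8))
    u_A = (u @ G_AA) % 2                      # u-domain vector (u_F = 0)
    parity = (u_A @ G[np.ix_(A, B)]) % 2      # Eq. (4)
    x = np.zeros(N, dtype=np.uint8)
    x[A], x[B] = u, parity
    return x

# Example: N = 8, K = 4, with a typical Bhattacharyya-style free set
G = polar_generator(3)
x = systematic_encode(np.array([1, 0, 1, 1], dtype=np.uint8), [3, 5, 6, 7], G)
# x carries the information bits unchanged on positions {3, 5, 6, 7}
```

The self-inverse property asserted in the code is the same one used above to obtain (4) from (2) and (3).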

Systematic turbo-polar decoding
- Step 1: The de-multiplexer splits the received sequence $y$ into three parts: the received message sequence $y_s$, and the received parity-check sequences $y_{1,p}$ and $y_{2,p}$ corresponding to the SPC1 and SPC2 encoders, respectively.
- Step 2: At each iteration $t$, the first decoder SCAN-1 receives the sequences $y_s$, $y_{1,p}$, and the a priori information $\Lambda^a_{1,t}$ provided by the SCAN-2 decoder, and produces the a-posteriori sequence $\Lambda_{1,t}$. The information that is redundant with respect to SCAN-2, namely $y_s$ and $\Lambda^a_{1,t}$, is removed from $\Lambda_{1,t}$ to produce the extrinsic information $\mathcal{E}_{1,t}$, which is in turn provided to the second decoder SCAN-2:

$\mathcal{E}_{1,t} = \Lambda_{1,t} - y_s - \Lambda^a_{1,t} \qquad (6)$
- Step 3: The second decoder SCAN-2 produces its log-likelihood ratios (LLRs) $\Lambda_{2,t}$ for the interleaved information sequence with the aid of $y_s$, $y_{2,p}$, and the a priori information $\Lambda^a_{2,t}$. The extrinsic sequence $\mathcal{E}_{2,t}$ produced by SCAN-2 is given by an equation similar to (6). Finally, the iteration is completed by the estimation of $\Lambda^a_{1,t+1}$:

$\Lambda^a_{1,t+1} = \pi^{-1}(\mathcal{E}_{2,t}) \qquad (8)$
where $\pi^{-1}$ represents the de-interleaving mapping. The evaluation of the SFs $\alpha_{1,t}$ and $\alpha_{2,t}$ is discussed in the next section. Steps 2 and 3 are repeated iteratively until a maximum number of iterations $T_{max}$ is reached. To avoid excessive decoding iterations, a stopping mechanism can be added to terminate the iterations before reaching $T_{max}$. In section 4, a new stopping criterion based on the value of the estimated SF is proposed.
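The extrinsic-information exchange of Steps 1-3 can be sketched as follows. This is a structural sketch only: the two SCAN component decoders are replaced by placeholder callables (`scan1`, `scan2`), since a full SCAN implementation is beyond the scope of this illustration, and the variable names are not from the paper.

```python
import numpy as np

def turbo_iterations(y_s, y1p, y2p, scan1, scan2, perm, T_max=6):
    """Structural sketch of the iterative STPC decoding loop: extrinsic
    exchange between two component decoders joined by interleaver `perm`."""
    inv_perm = np.argsort(perm)            # de-interleaver pi^{-1}
    La1 = np.zeros_like(y_s)               # a priori of SCAN-1, initially zero
    for t in range(T_max):
        L1 = scan1(y_s, y1p, La1)          # a-posteriori LLRs of SCAN-1
        E1 = L1 - y_s - La1                # extrinsic information, Eq. (6)
        La2 = E1[perm]                     # interleave -> a priori of SCAN-2
        L2 = scan2(y_s[perm], y2p, La2)    # a-posteriori LLRs of SCAN-2
        E2 = L2 - y_s[perm] - La2          # extrinsic of SCAN-2
        La1 = E2[inv_perm]                 # de-interleave, Eq. (8)
    return L1

# Demo with a trivial stand-in decoder (a-posteriori = intrinsic + a priori
# + a constant "correction"); real SCAN decoders would go here.
demo = lambda ys, yp, la: ys + la + 0.1
y = np.array([1.0, -2.0, 3.0, -4.0])
out = turbo_iterations(y, None, None, demo, demo, np.array([1, 0, 3, 2]))
```

A stopping criterion such as the one of section 4 would simply break out of the loop before `T_max` is reached.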

PROPOSED SCALING FACTOR
In this work, two scaling factors, $\alpha_1$ and $\alpha_2$, are used to scale the extrinsic information before passing it to the other constituent decoder. These factors reduce the correlation between the intrinsic and extrinsic information exchanged between the two constituent SCAN decoders. This technique has proved effective in improving the performance of convolutional turbo codes [23], [24], [27] as well as LDPC codes [22], [25], and the performance of the STPC, in terms of the bit error rate (BER) and the average number of iterations (ANI), is expected to be enhanced in the same way. The two adaptive SFs are computed online for each iteration based on the correlation coefficient (CC) between the extrinsic and a priori information. The gradient of the regression line ($d_{e,t}$) that best fits the distance between $\Lambda^a_{e,t}$ and $\mathcal{E}_{e,t}$ for decoder $e$ ($e \in \{1,2\}$) is a good measure of the correlation tendency and is given by [27]:

$d_{e,t} = \frac{\sum_i (\Lambda^a_{e,t,i} - \bar{\Lambda}^a_{e,t})(\mathcal{E}_{e,t,i} - \bar{\mathcal{E}}_{e,t})}{\sum_i (\Lambda^a_{e,t,i} - \bar{\Lambda}^a_{e,t})^2} \qquad (10)$

where $\bar{\mathcal{E}}_{e,t}$ and $\bar{\Lambda}^a_{e,t}$ are the average values of $\mathcal{E}_{e,t}$ and $\Lambda^a_{e,t}$, respectively. To drive the correlation between $\Lambda^a_{e,t}$ and $\mathcal{E}_{e,t}$ towards zero, the linear transformation method is applied [27]. The modified LLR values are given by

$\mathcal{E}'_{e,t} = \mathcal{E}_{e,t} - \mathcal{R}\, d_{e,t}\, \Lambda^a_{e,t} \qquad (11)$

where $\mathcal{R}$ is a reduction factor, set to 0.67, used to alleviate the over-prediction of the estimated value of $d_{e,t}$. The value of $\mathcal{R}$ was determined through intensive simulation tests and was shown to give a good compromise between BER and ANI over the tested range of $E_b/N_0$ and code lengths. The a-posteriori LLR values of the first constituent decoder for the $t$-th iteration become:

$\Lambda_{1,t} = y_s + \Lambda^a_{1,t} + \mathcal{E}'_{1,t} \qquad (12)$

and for the second decoder:

$\Lambda_{2,t} = \pi(y_s) + \Lambda^a_{2,t} + \mathcal{E}'_{2,t} \qquad (13)$
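The online SF estimation can be sketched with a minimal numpy version. This sketch assumes the standard least-squares slope as the regression gradient and a subtractive decorrelation update; the function and variable names are illustrative, not from the paper.

```python
import numpy as np

R = 0.67  # reduction factor used in the paper

def regression_gradient(apriori, extrinsic):
    """Slope d of the least-squares regression line of the extrinsic
    LLRs on the a priori LLRs: a measure of their correlation tendency."""
    da = apriori - apriori.mean()
    de = extrinsic - extrinsic.mean()
    return np.sum(da * de) / np.sum(da * da)

def decorrelate(apriori, extrinsic):
    """Linear-transformation update: subtract the component of the
    extrinsic LLRs predicted by the a priori LLRs, damped by R."""
    d = regression_gradient(apriori, extrinsic)
    return extrinsic - R * d * apriori, R * d
```

Because the gradient is linear in the extrinsic values, one application of `decorrelate` with $\mathcal{R} = 1$ would drive the residual gradient exactly to zero; the reduction factor 0.67 deliberately under-corrects to guard against an over-predicted $d_{e,t}$.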

PROPOSED EARLY TERMINATING DECODING
For many decoding sessions, the decoder successfully finishes decoding even before reaching the maximum number of iterations, especially at high $E_b/N_0$. Therefore, an ET technique should be employed to avoid excessive iterations. The values of $\alpha_{2,t}$ are a measure of the correlation between the a priori and extrinsic information, and can therefore serve as a convergence indicator.
In this paper, the offline estimated value of the SF at the last iteration of the second decoder, $\alpha_2^6$, is used as the threshold of the ET mechanism for each $E_b/N_0$. For each received frame, if the estimated value of $\alpha_{2,t}$ at iteration $t$ is less than or equal to $\alpha_2^6$, the iteration process is stopped (15); otherwise, the decoding process continues until $T_{max}$ is reached.
Since the values of the frozen bits ($u_{2,\mathcal{F}}$), transmitted through the bad channels, are fixed (i.e., known to both encoder and decoder), the efficiency of the proposed ET criterion can be further improved by comparing their values with the corresponding decoded bits ($\hat{u}_{2,\mathcal{F}}$) at the second constituent decoder (SCAN-2). Adding this condition to the one in (15) results in an efficient stopping criterion (16).
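The combined stopping rule of (15) and (16) can be expressed compactly. This sketch assumes the threshold $\alpha_2^6$ has been tabulated offline per $E_b/N_0$; the names are illustrative.

```python
import numpy as np

def should_stop(alpha2_t, alpha2_threshold, decoded_frozen, frozen_bits):
    """Stop the turbo iterations when the SF of SCAN-2 has reached the
    offline-tabulated threshold (15) AND every decoded frozen bit matches
    its known transmitted value (16)."""
    sf_converged = alpha2_t <= alpha2_threshold
    frozen_match = np.array_equal(decoded_frozen, frozen_bits)
    return bool(sf_converged and frozen_match)
```

Inside the decoding loop, `should_stop(...)` would be evaluated after each SCAN-2 half iteration and, when true, the loop exits before $T_{max}$.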

RESULTS AND DISCUSSION
Intensive simulations were carried out in MATLAB R2020b to prove the effectiveness of the proposed schemes. The performance of systems utilizing the proposed SF estimation algorithm was compared with that of the schemes adopted in [20], [21]. The specifications of the simulated STPC are given in Table 1.

Simulation of fixed iteration decoding
In this section, all systems are simulated using a fixed number of decoding iterations ($T_{max} = 6$). The usefulness of using the a priori reliability values $\Lambda^a_{e,t}$ instead of the intrinsic values $\mathcal{I}_{e,t}$ in the estimation of the SFs for the STPC was investigated first. Figure 3 compares the BER performance versus $E_b/N_0$ in dB of two STPC schemes: the first applies $\Lambda^a_{e,t}$ and the second applies $\mathcal{I}_{e,t}$ in the estimation of the scaling factors according to (10). The original turbo-polar code (without SF) is also simulated as a benchmark to show how much gain is obtained when the proposed weighted scheme is applied. The a priori-based system outperforms the intrinsic-based one by about 0.05 dB, and presents an improvement of about 0.3 dB relative to the original at a BER of $10^{-5}$. To confirm the benefit of the proposed scheme, the CC between $\mathcal{I}_{e,t}$ and $\mathcal{E}_{e,t}$ for every half iteration and various values of $E_b/N_0$ is computed and depicted in Figure 4. The curves show a reduction in the correlation for the a priori-based system compared with the intrinsic-based one. The benefit of using $\Lambda^a_{e,t}$ instead of $\mathcal{I}_{e,t}$ is not confined to the improvement in BER performance; it also reduces the complexity, processing delay, and hence the consumed power, since the values of $\mathcal{I}_{e,t}$ need not be calculated and stored in a dedicated memory.
To improve the data throughput and reduce complexity, offline estimation of the SFs is usually proposed. Table 2 lists the offline estimated values of the SFs for different values of $E_b/N_0$ and iteration counts, together with the average of the SFs over all iterations for each $E_b/N_0$. Figure 5 presents a BER comparison between the proposed online system (optimal-proposed), the two offline variants, the original system (without SF), and the scheme proposed by [20] (denoted by optimal-liu2018). As can be observed, the proposed online and offline schemes outperform the optimal-liu2018 system by about 0.1 dB at a BER of $10^{-5}$. The optimal-proposed system presents an improvement of about 0.3 dB over the unscaled (original) system, and outperforms the SPC(128,64) by about 1 dB at the same BER.

Simulation of early terminating decoding
Different simulation tests were carried out to show the effect of applying the ET on the performance of STPC systems. Figure 6(a) shows the BER and the ANI for an STPC that has two SPC(128,64) components and utilizes the proposed ET schemes of section 4. The same system with six iterations (without ET) is also simulated as a benchmark. The system that adopts the two conditions in (16) (denoted by BER/ANI-ET-Z) outperforms the one that utilizes the single condition in (15) (BER/ANI-ET) in terms of BER by about 0.1 dB at a BER of $10^{-5}$. The two schemes (ET and ET-Z) present comparable ANI values (3.5 iterations on average) over the tested range of $E_b/N_0$. The variation of the ANI depends on the estimated values of $\alpha_{2,t}$, which in turn depend on the estimated gradient of the regression line given by (10).
The same systems as in Figure 6(a) were re-tested for N = 256 bits; the results are depicted in Figure 6(b). The SPC(256,128) is also simulated for comparison. Again, the system with BER/ANI-ET-Z presents an improvement of about 0.1 dB in BER over the system with BER/ANI-ET at a BER of $10^{-6}$, with approximately identical ANI. The system with ET-Z presents an improvement of about 0.8 dB at a BER of $10^{-6}$ compared with the SPC. As can be seen from Figures 6(a) and 6(b), the systems that apply the ET mechanism reduce the number of iterations by approximately half, which allows higher data throughput and lower power consumption.

COMPLEXITY ANALYSIS
Since the STPC utilizes a pair of SCAN decoders, it consumes roughly twice the number of operations of a single SCAN decoder in every iteration, i.e., about $\mathrm{ANI} \times 2 \times I_s \times N \log_2 N$ operations [34], where $I_s$ is the internal number of iterations made by the SCAN decoder (in this work $I_s = 1$). The additional complexity due to the computation of $d_{e,t}$, the scalar multiplications, and the comparisons, given by (10) to (13) and (16), respectively, should also be considered; in general, it is less significant than the SCAN complexity itself.
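Under this rough count, halving the ANI halves the dominant operation count. The following sketch makes the arithmetic concrete, using N = 128 and the average ANI of about 3.5 iterations reported in section 5 (the function name is illustrative):

```python
import math

def stpc_ops(ani, N, I_s=1):
    """Approximate STPC operation count: ANI x 2 decoders x I_s x N log2 N."""
    return ani * 2 * I_s * N * math.log2(N)

fixed = stpc_ops(6, 128)     # fixed six iterations: 6 * 2 * 128 * 7
early = stpc_ops(3.5, 128)   # with the proposed ET (ANI ~ 3.5)
# early / fixed ~ 0.58, i.e., roughly the ANI ratio 3.5 / 6
```

The per-frame overhead of the SF and ET computations (a length-K regression, a scalar multiply, and a comparison) is O(K) per half iteration and is therefore dominated by the O(N log N) SCAN cost.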

CONCLUSION
In this paper, a modified STPC iterative decoding scheme is designed using two weighted, parallel concatenated SCAN decoders. The SF for addressing the correlation issue is estimated using the gradient of the regression line that best fits the distance between the a priori and extrinsic information. This technique has benefits, in terms of BER, complexity, and processing delay, over the one that estimates the SF using intrinsic rather than a priori information. As the threshold of the newly proposed ET criterion, the offline-determined average value of the SF at the last iteration of the second decoder is employed. To improve the efficiency of the proposed ET scheme, the congruence between the transmitted frozen bits and the corresponding decoded bits is added as a second condition of the criterion. The proposed decoder proved its superiority, in terms of BER and adaptability, over the system adopted in previous work, which suggested a genie-aided SF optimization algorithm. A further reduction in computational complexity and latency is achieved by estimating the SF offline, whereby the proposed scaling reduces to multiplication by a constant value.