Review of high-speed phase accumulator for direct digital frequency synthesizer

ABSTRACT


INTRODUCTION
A phase accumulator (PA) is a digital circuit, built from an adder and a register, that generates digital phase values in the range (0-2π). Figure 1 shows the detailed architecture of the PA circuit. The adder repeatedly accumulates the input control word, referred to below as the frequency control word (FCW). The D flip-flop (DFF) register is the storage element of the PA circuit: on every clock pulse it feeds the previous sum back to the adder, so the phase output takes a sawtooth shape. The phase output increases steadily over the range (0 to 2^N - 1), where N is the number of PA input bits, and wraps around on overflow. The PA output can be pictured as a phase wheel in the direct digital frequency synthesizer (DDFS). It is used to address the phase-to-amplitude converter (PAC) of the DDFS. The PAC, usually implemented as a read-only memory (ROM), is addressed according to the desired phase increment. The digital phase wheel is shown in Figure 2. The phase angle jumps around the wheel (0-360°) at a constant rate. One complete revolution of the phase wheel produces one cycle of a digital sine wave over the range (0-2π). This review covers several approaches and techniques for achieving high PA throughput together with high frequency resolution.
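The adder-plus-register loop described above can be sketched as a small behavioral model. This is only an illustration, not the circuit of Figure 1; the function name `phase_accumulator` and the parameters `fcw`, `n_bits`, and `cycles` are names chosen here for the example.

```python
def phase_accumulator(fcw, n_bits, cycles):
    """Return the PA register value seen at each clock pulse, starting from 0."""
    modulus = 1 << n_bits            # the phase wheel has 2^N points
    phase = 0                        # contents of the DFF register
    samples = []
    for _ in range(cycles):
        samples.append(phase)
        # Adder output is fed back through the register; overflow wraps around.
        phase = (phase + fcw) % modulus
    return samples


ramp = phase_accumulator(fcw=3, n_bits=4, cycles=8)
print(ramp)  # [0, 3, 6, 9, 12, 15, 2, 5]
```

With a 4-bit accumulator and a control word of 3, the output climbs in equal steps and wraps after passing 15, which is the sawtooth behavior described in the text.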

PIPELINED PHASE ACCUMULATOR DESIGN
The PA circuit is the component that provides the digital phase output values, and its speed determines the performance of the DDFS system. Pipelining is a basic technique for increasing the accumulator speed: the main unit is split into a number of sequential sub-units [1]. Pipelined PAs can be classified into two types: those with a single-bit layer, and those with a group-of-bits layer of registers and adders. The block diagram in Figure 3 shows the pipelined phase accumulator with a single-bit layer. The single-bit pipelined phase accumulator is used in DDFS circuits with a small number of PA input bits (<11-bit). For example, in [2][3][4] this technique is applied to produce 9-, 12-, and 8-bit pipelined PAs, respectively. Another pipelining technique, without pre-skewing registers in the pipeline levels, is used in [5] to construct a 12-bit PA. In addition to XOR gates, different types of high-speed registers are used to complement the most significant bit (MSB) values. A new pipelined PA with a 9-bit input is presented in [6]. In this design, the 8-bit phase output is obtained by complementing with the ninth bit, which is the MSB. The clear benefit of pipelining with single-bit adders is higher speed. However, this benefit comes with a larger number of DFF registers, which increases the power consumption. The DDFS frequency resolution depends on the operating frequency and on the number of PA input bits (N).

$$\Delta f = \frac{f_{clk}}{2^{N}} \qquad (1)$$
Equation (1) shows that the DDFS frequency resolution Δf is determined by the operating (clock) frequency f_clk and the number of binary input bits N: the larger the number of input bits, the finer the frequency resolution. For this reason, designers choose accumulators with a large input width (24 to 32 bits) to obtain the desired frequency resolution for the DDFS system.
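As a rough numerical illustration of (1), read here as resolution = clock frequency / 2^N (the standard DDFS relation), the sketch below compares several input widths. The 100 MHz clock is an arbitrary value chosen for the example, and `output_frequency` uses the usual relation f_out = FCW × resolution, which the text does not state explicitly.

```python
def frequency_resolution(f_clk_hz, n_bits):
    """Smallest frequency step of a DDFS with an N-bit phase accumulator."""
    return f_clk_hz / (1 << n_bits)


def output_frequency(fcw, f_clk_hz, n_bits):
    """Output frequency for a given frequency control word (assumed relation)."""
    return fcw * frequency_resolution(f_clk_hz, n_bits)


if __name__ == "__main__":
    f_clk = 100e6  # hypothetical 100 MHz clock, not a value from the paper
    for n in (12, 24, 32):
        print(f"N = {n:2d} bits -> resolution = {frequency_resolution(f_clk, n):.6g} Hz")
```

Each additional input bit halves the frequency step, which is why wide accumulators are preferred when fine resolution is needed.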
As mentioned earlier, pipelining is a suitable technique for designing a fast accumulator. A pipelined PA with a wide frequency control word requires register-adder stages that each handle a group of bits. The circuit diagram of the pipelined PA with multi-bit layers is shown in Figure 4. An example with a wide input is given in [7], where Yong S. Kim implements a 32-bit pipelined PA. That design achieves high speed as well as high resolution. The clock partition approach is applied to reduce the number of DFF registers. This technique is used in [8] to build a pipelined PA with unequal pipeline stages (12-7-7-6 bits). The lowest stage is given the largest width (12-bit) in order to limit the growth of repeated registers in the higher pipeline layers.
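The behavior of a multi-bit-layer pipelined PA can be sketched with a cycle-based model. This is a simplified illustration under stated assumptions, not the circuit of Figure 4 or of [7] or [8]: each stage keeps its own sum register and a registered carry-out, stage k sees its slice of the control word k cycles late (pre-skew), and the stage outputs are realigned by de-skew delays. All function and variable names are chosen here for the example.

```python
def plain_accumulator(fcw, n_bits, cycles):
    """Reference model: one wide adder and one register."""
    mask = (1 << n_bits) - 1
    phase, out = 0, []
    for _ in range(cycles):
        out.append(phase)
        phase = (phase + fcw) & mask
    return out


def pipelined_accumulator(fcw, widths, cycles):
    """Pipelined PA; widths lists the stage widths LSB-first, e.g. (12, 7, 7, 6)."""
    stages = len(widths)
    offsets = [sum(widths[:k]) for k in range(stages)]
    masks = [(1 << w) - 1 for w in widths]
    slices = [(fcw >> offsets[k]) & masks[k] for k in range(stages)]
    sums = [0] * stages       # per-stage sum registers
    carries = [0] * stages    # registered carry-out of each stage
    history = [[] for _ in range(stages)]
    aligned = []
    for t in range(cycles):
        for k in range(stages):
            history[k].append(sums[k])
        if t >= stages - 1:
            word = 0
            for k in range(stages):
                # De-skew: stage k is delayed by (stages - 1 - k) extra cycles.
                word |= history[k][t - (stages - 1 - k)] << offsets[k]
            aligned.append(word)
        next_sums, next_carries = [], []
        for k in range(stages):
            f_in = slices[k] if t >= k else 0        # pre-skewed control word slice
            c_in = carries[k - 1] if k > 0 else 0    # carry registered by the stage below
            total = sums[k] + f_in + c_in
            next_sums.append(total & masks[k])
            next_carries.append(total >> widths[k])
        sums, carries = next_sums, next_carries
    return aligned


if __name__ == "__main__":
    fcw = 0x12345679  # arbitrary 32-bit control word
    piped = pipelined_accumulator(fcw, (12, 7, 7, 6), 50)
    print(piped == plain_accumulator(fcw, 32, 50)[: len(piped)])
```

After a latency of (number of stages - 1) cycles, the realigned output follows the same sequence as the single wide accumulator, while each stage only needs a short adder. Setting every width to 1 gives the single-bit-layer variant.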
The phase jitter technique was applied to the classical 32-bit pipelined PA circuit in [9]. Because a large input width is used to improve the DDFS resolution, the PA output is truncated to the number of bits required to address the ROM phase angle. The suggested phase jitter injection is used to reduce the truncation error at the DDFS output.
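The truncation step, and one common way of injecting jitter before it, can be sketched as follows. This is not necessarily the scheme used in [9]: the sketch assumes a uniform random value, as wide as the discarded bits, added to the phase before truncation. The names `truncate_phase` and `dithered_address` are hypothetical.

```python
import random


def truncate_phase(phase, n_bits, p_bits):
    """Keep the top p_bits of an n_bits phase word as the ROM address."""
    return phase >> (n_bits - p_bits)


def dithered_address(phase, n_bits, p_bits, rng):
    """Add random jitter covering the discarded bits, then truncate (assumed scheme)."""
    dropped = n_bits - p_bits
    dither = rng.randrange(1 << dropped)
    return ((phase + dither) & ((1 << n_bits) - 1)) >> dropped


if __name__ == "__main__":
    rng = random.Random(0)
    phase = 0x12345678  # arbitrary 32-bit phase value
    print(truncate_phase(phase, 32, 12), dithered_address(phase, 32, 12, rng))
```

With this kind of jitter the address is either the truncated value or the next one up, so the truncation error no longer repeats with the same pattern on every pass around the phase wheel.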
A new technique is used in [10,11] to design 24- and 32-bit pipelined PAs, respectively, with fewer DFF registers. A comparator block consisting of OR and XOR gates detects a change in the FCW and enables the output signal to the loading generator. The circuit diagram of the proposed technique is shown in Figure 5. The figure shows how the first column, LD0, is activated by the loading signal, after which the other loading signals (LD1-LD6) are activated sequentially. Applying this technique to the PA circuit can improve speed and resolution with a minimum number of DFF registers.
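The comparator and the sequential loading signals can be illustrated with a short behavioral sketch. It assumes that the comparator XORs the old and new FCW bit by bit and ORs the results, and that the loading signals are one-hot pulses issued on successive clocks; the text does not state whether each LD signal is a pulse or stays high, so that part is an assumption. The function names are hypothetical.

```python
def fcw_changed(old_fcw, new_fcw):
    """XOR each bit pair, then OR the results: 1 if any bit differs."""
    diff = old_fcw ^ new_fcw      # per-bit XOR
    return int(diff != 0)         # OR-reduction of the XOR outputs


def load_sequence(num_columns):
    """Yield one LD vector per clock: LD0 first, then LD1, LD2, ... (assumed one-hot)."""
    for active in range(num_columns):
        yield [1 if i == active else 0 for i in range(num_columns)]


if __name__ == "__main__":
    if fcw_changed(0x000123, 0x000124):
        for clock, ld in enumerate(load_sequence(7)):   # LD0..LD6
            print(clock, ld)
```

Because each column is loaded one clock after the column below it, the new control word reaches the pipeline stages with the required skew without a separate bank of pre-skewing registers.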

PHASE ACCUMULATOR DESIGN BASED ON FAST ADDERS
It is well known that the adder plays a major role in the PA speed. The conventional ripple carry adder (RCA) is the most widely used adder in DDFS systems. For example, a single-bit adder is used in [12] to implement a 24-bit DDFS system with a minimum number of DFF registers. The carry look-ahead adder (CLA) is a widely used fast adder. It is used in [13] with groups of 3-bit adders in two levels to design a 9-bit PA. Small CLA group blocks are used to avoid circuit complexity. C. Ekroot and S. Long used a 4-bit CLA circuit in [14] to design a 16-bit PA and enhance the DDFS performance.
Prefix adders are known to be fast and are used for binary addition in PA designs [15]. The prefix concept is as follows: the carries for several bits are computed at the same time rather than one after another. The first level of the prefix adder circuit consists of AND and XOR gates, which implement the generate (g) and propagate (p) functions, respectively. Figure 6 shows the circuit diagram of the prefix adder. A number of adders based on the prefix concept have been used for binary addition in phase accumulator designs, for example the Kogge-Stone (KS) [16], Sklansky (SK) [17], Brent-Kung (BK) [18], and Beaumont-Smith (BS) [19] adders.
Figure 6. The prefix adder circuit
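The prefix concept can be sketched in software using the Kogge-Stone arrangement, written here with whole-word bit operations so that each loop iteration corresponds to one prefix level. This is an illustrative model only, not the circuit of Figure 6; the function name and parameters are chosen for the example.

```python
def kogge_stone_add(x, y, n_bits, c_in=0):
    """Add two n_bits numbers with a Kogge-Stone style prefix network.

    Returns (sum, carry_out).
    """
    mask = (1 << n_bits) - 1
    p = (x ^ y) & mask            # propagate: XOR of the input bits
    g = (x & y) & mask            # generate: AND of the input bits
    g |= p & c_in & 1             # fold the carry-in into bit 0
    big_g, big_p = g, p
    distance = 1
    while distance < n_bits:
        # One prefix level: combine each group with the group `distance` bits below.
        big_g = big_g | (big_p & (big_g << distance) & mask)
        big_p = big_p & (big_p << distance) & mask
        distance <<= 1
    carries = ((big_g << 1) | (c_in & 1)) & mask   # carry into each bit position
    total = p ^ carries
    carry_out = (big_g >> (n_bits - 1)) & 1
    return total, carry_out


if __name__ == "__main__":
    print(kogge_stone_add(200, 100, 8))  # (44, 1)
```

For an 8-bit adder the loop runs three times (distances 1, 2, 4), compared with eight sequential carry steps in a ripple carry adder, which is the source of the speed advantage.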

Modified Brent-Kung (BK) adder
The Brent-Kung adder is available as ready-made 8-, 12-, or 24-bit blocks, which designers can use as fast adders. A pipelined PA, however, requires an adder whose carry-in and carry-out pass sequentially through the accumulator stages. For this reason, the first bit of the adder is modified to make it compatible with the pipelined design.
R. Zimmermann explained the prefix addition algorithm in [20]. Based on that explanation and on [21], the BK adder in this review is modified for use in the pipelined PA circuit, as in [22]. Figure 7 shows the circuit diagram of the modified BK adder. The modification removes the g0 logic gate and inserts a 2:1 multiplexer [23]. P0 is used as the select signal for the multiplexer inputs X0 and Cin.
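One way to see why a multiplexer can take the place of the g0 gate is to compare the two forms of the first-bit carry exhaustively. The sketch assumes that the multiplexer passes Cin when P0 = 1 and X0 when P0 = 0; the text names the inputs and the selector but not the polarity, so this is an assumption. Function names are hypothetical.

```python
from itertools import product


def first_bit_carry_standard(x0, y0, c_in):
    """Usual first-bit carry: g0 OR (p0 AND Cin)."""
    g0 = x0 & y0
    p0 = x0 ^ y0
    return g0 | (p0 & c_in)


def first_bit_carry_mux(x0, y0, c_in):
    """2:1 multiplexer version: p0 selects Cin, otherwise X0 (assumed polarity)."""
    p0 = x0 ^ y0
    return c_in if p0 else x0


if __name__ == "__main__":
    for x0, y0, c_in in product((0, 1), repeat=3):
        print(x0, y0, c_in, first_bit_carry_standard(x0, y0, c_in), first_bit_carry_mux(x0, y0, c_in))
```

When P0 = 0 the two input bits are equal, so g0 equals X0 and the multiplexer output matches the standard carry; when P0 = 1 the carry is simply Cin. The first bit therefore accepts a carry-in without a separate generate gate.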

ARCHITECTURE OF PIPELINED PHASE ACCUMULATOR WITH DIFFERENT ADDER
Pipelined phase accumulators were designed based on the RCA, the modified BK adder, the CLA, and parallel-prefix adders such as the SK, KS, and BS adders. The adders were compared for multiple input control word widths (12-, 18-, and 32-bit) to select the fastest one. The designed PA circuits were coded in Verilog hardware description language (HDL), elaborated, synthesized, and verified on an FPGA kit board (Cyclone III). The comparison of maximum operating frequency in Table 1 shows that the KS adder is the fastest for small word widths and the modified BK adder is the fastest for large word widths.

PIPELINED PA DESIGN WITH THE GATED CLOCKING TECHNIQUE
A fast PA circuit with low power consumption and small area demands techniques that improve throughput while decreasing the number of logic cells. The speed side of this trade-off is addressed by pipelining. The disadvantage of pipelining is that the number of DFF registers grows rapidly as pre-skewing registers are added to the pipeline layers, as shown in Figure 8(a).
To remove the unwanted repeated registers while preserving high speed, the gated clocking technique is applied to reduce the number of DFF registers. The basic idea is to use a DFF register with Set feedback and its D input tied high [23][24][25], shown in Figure 8(b), connected to another DFF register that divides the clock pulse and prepares it for the next layer. For the other pipeline stages, a single DFF register is sufficient. Applying the gated clocking technique removes all the gray blocks of pre-skewing registers (48 DFFs) in Figure 8(a) and replaces them with the 4 DFF registers in Figure 8(b). Combining the above techniques in the presented design, as shown in Figure 9, meets the designers' goal of a high-speed PA circuit with fewer registers (44 of 121 DFF registers are removed). The proposed technique thus gives a 36% reduction in the number of DFF registers. The pipelined PA design with gated clocking was coded in Verilog, then synthesized and elaborated using Quartus II software. Gate-level simulation was then carried out for a Cyclone III field programmable gate array (FPGA). The coded PA circuit operates at a frequency of 286.29 MHz. The simulation results of the PA design are shown in Figure 10. The figure shows the sawtooth shape of the PA output, which increases steadily over the range (0-2π) and matches the calculated results.
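The register counts quoted above can be checked with a short calculation. The sketch assumes that stage k of the pipeline needs k pre-skewing registers per input bit, and that the design has four 8-bit stages; the stage widths are not given in the text and are an assumption that happens to reproduce the 48 pre-skewing DFFs mentioned for Figure 8(a). Function names are hypothetical.

```python
def preskew_registers(widths):
    """Pre-skewing DFFs if stage k delays its input slice by k clocks."""
    return sum(k * w for k, w in enumerate(widths))


def reduction_percent(removed, total):
    """Share of registers removed, in percent."""
    return 100.0 * removed / total


if __name__ == "__main__":
    removed_preskew = preskew_registers((8, 8, 8, 8))   # assumed 4 x 8-bit stages
    gated_clock_dffs = 4                                # from the text, Figure 8(b)
    net_saving = removed_preskew - gated_clock_dffs
    print(removed_preskew, net_saving, round(reduction_percent(net_saving, 121), 1))
```

Under these assumptions, replacing 48 pre-skewing registers with 4 clock-gating registers saves 44 of the 121 DFFs, which is the 36% reduction reported in the text.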

CONCLUSION
This review investigates several approaches and techniques for achieving high speed and high frequency resolution in the phase accumulator. The BK adder was modified for use in the pipelined PA architecture. In addition to the CLA and RCA, several prefix adders (the SK, KS, BS, and modified BK adders) were used in the pipelined PA architecture in order to select the fastest adder. The proposed PA circuits were coded, verified, and simulated using Quartus II software. The comparison results show that the modified BK adder is faster than the others for large input widths. Because of this, it is used in the pipelined phase accumulator design. The gated clocking technique was applied to the PA circuit to reduce the number of registers while preserving high-speed throughput. The coded design of the proposed PA was verified on an FPGA kit platform. The results show that the PA circuit operates at a frequency of 286.29 MHz. Furthermore, the proposed design gives a 36% reduction in the number of DFF registers.