# Design of a high-speed 7.2 Gbps/lane receiver for MIPI D-PHY interface utilizing 18 nm FinFET technology

#### Trang Hoang, Anh Nam Ha

Faculty of Electricals-Electronics Engineering, Ho Chi Minh City University of Technology (HCMUT), Vietnam National University Ho Chi Minh City (VNU-HCM), Ho Chi Minh City, Vietnam

#### Article Info

#### Article history:

Received Dec 17, 2023; Revised May 22, 2024; Accepted Jun 5, 2024

#### Keywords:

Continuous time linear equalization; High-speed receiver; Physical layer; Receiver

#### ABSTRACT

This study presents an advanced design for a high-speed receiver tailored for the MIPI D-PHY Interface, capable of handling data rates up to 7.2 Gbps per lane. The design is developed using 18 nm fin field-effect transistor (FinFET) technology and is rigorously simulated under varying process, voltage, and temperature conditions (PVTs) to ensure robustness. The architecture of the receiver integrates several key components: differential pair sensing, a folded cascode continuous time linear equalization (CTLE), a single-ended operational amplifier, and a cross-coupled stage. Operating at a supply voltage of 0.72 V in the worst-case scenario, our CTLE achieves a peaking gain of 17.77 dB at 4.26 GHz. The design demonstrates a maximum jitter of 19.63 ps at an offset voltage of ±2 mV. Notably, the power efficiency of our receiver is optimized to 0.85 mW/Gb/s, totaling 6.1 mW, with dual supply voltages of 1.98 and 0.88 V. This work contributes to the field by offering a highly efficient solution for fast data transmission with reduced power consumption and enhanced signal integrity.

This is an open access article under the CC BY-SA license.



#### **Corresponding Author:**

**Trang Hoang** 

Faculty of Electricals-Electronics Engineering, Ho Chi Minh City University of Technology (HCMUT), Vietnam National University Ho Chi Minh City (VNU-HCM) 268 Ly Thuong Kiet, District 10, Ho Chi Minh City, Vietnam Email: hoangtrang@hcmut.edu.vn

#### 1. INTRODUCTION

The rapid advancement of semiconductor technology has consistently presented challenges, particularly in complementary metal-oxide-semiconductor (CMOS) circuit design, where the demands for high-speed operation, low power consumption, and enhanced on-chip integration are paramount. Traditionally, one of the principal strategies to address these challenges has been to scale down transistor sizes. Although effective in enhancing circuit density and performance, this approach introduces a significant drawback: increased leakage current, especially as transistor sizes approach and go below 5 nm. This phenomenon results in persistent power consumption even when the devices are not actively operating.

In response to these challenges, a breakthrough in CMOS technology has been achieved with the development of the fin field effect transistor (FinFET) [1]. Unlike traditional planar metal-oxide-silicon field-effect transistors (MOSFETs), the FinFET design incorporates a three-dimensional fin-shaped gate that wraps around the conducting channel. This innovative structure not only mitigates channel-length modulation but also significantly reduces subthreshold leakage, providing more effective control over the channel.

Given its robust performance improvements, FinFET technology, which operates on principles similar to those of conventional MOSFETs, has become the preferred choice for addressing leakage in short-channel transistors. It is progressively replacing traditional CMOS devices in applications requiring process nodes smaller than 22 nm [2]–[5]. This introduction sets the stage for a detailed discussion on the impact of FinFET technology on the evolution of high-speed, low-power CMOS circuit design, highlighting its growing importance in the field of semiconductor manufacturing.

The adoption of FinFET technology has expanded significantly, driven by its numerous advantages in achieving high-speed and low-power integrated circuit designs. This technology plays a crucial role in the development of high-speed input/output (HSIO) links, which are integral to a wide range of applications from smartphones and mobile-connected devices to various endpoints within the internet of things (IoT) ecosystem. These applications typically require multiple gigabits per second (Gbps) data rates, underscoring the need for advanced technology solutions like FinFET [6]–[11].

Despite its longstanding development, HSIO technology continues to encounter substantial challenges, particularly at the physical layer (PHY), the critical component of any interconnection solution. Effective PHY design is essential for mitigating typical noise impairments and addressing other non-idealities that commonly arise in transmitter-to-receiver interconnections. One notable implementation of PHY is the MIPI D-PHY, developed by the mobile industry processor interface (MIPI) alliance. As introduced in [12], MIPI D-PHY supports essential communication protocols such as the camera serial interface (CSI-2) and display serial interface (DSI), accommodating various channel lengths from short to long [13]–[16]. Initially, MIPI D-PHY versions 1.0 and 1.1 were limited to data rates of 1 Gbps/lane and 1.5 Gbps/lane, respectively, primarily due to low attenuation at lower frequencies [17]. However, subsequent revisions, starting from version 1.2, have pushed data rate capabilities up to 4.5 Gbps/lane. This enhancement was made possible through advanced design techniques involving equalization and skew calibration, which effectively compensate for channel insertion losses and skew variations among channels.

The challenges of channel loss, significantly impacting signal quality due to factors like skin effects, attenuation, dispersion, and reflections on printed circuit board (PCB) traces, remain a critical concern [18], [19]. Addressing these issues is essential for improving data recovery processes and overall system performance in high-speed data transmission environments. Efficient data recovery techniques are crucial for combating channel losses and distortions in high-speed communication systems. One notable approach is the use of continuous time linear equalization (CTLE), which enhances signal integrity by providing peaking gain at the Nyquist frequency [20]. Recent developments by MIPI have significantly increased chip data rates, from 1 Gbps/lane in version 1.0 to 4.5 Gbps/lane in version 2.1, introducing greater complexities in designing CTLE circuits that can handle these higher frequencies. This article extends beyond the current maximum capabilities stipulated by MIPI standards, addressing data rates as high as 7.2 Gbps/lane. At such elevated speeds, conventional CTLE architectures, particularly those at the SS (slow-slow) corner, face limitations in bandwidth and gain, insufficient for the demands of high-speed transmission [21]. These limitations often stem from the use of traditional design approaches that feature large CMOS sizes, high threshold voltages, and elevated supply voltages.

To address these challenges, this paper introduces a novel architecture that incorporates a folded cascode CTLE, optimized for small CMOS sizes, low threshold voltages, and reduced supply voltages. The architecture comprises a wideband amplifier and a folded cascode CTLE, as illustrated in Figure 1. The wideband amplifier is designed to deliver modest gain and bandwidth suitable for the design frequency, whereas the folded cascode CTLE employs an ultra-wideband configuration to operate at very high frequencies, ensuring high-speed performance and robust data recovery.

Contribution of this paper: This paper addresses the issue of "stress on device", a critical challenge that arises when transitioning from high-speed (HS) mode to low-power (LP, 1.8 V) mode in semiconductor systems. This transition often leads to substantial stress on input/output (IO) devices, primarily due to the significant parasitic effects that constrain the bandwidth of the CTLE. To mitigate this issue, our approach integrates a novel architecture comprising a wideband amplifier with minimal gain tailored to the IO device characteristics, alongside an ultra-wideband folded cascode CTLE designed for high-frequency core devices. This configuration not only reduces the size of the MOS devices but also significantly enhances the CTLE bandwidth, effectively addressing the parasitic limitations.

The remainder of the paper is structured as follows. Section 2 details the proposed architecture and circuit design. Section 3 presents the simulation results, demonstrating the efficacy of our approach. Finally, section 4 concludes the paper with a summary of findings and potential implications for future research in the field.

#### 2. ARCHITECTURE

#### 2.1. Conventional architecture

Figure 1 shows the conventional architecture and the proposed architecture for the receiver design including: i) the fully differential pair sensing stage (*Hsrx\_Diffbuff*), ii) the folded cascode continuous time linear equalization (*Hsrx\_CTLE*), iii) the single-ended two stage op-amp (*Hsrx\_Singbuff*), and iv) the cross-coupled inverter (*Hsrx\_Crosscoupled*).

In the conventional architecture given in the previous work of [12], the drain capacitance is very large because of the large size of the IO devices, which limits the bandwidth. Therefore, the proposed structure is based on core devices, which have a small size, low threshold voltage, and low supply voltage (VDDL = 0.8 V at the typical corner). However, protecting the core devices is a potential weakness of this structure. For example, when the receiver switches from the high-speed mode (VDDL) to the low-power mode (VDDIO), the drain-source (DS), gate-source (GS), and gate-drain (GD) voltages must all remain less than or equal to VDDL, because the core devices can only withstand VDDL. Hence, the proposed design places a differential pair sensing stage, built from I/O devices with wide bandwidth and low common-mode voltage, before the folded cascode CTLE stage. This stage is responsible for the safety of the core devices, while the folded cascode CTLE provides the speed.

Our design is evaluated with a pseudorandom binary sequence (PRBS9) input data pattern, as shown in Figure 1 [12]. This data pattern starts at the transmitter, passes through the channel, and is attenuated by an amount (in dB) that depends on the channel length; this loss is compensated by the CTLE in the receiver. The proposed design has a differential input (INP/INN), a differential output after the CTLE (OUT1P/OUT1N), and a differential output of the system (OUTP/OUTN).



Figure 1. The conventional and the proposed CTLE block diagram

#### 2.2. Hsrx\_Diffbuff

As mentioned above, the purpose of this stage is to protect the *Hsrx\_CTLE* stage. This stage uses the I/O devices with the VDDIO supply (high voltage). The design should have a wide bandwidth and low gain so that the signal from the channel changes only slightly. We therefore use differential pair sensing with a resistor load to achieve the high bandwidth, as given in Figure 2. In high-speed mode, the signal has a low common-mode voltage (200 mV) and low swing (40 mV). Because of this low common-mode voltage, a PMOS input pair is used in this stage.

#### 2.3. Hsrx\_CTLE

The conventional CTLE is designed with a differential MOS pair, and its peaking gain is controlled by *Rs* and *Cs*, as shown in Figure 3. The compensation is implemented with a set of selectable Rs fuses, chosen according to the loss of the channel. A PMOS pair is still used in this stage because of the low common-mode voltage.



Figure 2. The differential pair sensing with resistor load



Figure 3. The conventional CTLE and AC response

The function of the conventional CTLE follows [15]:

$$H(s) = \frac{g_m}{C_p} \frac{s + \frac{1}{R_s C_s}}{\left(s + \frac{1 + g_m R_s/2}{R_s C_s}\right) \left(s + \frac{1}{R_d C_p}\right)} \tag{1}$$

One zero and two poles of the conventional CTLE are shown in (2).

$$\omega_{z} = \frac{1}{R_{s}C_{s}}, \, \omega_{p1} = \frac{1 + g_{m}R_{s}/2}{R_{s}C_{s}}, \, \omega_{p2} = \frac{1}{R_{d}C_{p}} \tag{2}$$

The DC gain, ideal peaking gain, and EQ level are shown in (3)–(5).

$$DC\_gain = \frac{g_m R_d}{1 + g_m R_s/2} \tag{3}$$

$$Peak\_gain = g_m R_d \tag{4}$$

$$EQ\_level = 1 + g_m R_s/2 \tag{5}$$

As shown in (3) and (4), there is a trade-off between the gain and the bandwidth, so the bandwidth of the basic CTLE is limited. The main cause is the large output capacitance, which limits $\omega_{p2}$. Besides the gain and the bandwidth, the offset voltage of the whole receiver is a significant concern. The higher the gain, the larger the offset voltage, and the circuit performs better with a low offset. Therefore, an offset cancellation circuit is needed.
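As a quick numerical illustration of (2)–(5), the following Python sketch evaluates the zero, the poles, and the gains of the conventional CTLE. The component values (`gm`, `rs`, `cs`, `rd`, `cp`) are hypothetical placeholders chosen only for illustration; they are not taken from the design.

```python
import math


def conventional_ctle(gm, rs, cs, rd, cp):
    """Zero, poles (rad/s) and gains (V/V) of the conventional CTLE, per (2)-(5)."""
    eq_level = 1 + gm * rs / 2      # (5)
    return {
        "wz": 1 / (rs * cs),        # zero, (2)
        "wp1": eq_level / (rs * cs),  # first pole, (2)
        "wp2": 1 / (rd * cp),       # output pole, (2)
        "dc_gain": gm * rd / eq_level,  # (3)
        "peak_gain": gm * rd,       # ideal peaking gain, (4)
        "eq_level": eq_level,
    }


def to_db(ratio):
    """Convert a voltage ratio to decibels."""
    return 20 * math.log10(ratio)


# Hypothetical component values, for illustration only.
p = conventional_ctle(gm=5e-3, rs=1e3, cs=100e-15, rd=600.0, cp=20e-15)

for name in ("wz", "wp1", "wp2"):
    print(f"{name}: {p[name] / (2 * math.pi) / 1e9:.2f} GHz")
for name in ("dc_gain", "peak_gain", "eq_level"):
    print(f"{name}: {to_db(p[name]):.2f} dB")
```

For any parameter set, the ratio of the ideal peaking gain to the DC gain equals the EQ level, and the ratio $\omega_{p1}/\omega_z$ equals it as well, which is why changing $R_s$ changes the amount of equalization.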

The proposed CTLE is illustrated in Figure 4. This structure has three parts: the EQ fuses to tune the compensation of the CTLE, the folded cascode CTLE, and the offset cancellation to minimize the output offset of the CTLE. The function of the folded cascode CTLE follows:

$$H(s) = \frac{Kg_{m1}g_{m3}}{(C_p)^2} \frac{s + \frac{1}{R_s C_s}}{\left(s + \frac{1 + g_{m1}R_s/2}{R_s C_s}\right) \left(s + \frac{1}{R_d C_p}\right) \left[s + \frac{g_{m3}(1+K)}{C_p}\right]} \tag{6}$$

One zero and three poles of the folded cascode CTLE are shown in (7).

$$\omega_{z} = \frac{1}{R_{s}C_{s}}, \, \omega_{p1} = \frac{1 + g_{m1}R_{s}/2}{R_{s}C_{s}}, \, \omega_{p2} = \frac{1}{R_{d}C_{p}}, \, \omega_{p3} = \frac{g_{m3}(1+K)}{C_{p}} \tag{7}$$

The DC gain, ideal peaking gain, and EQ level are shown in (8)–(10).

$$DC\_gain = \frac{Kg_{m1}R_d}{(1+g_{m1}R_s/2)(1+K)} \approx \frac{g_{m1}R_d}{1+g_{m1}R_s/2} \tag{8}$$

$$Peak\_gain = \frac{Kg_{m1}R_d}{1+K} \approx g_{m1}R_d \tag{9}$$

$$EQ\_level = 1 + g_{m1}R_s/2 \tag{10}$$

The AC response is shown in Figure 5.



Figure 4. The proposed folded cascode CTLE



Figure 5. The AC response of the folded cascode CTLE
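To make the shape of the response in (6) concrete, the sketch below evaluates $|H(j\omega)|$ over frequency and checks that its value at DC agrees with the closed-form DC gain in (8). All component values are hypothetical placeholders, and `k` is simply the dimensionless factor $K$ that appears in (6); none of these numbers come from the design.

```python
import math


def folded_cascode_ctle(f_hz, gm1, gm3, k, rs, cs, rd, cp):
    """Complex response of the folded cascode CTLE in (6) at frequency f_hz."""
    s = 1j * 2 * math.pi * f_hz
    wz = 1 / (rs * cs)
    wp1 = (1 + gm1 * rs / 2) / (rs * cs)
    wp2 = 1 / (rd * cp)
    wp3 = gm3 * (1 + k) / cp
    gain = k * gm1 * gm3 / cp**2
    return gain * (s + wz) / ((s + wp1) * (s + wp2) * (s + wp3))


def dc_gain_eq8(gm1, k, rs, rd):
    """Closed-form DC gain from (8), before the K >> 1 approximation."""
    return k * gm1 * rd / ((1 + gm1 * rs / 2) * (1 + k))


# Hypothetical values, for illustration only.
PARAMS = dict(gm1=5e-3, gm3=8e-3, k=10.0, rs=1e3, cs=100e-15, rd=600.0, cp=10e-15)

# Logarithmic sweep from 100 MHz to 100 GHz.
freqs = [10 ** (8 + 0.05 * i) for i in range(61)]
mags = [abs(folded_cascode_ctle(f, **PARAMS)) for f in freqs]

dc_mag = abs(folded_cascode_ctle(0.0, **PARAMS))
peak_mag = max(mags)
peak_freq = freqs[mags.index(peak_mag)]

print(f"DC gain:      {20 * math.log10(dc_mag):.2f} dB")
print(f"Peaking gain: {20 * math.log10(peak_mag):.2f} dB at {peak_freq / 1e9:.2f} GHz")
```

With these placeholder values the third pole sits far above the other two, so the response keeps the zero-then-pole peaking shape of the conventional CTLE, as the equations suggest.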

The proposed structure has several advantages. Firstly, the equations of the proposed CTLE are quite similar to those of the conventional CTLE. Secondly, at the core of this structure, the output capacitance ($C_p$) is smaller because M3 and M4 are smaller than M1 and M2 (M3 and M4 do not exist in the conventional CTLE). Therefore, the bandwidth of the proposed CTLE is larger than that of the conventional CTLE. Last but not least, the conventional CTLE uses IO devices, whereas the proposed CTLE uses core devices. Therefore, the gain of the proposed CTLE is higher.

The operation of the receiver is described as follows. Initially, the offset cancellation turns on and selects the RD fuse that gives the minimal offset. Then, the CTLE selects the RS fuse that minimizes the jitter at the receiver output. This structure shifts the common-mode voltage as in Figure 6. Another advantage of the folded cascode CTLE is that an N-type metal–oxide–semiconductor (NMOS) differential pair can be used in the next stage. An NMOS pair gives a higher gain than a PMOS pair because of its higher carrier mobility.



Figure 6. The expected output common-mode voltage at the typical corner

#### 2.4. CTLE's receiver

With the traditional CTLE given in [22], the bandwidth and the gain are insufficient because of the trade-off between gain and bandwidth [23]. In the two-stage CTLE of transmitter channel equalization (TCE) in [15], the $2^{nd}$ amplifier is the traditional CTLE of [22], which helps to reduce the voltage gain at low frequency, and the $1^{st}$ amplifier acts as a wideband buffer that increases the gain of the $2^{nd}$ amplifier [15], [24]. In this study, the proposed folded cascode CTLE is presented in Figure 7, in which the $2^{nd}$ amplifier is the folded cascode CTLE, whose smaller size helps to increase the bandwidth compared to the traditional CTLE. Besides, using core devices with VDDL in the $2^{nd}$ amplifier allows the circuit to operate at higher frequencies. In addition, the wideband $1^{st}$ amplifier helps to protect the $2^{nd}$ amplifier and increases its gain.



Figure 7. Proposed CTLE

#### 2.5. Hsrx\_Singbuff

When the channel loss compensation is finished by the CTLE, a high gain is needed to push the signal up to full VDDL. The *Hsrx\_Singbuff* stage uses a two-stage op-amp structure with a differential input and a single-ended output, as done in the previous study [16]. The design needs to guarantee that M1 and M2 are minimally sized, so that they do not reduce the bandwidth of the CTLE, while the gain is still sufficient. The total gain is given by (11).

$$A_{v} \approx g_{m1} \cdot (r_{o1}//r_{o3}) \cdot g_{m5} \cdot (r_{o5}//r_{o7}) \tag{11}$$

With an input swing of 40 mV (Vdm = 40 mV) and a full-swing output of 800 mV, the minimum gain required of this stage is about 26 dB:

$$A_v \approx 20 \log \frac{800}{40} \approx 26 \text{ dB}$$
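The gain requirement above is a one-line calculation; the sketch below reproduces it and, as an additional illustration, repeats it for the minimum and maximum VDDL values listed in Table 1. The function name `required_gain_db` is a hypothetical helper introduced here.

```python
import math


def required_gain_db(v_out_mv, v_in_mv):
    """Voltage gain in dB needed to amplify a swing of v_in_mv up to v_out_mv."""
    return 20 * math.log10(v_out_mv / v_in_mv)


INPUT_SWING_MV = 40  # Vdm quoted in the text

# VDDL values in mV (min, typ, max) as listed in Table 1.
for label, vddl_mv in (("min", 720), ("typ", 800), ("max", 880)):
    gain = required_gain_db(vddl_mv, INPUT_SWING_MV)
    print(f"VDDL {label} = {vddl_mv} mV -> required gain {gain:.2f} dB")
```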

#### 2.6. Hsrx\_Crosscoupled

Because of the high gain in the sing-buff stage, the single-ended output signal is unbalanced. The cross-coupled stage helps to adjust and recover the data signal to balance the jitter, as shown in Figure 8. The number of stages depends on power consumption and jitter, and in this study, two stages would be enough for our design.



Figure 8. The cross-coupled inverter

#### 3. EXPERIMENTAL RESULTS

#### 3.1. Simulation scenario

The design was implemented and simulated in the Cadence Virtuoso 6.1.7 environment using the 18 nm FinFET technology. The voltage and temperature references follow Table 1.

An input with a $2^9-1$ PRBS9 data pattern [5] was transmitted through the channel model to the receiver stage at 3.6 GHz clock frequency. The channel loss was measured at various frequencies: approximately 3.5 dB at 1.25 GHz, 8.3 dB at 3.6 GHz, and 10.8 dB at 5 GHz, as illustrated in Figure 9. These measurements highlight the channel's performance degradation at higher frequencies, which is a critical consideration in high-speed data transmission.
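For readers who want to reproduce a stimulus of this kind, the sketch below generates one period of a PRBS9 pattern with a 9-bit linear feedback shift register. The text does not state which generator polynomial was used; the taps at stages 9 and 5 (the commonly used $x^9 + x^5 + 1$ PRBS9 polynomial) and the all-ones seed are assumptions.

```python
def prbs9(seed=0x1FF, nbits=511):
    """Return nbits of a PRBS9 sequence from a 9-bit Fibonacci LFSR.

    Assumption: feedback taps at stages 9 and 5, not confirmed by the source.
    """
    state = seed & 0x1FF
    if state == 0:
        raise ValueError("seed must be non-zero, otherwise the LFSR locks up")
    bits = []
    for _ in range(nbits):
        # XOR of the bits generated 9 and 5 steps ago.
        new_bit = ((state >> 8) ^ (state >> 4)) & 1
        bits.append(new_bit)
        state = ((state << 1) | new_bit) & 0x1FF
    return bits


one_period = prbs9()
two_periods = prbs9(nbits=1022)

print("period length:", len(one_period))
print("ones in one period:", sum(one_period))
print("repeats after 511 bits:", two_periods[511:] == one_period)
```

A maximal-length 9-bit sequence repeats every $2^9-1 = 511$ bits and contains one more 1 than 0 per period, which is what the checks above rely on.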

Simulations were rigorously conducted to assess the performance across process, voltage, and temperature (PVT) variations, encompassing five typical PVT corners: fast-fast (FF), fast-slow (FS), typical-typical (TT), slow-fast (SF), and slow-slow (SS). The comprehensive normal simulation regime incorporated a total of 33 conditions. This included variations across four corner MOS types (FF, FS, SF, SS), two supply voltage levels (VDDL/VDDH), two extreme temperature conditions (-40 °C and 105 °C), two common-mode voltage levels (70 mV and 330 mV), and the standard condition at the typical corner (TT).

In addition to the standard simulations, jitter performance was specifically evaluated to ensure robustness under $\pm 2$ mV offset voltage conditions, thereby extending the simulation to 66 corners. This extensive testing was designed to identify potential vulnerabilities under extreme operational scenarios. Results from these simulations of 66 corners were carefully analyzed, with findings for both the best and worst-case scenarios distinctly presented. For clarity and emphasis, results from the worst-case scenario were highlighted in bold within the summary table, aiding in the quick identification of critical performance bottlenecks.
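The corner counts quoted above (33 normal conditions, 66 with the offset applied) follow from a simple enumeration, sketched below. The dictionary keys are hypothetical names, and the values used for the single typical condition (typical supply, 25 °C from Table 1, 200 mV common mode) are assumptions made only to complete the example.

```python
from itertools import product

PROCESS = ("FF", "FS", "SF", "SS")   # four corner MOS types
SUPPLY = ("L", "H")                  # low / high supply level
TEMP_C = (-40, 105)                  # extreme temperatures as quoted in the text
VCM_MV = (70, 330)                   # common-mode voltage levels
OFFSET_MV = (-2, 2)                  # offset applied for the jitter runs


def normal_conditions():
    """All extreme combinations plus one typical (TT) condition."""
    conds = [
        {"process": p, "supply": v, "temp_c": t, "vcm_mv": c}
        for p, v, t, c in product(PROCESS, SUPPLY, TEMP_C, VCM_MV)
    ]
    # Assumed typical condition; its exact settings are not given in the text.
    conds.append({"process": "TT", "supply": "typ", "temp_c": 25, "vcm_mv": 200})
    return conds


def jitter_conditions():
    """Every normal condition repeated for each offset polarity."""
    return [dict(c, offset_mv=o) for c in normal_conditions() for o in OFFSET_MV]


print(len(normal_conditions()), "normal conditions")
print(len(jitter_conditions()), "jitter conditions")
```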

Table 1. The reference symbols for design summary

| Note             | Min      | Typ | Max      |
|------------------|----------|-----|----------|
| VDDIO (V)        | 1.62 (L) | 1.8 | 1.98 (H) |
| VDDL (V)         | 0.72 (L) | 0.8 | 0.88 (H) |
| Temperature (°C) | -40 (L)  | 25  | 125 (H)  |

#### 3.2. Hsrx\_Diffbuff

The *Hsrx\_Diffbuff* is a wide-bandwidth protection stage that must not distort the input data pattern. Figure 9 and Table 2 show the results of this stage. To achieve a speed of 7.2 Gbps/lane, the *Hsrx\_Diffbuff* must have a minimum bandwidth of 3.6 GHz. The minimum simulated $f_{-3dB}$ of 6.45 GHz therefore satisfies this requirement. Because of the wide bandwidth, the gain is small, as shown in Table 2. Table 3 shows the comparison of the proposed $1^{st}$ amplifier and the $1^{st}$ amplifier of TCE [15] at the worst-case corner.

The summary of the CTLE is shown in Table 4. The best case of peaking frequency was at the fast-fast-low-low-low (FFLLL) corner, where the devices were fast, and the worst case was at the slow-slow-low-low-high (SSLLH) corner, where the devices were slow. Meanwhile, the peaking gain behaved in the opposite way. This is the trade-off between gain and bandwidth. Table 5 shows the comparison of the proposed CTLE and the CTLE of TCE [15] at the worst-case corner. Because of the smaller gain of the $1^{st}$ amplifier, the gain of the proposed CTLE is smaller than that of TCE [15]. However, the peaking frequency of the proposed CTLE is about 70% higher than that of TCE [15].



Figure 9. The input channel loss with AC response

Table 2. Simulation results of Hsrx\_Diffbuff

|            | DC gain | Bandwidth (-3 dB) |
|------------|---------|-------------------|
| Unit       | dB      | GHz               |
| Min        | 0.86    | 6.45              |
| Typ        | 2.92    | 11.17             |
| Max        | 3.80    | 19.39             |
| Min corner | FFHHL   | SSHHH             |
| Max corner | SSLLH   | FFHHL             |

Note: F: fast, H: high, L: low, S: slow

Table 3. 1st amplifier performance comparison

|           | TCE [15] | This work |
|-----------|----------|-----------|
| DC gain   | 5.7 dB   | 0.86 dB   |
| Bandwidth | 2.5 GHz  | 6.45 GHz  |

#### Table 4. Simulation results of CTLE

| EQ level (fuse 5) | DC gain | Peaking gain | Peaking frequency |
|-------------------|---------|--------------|-------------------|
| Unit              | dB      | dB           | GHz               |
| Min               | -2.53   | 2.66         | 4.26              |
| Typ               | 4.45    | 10.68        | 7.58              |
| Max               | 9.82    | 17.77        | 9.88              |
| Min corner        | FFHHL   | FFHHL        | SSLLH             |
| Max corner        | SSLLH   | SSLLH        | FFLLL             |

Note: F: fast, H: high, L: low, S: slow

Table 5. CTLE performance comparison

|                     | TCE [15] | This work (fuse 5) |
|---------------------|----------|--------------------|
| Peaking frequency   | 2.5 GHz  | 4.26 GHz           |
| DC gain             | 13 dB    | 9.82 dB            |
| Peaking gain        | 24 dB    | 17.77 dB           |
| EQ level at 2.5 GHz | 11 dB    | 7.95 dB (*)        |

(*) 12.95 dB with fuse 10 (+1 dB with each fuse)
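The EQ level entries in Table 5 are the difference between the peaking gain and the DC gain in dB, and the footnote extrapolates it by 1 dB per fuse. The short sketch below reproduces that arithmetic; the helper names are hypothetical, and the linear 1 dB-per-fuse step is taken directly from the table footnote rather than from a circuit model.

```python
def eq_level_db(peaking_gain_db, dc_gain_db):
    """EQ level in dB as the gap between peaking gain and DC gain."""
    return peaking_gain_db - dc_gain_db


def eq_level_at_fuse(fuse, base_fuse=5, base_level_db=7.95, step_db=1.0):
    """Linear extrapolation of the EQ level, assuming +1 dB per fuse step."""
    return base_level_db + (fuse - base_fuse) * step_db


this_work = eq_level_db(17.77, 9.82)   # values from Table 5, fuse 5
tce = eq_level_db(24.0, 13.0)          # values from Table 5, TCE [15]

print(f"This work, fuse 5:  {this_work:.2f} dB")
print(f"TCE [15]:           {tce:.2f} dB")
print(f"This work, fuse 10: {eq_level_at_fuse(10):.2f} dB")
```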


#### 3.3. Hsrx\_Singbuff

Figure 10 and Table 6 show the simulation results of the *Hsrx\_Singbuff* stage. This stage needs a high gain and a medium bandwidth. With the 40 mV input swing, a gain of 26 dB is needed to amplify the input signal up to VDDL (800 mV at the typical corner), as in (12).

$$A_v \approx 20 \log \frac{800}{40} \approx 26 \text{ dB} \tag{12}$$

In the *Hsrx\_Singbuff* simulation, the minimum gain was 27.55 dB and the minimum bandwidth was 0.72 GHz. The minimum gain is still larger than the required 26 dB, so this stage meets the expectation.



Figure 10. The AC response of the Hsrx\_Singbuff (SSHHH corner)

Table 6. The Hsrx\_Singbuff results in summary

|            | DC gain | Bandwidth (-3 dB) |
|------------|---------|-------------------|
| Unit       | dB      | GHz               |
| Min        | 27.55   | 0.72              |
| Typ        | 30.49   | 0.90              |
| Max        | 32.19   | 1.08              |
| Min corner | FFLLH   | SSLLH             |
| Max corner | SSHHL   | FFLLH             |

Note: F: fast, H: high, L: low, S: slow

#### 3.4. Offset cancellation technique

Table 7 shows the offset that the CTLE needs to cover. The results are measured at the worst-case corner (SSLLH), including the global corner and local mismatch. In this design, 73.364 mV is the maximum offset obtained from a Monte Carlo simulation at $4.5\sigma$. The technique to reduce this offset is as follows. The offset is reduced by sweeping the codes of (OUT1P, OUT1N) from (5'b11111, 5'b00000) to (5'b00000, 5'b11111). The covered offset is given by:

$$\begin{aligned} Offset &= I_{CTLE} \cdot R_{oc} \cdot \left( Scale_n - Scale_p \right) \\ Max\_Offset\_Covered &= I_{CTLE} \cdot R_{oc} \cdot \left( Scale_{n,5'b00000} - Scale_{p,5'b11111} \right) \\ Min\_Offset\_Covered &= I_{CTLE} \cdot R_{oc} \cdot \left( Scale_{n,5'b01111} - Scale_{p,5'b10001} \right) \end{aligned}$$

where $I_{CTLE}$ is the current flowing in each branch of the CTLE, $R_{oc}$ is the offset cancellation resistor, and

$$Scale = \frac{1}{R_{oc}} \cdot \frac{1}{\frac{1}{R_{oc}} + \frac{bit_0}{2^0 R_{oc}} + \frac{bit_1}{2^1 R_{oc}} + \frac{bit_2}{2^2 R_{oc}} + \dots + \frac{bit_n}{2^n R_{oc}}}$$

For example, the covered offset is:

$$Max\_Offset\_Covered = 400\ \mu\text{A} \cdot 321.5\ \Omega \cdot (1 - 0.34) = 82.4 \text{ mV}$$

$$Min\_Offset\_Covered = 400\ \mu\text{A} \cdot 321.5\ \Omega \cdot (0.52 - 0.5) = 2 \text{ mV}$$

The middle code (OUT1P, OUT1N) is (5'b10000, 5'b10000), where the offset is about 0 mV. The maximum offset that can be covered is $\pm 82.4$ mV, and the residual offset margin that needs to be simulated is about $\pm 2$ mV.
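The scale factors quoted in the example (1, 0.34, 0.52, and 0.5) can be reproduced with the sketch below. It assumes that Scale is the resistance of $R_{oc}$ in parallel with the switched branches $2^i R_{oc}$, normalized to $R_{oc}$, and that the leftmost bit of the 5-bit code is $bit_0$; both are interpretations chosen because they match the quoted values, not statements from the source. The function names are hypothetical.

```python
def scale(code):
    """Normalized resistance for a binary code string such as '10000'.

    Assumption: the leftmost character is bit_0 (weight 1 / 2**0).
    """
    conductance = 1.0  # the always-connected R_oc branch, in units of 1 / R_oc
    for i, bit in enumerate(code):
        conductance += int(bit) / 2**i
    return 1.0 / conductance


def offset_v(i_ctle_a, r_oc_ohm, code_n, code_p):
    """Offset in volts: I_CTLE * R_oc * (Scale_n - Scale_p)."""
    return i_ctle_a * r_oc_ohm * (scale(code_n) - scale(code_p))


for code in ("00000", "11111", "01111", "10000"):
    print(f"Scale(5'b{code}) = {scale(code):.2f}")

# The middle code on both sides gives zero offset.
print("offset at middle code:", offset_v(400e-6, 321.5, "10000", "10000"), "V")
```

Because each additional bit adds a finer parallel branch, the scale moves in progressively smaller steps around the middle code, which is what gives the fine residual step near zero offset.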

Table 7. Offset with Monte Carlo simulation ($4.5\sigma$), Offset = OUTN - OUTP

| Mean (mV) | Std Dev $\sigma$ (mV) | Min = Mean - $4.5\sigma$ (mV) | Max = Mean + $4.5\sigma$ (mV) |
|-----------|-----------------------|-------------------------------|-------------------------------|
| 0         | 16.303                | -73.364                       | 73.364                        |

#### 3.5. Output Eye diagram

To evaluate the performance of the receiver design, the EyeHeight, the EyeWidth, and the Jitter of the output eye diagram are among the most important parameters [12]. The lower the jitter, the better the performance. Tables 8 and 9 and Figure 11 summarize the output eye diagram of the receiver design. The jitter was measured with $\pm 2$ mV offset at nodes OUT1P and OUT1N, as shown in Figure 1, after the offset calibration had finished. The worst cases of EyeHeight, EyeWidth, and Jitter at the system output were at the SSLLH corner, as shown in Tables 8 and 9. The minimum EyeHeight was 720 mV, the minimum EyeWidth was 119.26 ps, and the maximum Jitter was 19.63 ps (at the output of the system, OUTP/OUTN).

Figure 11 shows the output eye diagram at the typical corner. The common-mode voltages at the INP, OUT1P, and OUTP nodes were $V_{cm_{INP}} = 200 \text{ mV}$, $V_{cm_{OUT1P}} = 640 \text{ mV}$, and $V_{cm_{OUTP}} = 400 \text{ mV}$. These values match the expected common-mode voltages in Figure 6. Figure 11 also shows the output eye diagram at the worst-case corner (SSLLH). The input at this worst case (INP) had $EyeHeight_{INP} = 40 \text{ mV}$, $EyeWidth_{INP} = 94 \text{ ps}$, and $Jitter_{INP} = 45 \text{ ps}$. At the output of the CTLE (OUT1P), the eye opened to $EyeHeight_{OUT1P} = 324.36 \text{ mV}$ and $EyeWidth_{OUT1P} = 121.24 \text{ ps}$, while the jitter was reduced to $Jitter_{OUT1P} = 17.65 \text{ ps}$. Finally, at the output of the system (OUTP), the EyeHeight opened to the full swing, $EyeHeight_{OUTP} = 720 \text{ mV}$, and the jitter increased slightly to $Jitter_{OUTP} = 19.63 \text{ ps}$.



Figure 11. The output eye diagram at typical and worst-case corner

Table 8. The summary of output EyeHeight and EyeWidth with $\pm 2$ mV offset calibration

| EyeHeight of | INP     | OUT1P  | OUTP  | OUTN  |
|--------------|---------|--------|-------|-------|
| Unit         | mV      | mV     | mV    | mV    |
| Min          | 40      | 85.44  | 720   | 720   |
| Typ          | 40      | 203.81 | 800   | 800   |
| Max          | 40      | 324.36 | 880   | 880   |
| Min corner   | All pvt | FFLLL  | SSLLH | SSLLH |
| Max corner   | All pvt | SSLLH  | SSHHH | SSHHH |

| EyeWidth of | INP     | OUT1P  | OUTP   | OUTN   |
|-------------|---------|--------|--------|--------|
| Unit        | ps      | ps     | ps     | ps     |
| Min         | 94      | 121.24 | 120.85 | 119.26 |
| Typ         | 94      | 126.51 | 125.63 | 124.01 |
| Max         | 94      | 131.62 | 129.01 | 128.51 |
| Min corner  | All pvt | SSLLH  | SSLLH  | SSLLH  |
| Max corner  | All pvt | FFLLL  | FFLLL  | FFLLL  |

Table 9. The summary of output jitter with  $\pm 2 \text{ mV}$  offset calibration

| Jitter of  | INP     | OUT1P | OUTP  | OUTN  |
|------------|---------|-------|-------|-------|
| Unit       | ps      | ps    | ps    | ps    |
| Min        | 45      | 7.27  | 9.88  | 10.38 |
| Typ        | 45      | 12.38 | 13.26 | 14.88 |
| Max        | 45      | 17.65 | 18.04 | 19.63 |
| Min corner | All pvt | FFLLL | FFLLL | FFLLL |
| Max corner | All pvt | SSLLH | SSLLH | SSLLH |

As shown in Table 4, the maximum peaking frequency of the CTLE was 9.88 GHz at the FFLLL corner, while the minimum peaking frequency was 4.26 GHz at the SSLLH corner. Consequently, the best case of jitter was 9.88 ps at the FFLLL corner. Conversely, the worst case of jitter was 19.63 ps at the SSLLH corner.

Figure 12 shows the effect of the offset on the CTLE output. A large offset can create a mismatch at the output of the CTLE. Figure 12 shows the eye diagram of OUTP at the worst-case corner (SSLLH), simulated with $\pm 2 \text{ mV}$ offset, $\pm 48.909 \text{ mV}$ offset ($\pm 3\sigma$), and $\pm 73.364 \text{ mV}$ offset ($\pm 4.5\sigma$). With $\pm 3\sigma$ offset, $EyeHeight_{OUTP} = 690 \text{ mV}$, $EyeWidth_{OUTP} = 121.24 \text{ ps}$, and $Jitter_{OUTP} = 54.88 \text{ ps}$; with $\pm 4.5\sigma$ offset, $OUTP = VSS = 0 \text{ mV}$ and $OUTN = VDDL = 720 \text{ mV}$. Therefore, the EyeHeight, EyeWidth, and Jitter degrade as the offset increases, and the data was mismatched when the offset was $\pm 4.5\sigma$.

The eye diagram results are shown in Tables 10 and 11. Table 10 shows the main characteristics of the 7.2 Gbps/lane (3.6 GHz) receiver based on the measurement results compared with MIPI D-PHY specifications. Table 11 gives the comparison of proposed receiver with the other works.

Compared to the MIPI D-PHY characteristics in Table 10, the MIPI D-PHY version 2.1 was designed for a receiver that runs up to 4.5 Gbps and requires a jitter of 10% of the UI (i.e., about 22.22 ps jitter of the data pattern) for the entire system [12]. The proposed structure was designed with: i) a data rate of 7.2 Gbps/lane (60% faster than the MIPI speed); and ii) 19.63 ps jitter of the data pattern at the worst-case corner (14% of 1 UI at 3.6 GHz). The remaining "4%" would be handled by the remaining blocks, such as clock data recovery (CDR) and digital signal processing (DSP).
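The unit-interval and jitter figures in this comparison can be checked with a few lines of arithmetic. The sketch below uses only numbers quoted in the text; `ui_ps` is a hypothetical helper name.

```python
def ui_ps(data_rate_gbps):
    """Unit interval in picoseconds for a data rate in Gbps."""
    return 1000.0 / data_rate_gbps


MIPI_RATE_GBPS = 4.5
THIS_WORK_RATE_GBPS = 7.2
WORST_CASE_JITTER_PS = 19.63

mipi_jitter_budget_ps = 0.10 * ui_ps(MIPI_RATE_GBPS)
jitter_in_ui = WORST_CASE_JITTER_PS / ui_ps(THIS_WORK_RATE_GBPS)
speedup_percent = (THIS_WORK_RATE_GBPS / MIPI_RATE_GBPS - 1) * 100

print(f"UI at 4.5 Gbps: {ui_ps(MIPI_RATE_GBPS):.2f} ps")
print(f"UI at 7.2 Gbps: {ui_ps(THIS_WORK_RATE_GBPS):.2f} ps")
print(f"10% of the MIPI UI: {mipi_jitter_budget_ps:.2f} ps")
print(f"Worst-case jitter: {jitter_in_ui:.2f} UI")
print(f"Data rate increase: {speedup_percent:.0f}%")
```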

Compared with the other works in Table 11, the proposed structure achieves:

- 7.2 Gbps/lane of data rate (20% faster than the 6 Gbps of A-SSCC [25])
- 0.14 UI of jitter (44% smaller than the 0.25 UI of TCE [15])
- 0.86 UI of eye width (14% larger than the 0.75 UI of TCE [15])
- 0.85 mW/Gb/s of power per lane (24% smaller than the 1.12 mW/Gb/s of TCE [15]).
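The percentages in the list above are relative differences with respect to the reference design. A minimal sketch of that calculation is given below, using the values from the A-SSCC and TCE columns of Table 11; the helper name `relative_change` is an assumption introduced for illustration.

```python
def relative_change(value: float, reference: float) -> float:
    """Return (value - reference) / reference; negative means smaller."""
    return (value - reference) / reference


# (metric, this work, reference value taken from Table 11)
COMPARISONS = [
    ("data rate vs A-SSCC (Gbps)", 7.2, 6.0),
    ("jitter vs TCE (UI)", 0.14, 0.25),
    ("eye width vs TCE (UI)", 0.86, 0.75),
    ("power/lane vs TCE (mW/Gb/s)", 0.85, 1.12),
]

for name, this_work, reference in COMPARISONS:
    print(f"{name}: {relative_change(this_work, reference):+.1%}")
```

With these inputs the eye-width ratio evaluates to about 14.7%, which the text quotes as 14%.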

Table 10. The main characteristics of the proposed receiver

| Items                                          | Parameter    | Unit | MIPI D-PHY v2.1 [12] | This work |
|------------------------------------------------|--------------|------|----------------------|-----------|
| Maximum data rate                              |              | Gbps | 4.5                  | 7.2       |
| Unit Interval (UI)                             |              | ps   | 222.22               | 138.88    |
| Minimum receivable input signal                | EyeWidth     | UI   | 0.5                  | 0.67      |
| Minimum receivable input signal                | EyeHeight    | mV   | 80                   | 40        |
| Maximum differential insertion loss of channel | Channel loss | dB   | -3.25                | -3.50     |
|                                                | Frequency    | GHz  | 1.25                 | 1.25      |
|                                                | Channel loss | dB   | -                    | -8.30     |
|                                                | Frequency    | GHz  | 3.60                 | 3.60      |
|                                                | Channel loss | dB   | -11.10               | -10.80    |
|                                                | Frequency    | GHz  | 5.00                 | 5.00      |
| Jitter of Clock Random                         |              | UI   | 0.10                 | 0.14      |


Table 11. Performance summary and comparison of proposed transceiver

| Reference                   | TCE [15] - 11/2019 | ITC [26] - 09/2022 | A-SSCC [25] - 12/2022 | This work       |
|-----------------------------|--------------------|--------------------|-----------------------|-----------------|
| Application                 | MIPI D-PHY v2.0    | MIPI D-PHY         | MIPI C/D-PHY          | MIPI D-PHY v2.1 |
| Technology                  | 110 nm CMOS        | -                  | 110 nm CMOS           | 18 nm FinFET    |
| Supply                      | 1.2 V              | 1.8 V              | 1.5 V                 | 1.8 V and 0.8 V |
| Operation mode              | Receiver           | Receiver           | Receiver              | Receiver        |
| PRBS                        | -                  | -                  | 7                     | 9               |
| Data rate/lane              | 5 Gbps             | 4.5 Gbps           | 6 Gbps                | 7.2 Gbps        |
| Unit interval (UI)          | 200 ps             | 222 ps             | 167 ps                | 139 ps          |
| Eye opening of data pattern | 0.75 UI            | 0.72 UI            | 0.74 UI               | 0.86 UI         |
| Jitter of data pattern      | 0.25 UI            | 0.28 UI            | 0.26 UI               | 0.14 UI         |
| Power/lane (mW/Gb/s)        | 1.12               | -                  | -                     | 0.85            |



Figure 12. Effect of the Offset on CTLE design

#### 4. CONCLUSION

In this paper, a high-performance receiver design capable of achieving 7.2 Gbps per lane for the D-PHY architecture, incorporating a folded cascode CTLE, was introduced. The CTLE attained a peaking frequency of 4.26 GHz and a peaking gain of 17.77 dB, exceeding the 3.6 GHz target by a margin of 0.66 GHz. Furthermore, the design demonstrated a maximum jitter of 19.63 ps under a $\pm 2$ mV offset voltage, while consuming 6.1 mW, which corresponds to 0.85 mW/Gb/s.
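The summary figures above can be cross-checked with a short calculation. The sketch below is illustrative only; the function names are assumptions, and the inputs are the power, data rate, and frequencies reported in this paper.

```python
def power_efficiency_mw_per_gbps(power_mw: float, data_rate_gbps: float) -> float:
    """Return power per unit data rate in mW/Gb/s."""
    return power_mw / data_rate_gbps


def peaking_margin_ghz(peaking_ghz: float, target_ghz: float) -> float:
    """Return how far the CTLE peaking frequency exceeds the target."""
    return peaking_ghz - target_ghz


efficiency = power_efficiency_mw_per_gbps(power_mw=6.1, data_rate_gbps=7.2)
margin = peaking_margin_ghz(peaking_ghz=4.26, target_ghz=3.6)

print(f"Power efficiency: {efficiency:.2f} mW/Gb/s")
print(f"Peaking-frequency margin: {margin:.2f} GHz")
```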

Looking ahead, future work will expand on the current CTLE topology and the overall receiver system for the MIPI D-PHY Interface by exploring the integration of artificial intelligence algorithms. These algorithms will be employed to optimize transistor sizing, aiming to enhance performance further. Upcoming studies will involve a comprehensive comparison and analysis of the performance metrics between the existing designs and those augmented by AI-driven optimizations. This approach is expected to yield significant insights into the scalability and efficiency enhancements possible in high-speed digital receiver design.

#### ACKNOWLEDGEMENTS

We acknowledge Ho Chi Minh City University of Technology (HCMUT), VNU-HCM for supporting this study.

#### REFERENCES

- [1] W. P. Maszara and M.-R. Lin, "FinFETs technology and circuit design challenges," 2013 Proceedings of the ESSCIRC (ESSCIRC), Sep. 2013, doi: 10.1109/esscirc.2013.6649058.
- [2] S. Verma, S. L. Tripathi, and M. Bassi, "Performance analysis of FinFET device using qualitative approach for low-power applications," in *Proceedings of 3rd International Conference on 2019 Devices for Integrated Circuit, DevIC 2019*, Mar. 2019, pp. 84–88, doi: 10.1109/DEVIC.2019.8783754.
- [3] Y. Yang, J. Park, S. C. Song, J. Wang, G. Yeap, and S. O. Jung, "Single-ended 9T SRAM cell for near-threshold voltage operation with enhanced read performance in 22-nm FinFET technology," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 23, no. 11, pp. 2748–2752, Nov. 2015, doi: 10.1109/TVLSI.2014.2367234.
- [4] H. Xie et al., "A novel 1T-DRAM fabricated with 22 nm FD-SOI technology," *IEEE Electron Device Letters*, vol. 45, no. 4, pp. 558–561, Apr. 2024, doi: 10.1109/LED.2024.3368522.
- [5] N. Elsayed, S. Makhsuci, and M. Sanduleanu, "A 28GHz, switched-cascode, class e amplifier in 22nm CMOS FDSOI technology," *IEEE Journal of Microwaves*, vol. 4, no. 2, pp. 246–252, Apr. 2024, doi: 10.1109/JMW.2024.3358627.
- [6] F. E. Rangel-Patiño, J. E. Rayas-Sánchez, A. Viveros-Wacher, E. A. Vega-Ochoa, and N. Hakim, "High-speed links receiver optimization in post-silicon validation exploiting broyden-based input space mapping," in 2018 IEEE MTT-S International Conference on Numerical Electromagnetic and Multiphysics Modeling and Optimization (NEMO), Aug. 2018, pp. 1–3, doi: 10.1109/NEMO.2018.8503099.
- [7] F. E. Rangel-Patino, J. E. Rayas-Sanchez, and N. Hakim, "Transmitter and receiver equalizers optimization methodologies for high-speed links in industrial computer platforms post-silicon validation," 2018 IEEE International Test Conference (ITC), Oct. 2018, doi: 10.1109/test.2018.8624794.
- [8] H. D. Sehgal, Y. Pratap, and S. Kabra, "Designing and reliability analysis of radiation hardened stacked gate junctionless FinFET and CMOS inverter," *IEEE Transactions on Device and Materials Reliability*, vol. 23, no. 2, pp. 249–256, Jun. 2023, doi: 10.1109/tdmr.2023.3255407.
- [9] P. Zheng, D. Connelly, F. Ding, and T.-J. K. Liu, "FinFET evolution toward stacked-nanowire FET for CMOS technology scaling," *IEEE Transactions on Electron Devices*, vol. 62, no. 12, pp. 3945–3950, Dec. 2015, doi: 10.1109/ted.2015.2487367.
- [10] A. Elwailly, J. Saltin, M. J. Gadlage, and H. Y. Wong, "Radiation hardness study of LG = 20 nm FinFET and nanowire SRAM through TCAD simulation," *IEEE Transactions on Electron Devices*, vol. 68, no. 5, pp. 2289–2294, May 2021, doi: 10.1109/ted.2021.3067855.
- [11] M. Son, J. Sung, H. W. Baac, and C. Shin, "Comparative study of novel u-Shaped SOI FinFET against multiple-Fin Bulk/SOI FinFET," *IEEE Access*, vol. 11, pp. 96170–96176, 2023, doi: 10.1109/access.2023.3308592.
- [12] MIPI Alliance, "MIPI Alliance specification for D-PHY," *mipi.org*. https://www.mipi.org/specifications/d-phy (accessed Dec. 06, 2023).
- [13] P.-H. Lee and Y.-C. Jang, "A 6.84 Gbps/lane MIPI C-PHY transceiver bridge chip with level-dependent equalization," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 67, no. 11, pp. 2672–2676, Nov. 2020, doi: 10.1109/tcsii.2019.2962839.
- [14] T. Kim et al., "A 14-Gb/s dual-mode receiver with MIPI D-PHY and C-PHY interfaces for mobile display drivers," *Journal of the Society for Information Display*, vol. 28, no. 6, pp. 535–547, May 2020, doi: 10.1002/jsid.916.
- [15] P.-H. Lee and Y.-C. Jang, "A 20-Gb/s receiver bridge chip with auto-skew calibration for MIPI D-PHY interface," *IEEE Transactions on Consumer Electronics*, vol. 65, no. 4, pp. 484–492, Nov. 2019, doi: 10.1109/tce.2019.2942503.
- [16] P.-H. Lee, H.-Y. Lee, Y.-W. Kim, H.-Y. Hong, and Y.-C. Jang, "A 10-Gbps receiver bridge chip with deserializer for FPGA-based frame grabber supporting MIPI CSI-2," *IEEE Transactions on Consumer Electronics*, vol. 63, no. 3, pp. 209–215, Aug. 2017, doi: 10.1109/tce.2017.014908.
- [17] E.-J. Kim, H.-Y. Park, C. Ahn, S.-I. Lim, and S. Kim, "Unified dual mode physical layer for mobile CMOS image sensor interface," *IEEE Transactions on Consumer Electronics*, vol. 56, no. 3, pp. 1196–1203, Aug. 2010, doi: 10.1109/tce.2010.5606246.
- [18] Y. Choi and Y.-B. Kim, "A 10-Gb/s receiver with a continuous-time linear equalizer and 1-tap decision-feedback equalizer," 2015 IEEE 58th International Midwest Symposium on Circuits and Systems (MWSCAS), Fort Collins, CO, USA, 2015, pp. 1-4, doi: 10.1109/mwscas.2015.7282072.
- [19] S. Wu, Q. Wang, N. Ning, and J. Li, "An inductive peaking technology for high-speed MIPI receiver bandwidth expanding in a 90 nm CMOS process," 2016 IEEE International Nanoelectronics Conference (INEC), Chengdu, China, 2016, pp. 1-2, doi: 10.1109/inec.2016.7589399.
- [20] W. T. Beyene, "The design of continuous-time linear equalizers using model order reduction techniques," in 2008 IEEE-EPEP Electrical Performance of Electronic Packaging, Oct. 2008, pp. 187–190, doi: 10.1109/EPEP.2008.4675910.
- [21] G. T. Manvel, A. A. Arman, H. H. Garik, and H. S. Sergo, "Two stage CTLE for high speed data receiving," 2020 IEEE 40th International Conference on Electronics and Nanotechnology (ELNANO), Kyiv, Ukraine, 2020, pp. 374-377, doi: 10.1109/elnano50318.2020.9088865.
- [22] S. Gondi and B. Razavi, "Equalization and clock and data recovery techniques for 10-Gb/s CMOS serial-link receivers," *IEEE Journal of Solid-State Circuits*, vol. 42, no. 9, pp. 1999–2011, Sep. 2007, doi: 10.1109/jssc.2007.903076.
- [23] B. Razavi, Design of analog CMOS integrated circuits, 2nd Ed. New York: McGraw-Hill Education, 2015.
- [24] W. Kim and M. Lee, "A 92-μW/Gbps self-biased SLVS receiver for MIPI D-PHY applications," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 68, no. 10, pp. 3219–3223, Oct. 2021, doi: 10.1109/TCSII.2021.3074675.
- [25] J. Bae, M. Song, B. Kim, J. Lee, W. Park, and J.-H. Chun, "A 11.4-Gbps/lane MIPI 32-bit C-PHY and D-PHY combo transmitter with 3-tap FFE," 2022 IEEE Asian Solid-State Circuits Conference (A-SSCC), Taipei, Taiwan, 2022, pp. 1-3, doi: 10.1109/asscc56115.2022.9980792.
- [26] S. Lee et al., "4.5 Gsps MIPI D-PHY receiver circuit for automatic test equipment," 2022 IEEE International Test Conference (ITC), Sep. 2022, doi: 10.1109/itc50671.2022.00073.

### **BIOGRAPHIES OF AUTHORS**



**Trang Hoang** was born in Nha Trang city, Vietnam. He received the bachelor of engineering and master of science degrees in electronics-telecommunication engineering from HCMUT in 2002 and 2004, respectively. He received the Ph.D. degree in microelectronics MEMS from CEA-LETI and University Joseph Fourier, France, in 2009. From 2009 to 2010, he did postdoctoral research at Orange Lab-France Telecom. Since 2010, he has been a lecturer at the Faculty of Electricals–Electronics Engineering, HCMUT. His research interests include ASIC-FPGA implementation, IC architecture, micro-fabrication, wireless communication, quantum computing, and optimization in analog IC design. He can be contacted at email: hoangtrang@hcmut.edu.vn.



**Anh Nam Ha** was born in Vung Tau City, Vietnam. He received bachelor's and master's degrees in electronics engineering from Ho Chi Minh City University of Technology (HCMUT)-VNU in 2019 and 2022, respectively. His research interests include the design of high-efficiency charge pump circuits, analog switch circuits, high-precision comparator circuits, ADC/DAC design, and high-speed interfaces. He can be contacted at email: hnanh.sdh19@hcmut.edu.vn.