TFI-FTS: An efficient transient fault injection and fault-tolerant system for asynchronous circuits on FPGA platform

Designing VLSI digital circuits is a challenging task because testing the circuits adds to the design time. The reliability and productivity of digital integrated circuits are primarily affected by defects in the manufacturing process or systems. A large number of defects leads to faults in the system, so fault-tolerant systems are necessary to overcome faults in VLSI digital circuits. In this research article, an effective transient fault injection (TFI) system and fault-tolerant system (FTS) for asynchronous circuits are modelled. The TFI system generates faults using Berlekamp-Massey algorithm (BMA) based linear feedback shift registers (LFSRs) with faulty logic insertion and a one-hot encoded register. The BMA-based LFSR reduces hardware complexity and consumes less on-chip power than the standard LFSR method. The FTS uses triple modular redundancy (TMR) based majority voter logic (MVL) to tolerate faults in asynchronous circuits. Benchmarked 74X-series circuits are used as the asynchronous circuits for the TMR logic. The TFI-FTS module is modeled in Verilog-HDL using Xilinx ISE and synthesized on a hardware platform. The performance parameters are tabulated for the TFI-FTS based asynchronous circuits. The TFI-FTS module achieves 100% fault coverage, which is validated by functional simulation of each asynchronous circuit with fault injection in the TFI-FTS module.


INTRODUCTION
VLSI circuits can become defective during manufacturing because of external resource problems, disturbances, and misplaced components, which lead to faults and errors in the circuits. These errors cause severe concerns because of the downscaling of device features and the lowering of supply voltage [1]. Sensitivity to external radiation increases with certain manufacturing defects, such as abnormally large gate areas, thin wires, and dust particles [2]. Advanced manufacturing processes lead to new kinds of defects, and these defects may have an indirect impact on susceptibility to transient errors [3]. For example, immersion lithography is the technique actually used to obtain better resolution for smaller features, because water has a higher refractive index than air [4,5]. Air bubbles in the water cause defects by refracting the light, which can also increase the feature size. It is therefore difficult to analyze these transient errors, because they are not consistently present [6]. Faults in a VLSI design may not coincide with the times at which test vectors are applied to detect them [7]. Interpreting the circuit's results is also challenging once a transient error is detected [8], because the detection may indicate an anomalous event or may represent a regular pattern. Thus, fault-tolerant circuits are needed in various applications [6]. In this regard, various approaches have been tried to provide a solution or an analytical model, but their applicability and accuracy largely fail in complex VLSI circuits because of improper fault injection techniques and poor time management when injecting faults into the fault-tolerant system (FTS) [1][2][3][4][5][6][7]. A fault injection system exercises the system under test (SUT) to assess its reliability: faults are inserted by a dedicated module, and the behavior of the SUT in response to each fault is observed [9].
Various fault injection techniques have been presented in the recent past and are grouped into hardware, software, emulation, simulation, and hybrid fault injection systems [1][2][3][4][5][6][7][8][9][10]. Fault injection techniques have also long been considered necessary for checking the reliability of a system by analyzing the behavior of the devices when a fault occurs [11,12].
This research article considers the reliability and productivity issues of digital ICs caused by the manufacturing process and aims to overcome transient faults in asynchronous circuits and, in the future, in synchronous circuits. Section 1 discusses the background of previous research on transient fault injection and fault-tolerant systems for digital circuits in different applications, followed by the research gaps. Section 2 describes the transient fault injection and fault-tolerant system (TFI-FTS) for asynchronous circuits with its detailed hardware architecture. Section 3 explains the results of the TFI-FTS for asynchronous circuits by analyzing the hardware constraints as well as the fault injection and fault coverage for benchmarked combinational circuits. Section 4 concludes the overall work with improvements and future work.
This section discusses existing work on fault injection and fault-tolerant systems for digital circuits. Zhang et al. [13] introduced a soft error detection scheme (AUDITOR) for flip-flop based pipelines. This scheme handles soft errors such as single event upsets (SEUs) and single event transients (SETs). The outcomes suggest that AUDITOR is capable of robust detection with short latency, with area and power reductions of 29% and 50%, respectively. Fabrication processes have scaled to the nanometer level, and systems have consequently become more susceptible to soft errors. Hence, Sheikh et al. [14] designed integrated circuits for minimum area overhead and soft error tolerance. The analysis of the outcomes suggests that the system achieves significantly higher reliability than other transistor sizing-based mechanisms. Kwak et al. [15] presented a simple corrective control model for tolerating transition faults in asynchronous sequential circuits. The controller is static, with no memory element, and solves the fault tolerance problem with deterministic operations. The simulation results are encouraging and validate the process. Yang [16] further enhanced this approach to improve the reachability and to recover the input and state transition of the corrective controller within a bounded delay. Somasundaram et al. [17] presented a self-checking module for VLSI circuits using error detection coding (EDC). The EDC technique is scalable, executes the circuits in less time, and supports a low-latency mechanism. EDC is a suitable method for detecting faults and is itself tolerant of faults; it improves the hardware resource constraints and performance overhead and detects all types of multiple unidirectional errors. The self-checking module works on the outputs of the design circuit and the EDC and generates the fault status of the module. The module is a unidirectional, self-checking process that does not affect the performance of the system. These modules are designed in 45 nm ASIC technology and are challenging to implement in hardware with FPGA prototyping.
Alkady et al. [18] designed hardware-based controllers that support network systems for factory automation. The controllers are fault tolerant and are designed using in-loop and node-based architectures. The network uses a star topology with an Ethernet switch. The in-loop architecture contains actuators, a controller, and sensors. Fault tolerance is achieved using Markov modeling in the controller with state machines. The system reliability improves over time, which indicates that faults are recovered within the controller. Anjankar et al. [19] presented a fault-tolerant and recovery technique (FT-RT) for an arithmetic module using the TMR method. An SEU error is introduced as a fault into the adder module, and the TMR recovers from the fault with the correct output. Fault and error detection is achieved for only one bit, and it incurs high latency and a large combinational delay. Moreover, this technique falls short in detecting faults for all the given inputs of the circuit. Macian et al. [20] presented a Bloom filter design that tolerates single-event transients (SETs) in networking applications. The SET-tolerant Bloom filter is designed using pattern-based and parity-based re-execution. The design is evaluated for hash functions with simulation and theoretical results. The area and delay synthesis results show improved system performance.
Qingkun et al. [21] presented the hardware architecture of a heart rate monitoring system that is fault tolerant and robust. The heart rate monitor has a peak detection unit and an estimation and fusion unit. The fault-tolerant system is incorporated in the functional unit of the main hardware, and the overhead in hardware constraints is discussed along with the improvements. Feng et al. [22] addressed transient and permanent faults in an NoC-based router design. The router uses a fault-tolerant, deflection-based routing algorithm to handle permanent faults, while hybrid automatic repeat-request (ARQ) and forward error correction (FEC) are used to handle transient faults. The hardware results are reported for both fault recovery methods. The work in [23] presented a modified error detection mechanism for transient faults. The Hamming codes are modified in both the encoder and the decoder, and the modified codes support different data widths and reside in the NoC switch. The receiver network interface (NI) holds the encoded data and passes it to a router for communication with another router; if a fault occurs in a router during communication, the transmitter NI processes the data through the decoder with the error detection process. Applications of FTS in other fields, such as mobile networks [24] and robotics [25], have also been elaborated. From the review of recent literature and previous designs, it is noticed that most of the work on transient fault injection and fault-tolerant systems for digital circuits is based on software approaches, with only a few hardware-based approaches that use back-end design tools such as Cadence and SPICE. Most of the existing hardware-based approaches rely on conventional methods. These existing algorithms suffer from hardware complexity, performance degradation, and high chip area and power consumption in the fault-tolerant system of digital circuits.
Existing systems lag in productivity and reliability when handling faults, which affects system performance. Handling transient faults is a challenging task in the digital environment. Thus, there is a need for a "cost-effective transient fault injection and tolerant system for asynchronous circuits".

PROPOSED WORK
The proposed TFI-FTS module for asynchronous circuits is explained here with its hardware architecture. Digital circuits are categorized into combinational and sequential logic circuits. In combinational circuits, the output signals depend only on the present input signals, whereas in sequential circuits the output signals depend on both the present and past input signals. Sequential logic circuits are divided into synchronous and asynchronous logic circuits. In general, an asynchronous circuit is a sequential digital circuit that is not governed by a clock signal or clock circuit. These circuits use control signals to manage logical operations and the completion of instructions. Asynchronous circuits are faster, consume less power, and offer greater flexibility in a larger system than synchronous circuits. The proposed design supports both combinational and sequential asynchronous circuits. In this work, the fault-tolerant system design considers asynchronous circuits with combinational logic.
An overview of the TFI-FTS module for asynchronous circuits is shown in Figure 1. In the fault injection system, random number patterns are generated using Berlekamp-Massey algorithm based LFSRs and fed to the fault injection logic. The control unit is set by the user and drives the output of the fault injection logic to either 0 or 1. If the output is 1, the data output is changed according to the one-hot encoded shift register, and the fault is injected into the tolerant system. If the output is 0, no fault is injected into the downstream system. In the design, transient faults are injected into the fault-tolerant system to determine whether the asynchronous circuit is faulty. The FTS module is designed using triple modular redundancy. The asynchronous circuits are taken from standard benchmark circuits and placed in the fault-tolerant system to receive the injected faults. The asynchronous combinational 74X-series circuits from the benchmark set are used, with faults injected at the input data port. The same asynchronous circuit is instantiated three times: one instance receives the faulty input from the injection system, and the other two receive the standard inputs. The three outputs, computed according to the functionality of the circuit, are fed to the majority voter module. The majority voter follows the majority: if two out of the three circuits give the correct output, the final output is correct; otherwise, the system is faulty.

Transient fault injection system
The hardware architecture of the TFI system, designed using BMA-based LFSRs, is shown in Figure 2. It mainly consists of the BMA-based LFSRs, an XOR module, a control unit, and a one-hot encoded register along with a data register. The Berlekamp-Massey algorithm based LFSR is used to reduce chip utilization and power consumption compared with a standard LFSR. A standard LFSR works with long sequences, and testing long sequences consumes more time and chip area. The BMA resolves this problem by finding the shortest LFSR that is capable of generating the same sequence. The BMA-based LFSR generates random sequences in a particular time interval, and these are necessary for the fault injection process. Two 4-bit LFSRs are designed with random sequences according to the BMA and run in parallel. The BMA is used to calculate the linear complexity of a binary sequence of a specified length. By using BMA-based LFSRs, the fault injection system is more flexible and uses less chip area for the same test sequence length. In the BMA, the connection polynomial determines the feedback connections of the LFSR, and the linear complexity of the sequence determines the length of the LFSR. The two random sequences are fed to the faulty logic, where the two 4-bit sequences are XORed, and the control unit generates the control signal, which the user sets to logic '1' or logic '0'. If logic '1' is set, the 4-bit data is processed in the data register using the one-hot shift register. If logic '0' is set, the user data is processed in the data register. The one-hot encoded shift register is built from D flip-flops (D-FFs) that hold a single active bit, with all remaining bits zero. When the control logic is '1', the one-hot encoded output is XORed with the data register output, which flips a data bit and thereby introduces a fault. Logic '1' thus indicates that faults are generated and injected into the tolerant system. With logic '0', no faults are generated and only the user data is passed on. These faults are injected into the TMR-based fault-tolerant system for asynchronous circuits to validate the fault coverage and hardware constraints.
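The role of the BMA described above can be illustrated with a short Python sketch (the design itself is written in Verilog-HDL; Python is used here only for illustration). The sketch generates a bit stream from a 4-bit LFSR and then applies the Berlekamp-Massey algorithm over GF(2) to recover the linear complexity and the connection polynomial. The feedback taps, the seed, and the names lfsr_bits and berlekamp_massey are assumptions chosen for the example; they are not taken from the proposed design.

```python
def lfsr_bits(taps, seed, n):
    """Return n bits of the sequence s[i] = XOR_j taps[j-1] & s[i-j] (j = 1..L).

    taps and seed are illustrative assumptions, not the paper's values.
    """
    s = list(seed)
    L = len(taps)
    while len(s) < n:
        i = len(s)
        bit = 0
        for j in range(1, L + 1):
            bit ^= taps[j - 1] & s[i - j]
        s.append(bit)
    return s[:n]


def berlekamp_massey(s):
    """Berlekamp-Massey over GF(2).

    Returns (L, C): the linear complexity L of the bit list s and the
    connection polynomial coefficients C[0..L] with C[0] = 1.
    """
    n = len(s)
    C = [1] + [0] * n   # current connection polynomial
    B = [1] + [0] * n   # copy of C from the last length change
    L, m = 0, 1         # current LFSR length, shift since last length change
    for i in range(n):
        # Discrepancy between s[i] and the bit predicted by the current LFSR.
        d = s[i]
        for j in range(1, L + 1):
            d ^= C[j] & s[i - j]
        if d == 0:
            m += 1
        elif 2 * L <= i:
            T = C[:]
            for j in range(n + 1 - m):
                C[j + m] ^= B[j]
            L, B, m = i + 1 - L, T, 1
        else:
            for j in range(n + 1 - m):
                C[j + m] ^= B[j]
            m += 1
    return L, C[:L + 1]


# Assumed 4-bit LFSR with recurrence s[i] = s[i-3] ^ s[i-4] and seed 1000.
sequence = lfsr_bits(taps=[0, 0, 1, 1], seed=[1, 0, 0, 0], n=30)
complexity, polynomial = berlekamp_massey(sequence[:15])
print(complexity, polynomial)
```

For this assumed recurrence the sequence repeats every 15 bits, and the algorithm reports a linear complexity of 4 with connection polynomial 1 + x^3 + x^4. In other words, a 4-stage LFSR is the shortest register that reproduces the sequence, which is the property the TFI system relies on to keep the generator small.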

Fault tolerant system
The transient faults are injected into the user inputs of the asynchronous circuits. The benchmarked asynchronous circuits with combinational logic [26] are chosen for the fault-tolerant system, and its hardware architecture is shown in Figure 3. The benchmarked 74X-series circuits are used as the asynchronous circuits in the fault-tolerant system. The triple modular redundancy method serves as the fault-tolerant system: the same circuit is instantiated three times, and the three circuit outputs are connected to the majority voter logic. The majority voter logic requires at least two out of the three circuits to function with the correct output; in that case the system is fault tolerant, and otherwise the system fails to be fault tolerant. The fault is injected into circuit-1 as one of its inputs, along with the other inputs. Circuit-2 and circuit-3 receive the user inputs with no fault injection. The three circuits generate the outputs c1, c2, and c3, and the majority voter logic (MVL) is applied to generate the final fault-tolerant output. The majority voter logic output is c1·c2 + c2·c3 + c1·c3. The benchmark 74X-series circuit information and designs are available in [26] and are used in this research work.
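The TMR arrangement described above can be sketched in a few lines of Python (used only for illustration; the design itself is in Verilog-HDL). A one-hot mask flips one bit of the input to circuit-1, the other two replicas receive the clean input, and a bitwise two-out-of-three vote produces the final output. The function circuit is a hypothetical stand-in (a 4-bit adder with carry out), not one of the 74X-series netlists from [26], and all function names and the test stimulus are assumptions made for this example.

```python
def one_hot_flip(data, position, inject, width=4):
    """XOR the data word with a one-hot mask when inject is 1; pass it through otherwise."""
    mask = (1 << position) if inject else 0
    return (data ^ mask) & ((1 << width) - 1)


def majority(c1, c2, c3):
    """Bitwise two-out-of-three vote: c1.c2 + c2.c3 + c1.c3."""
    return (c1 & c2) | (c2 & c3) | (c1 & c3)


def circuit(a, b):
    """Hypothetical combinational block: 4-bit addition with a carry-out bit."""
    return (a + b) & 0x1F


def tmr_output(a, b, position, inject):
    """Run three replicas; only circuit-1 sees the (possibly) faulty input."""
    c1 = circuit(one_hot_flip(a, position, inject), b)
    c2 = circuit(a, b)
    c3 = circuit(a, b)
    return majority(c1, c2, c3)


def coverage(num_faults=100):
    """Percentage of injected faults for which the voted output is still correct."""
    tolerated = 0
    for k in range(num_faults):
        a, b, position = k % 16, (3 * k) % 16, k % 4  # arbitrary stimulus
        if tmr_output(a, b, position, inject=1) == circuit(a, b):
            tolerated += 1
    return 100.0 * tolerated / num_faults


print(coverage())
```

Because the fault reaches only one replica, the two fault-free replicas always agree and outvote it, so the voted output matches the fault-free output for every injected fault in this sketch.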

RESULTS AND DISCUSSION
The results of the TFI-FTS module for asynchronous circuits are analyzed in this section. The proposed TFI-FTS is modeled in Verilog-HDL and prototyped for combinational asynchronous circuits on an FPGA platform (Artix-7: XC7A100T-3CSG324). The TFI and FTS are integrated into a top module named TFI-FTS, which is synthesized for different 74X-series combinational circuits [26]. After the place and route operation, the design utilization in terms of area (slices, LUTs), timing, and total power is tabulated in Table 1. The TFI-FTS based 74182 circuit uses less chip area, with 26 slice registers, 31 slice LUTs, and 27 bonded input/outputs on the FPGA. All the TFI-FTS based 74X circuits work at a frequency of 606.502 MHz with a minimum period of 1.649 ns on the FPGA. The combinational path delay of the 74182 circuit is 1.467 ns, which is lower than that of the other three circuits. The total power consumption is obtained using the XPower Analyzer tool with an FPGA system clock frequency of 100 MHz. The TFI-FTS based 74182 circuit consumes 0.089 W of total power, which includes 0.007 W of dynamic power. The TFI-FTS for different 74X-series combinational circuits is analyzed with fault injection, and the results are tabulated in Table 2. In the simulation environment, the clock period is set to 10 ns and 100 faults are applied from the injection system to each 74X circuit. The execution time for the whole tolerance process with 100 fault injections is 1000 ns. For each 74X circuit, the FTS tolerates all 100 faults with correct functional outputs. A fault coverage of 100% is thus achieved for each 74X circuit with functional outputs.
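The reported figures can be cross-checked with simple arithmetic, sketched below in Python. The sketch assumes one fault is injected per clock cycle, that fault coverage is the ratio of tolerated to injected faults, and that total power is the sum of static and dynamic power; the variable names are illustrative, and the static power value is derived here rather than reported in the text.

```python
# Values quoted in the text above.
clock_period_ns = 10
faults_injected = 100
faults_tolerated = 100
max_frequency_mhz = 606.502
total_power_w = 0.089
dynamic_power_w = 0.007

# Assumption: one fault is injected per clock cycle.
run_time_ns = faults_injected * clock_period_ns

# Assumption: coverage = tolerated faults / injected faults.
coverage_percent = 100.0 * faults_tolerated / faults_injected

# Minimum period implied by the reported maximum frequency (1 / f).
min_period_ns = 1e3 / max_frequency_mhz

# Assumption: total power = static power + dynamic power.
static_power_w = total_power_w - dynamic_power_w

print(run_time_ns, coverage_percent, round(min_period_ns, 3), round(static_power_w, 3))
```

Under these assumptions the numbers are mutually consistent: 100 faults at a 10 ns clock period give the stated 1000 ns run time, and the reported 606.502 MHz corresponds to a period of about 1.649 ns.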

CONCLUSION AND FUTURE WORK
An efficient transient fault injection (TFI) and fault-tolerant system (FTS) module for asynchronous circuits is designed and prototyped on an Artix-7 FPGA. The TFI module generates faults using the BMA-based LFSRs along with the faulty logic. The fault-tolerant system mainly consists of three asynchronous circuits and TMR-based majority voter logic. The TFI-FTS based asynchronous circuits are synthesized in the Xilinx environment. The resource constraints of the TFI-FTS based 74X-series circuits, namely area utilization, timing summary, and power utilization, are tabulated. The proposed work for the 74X-series circuits consumes an average of 0.09 W on the Artix-7 FPGA. The proposed work is analyzed for the 74X-series circuits with 100 fault injections and achieves 100% fault coverage. In the future, the proposed fault injection system will be applied to benchmarked sequential circuits, and the fault coverage will be analyzed with improvements.