Black Box Model based Self Healing Solution for Stuck at Faults in Digital Circuits

ABSTRACT


INTRODUCTION
The evolution of very deep submicron technology steadily reduces the feature size of integrated circuits and raises the logic-to-pin ratio, which in turn contributes to higher error rates. Reduced supply voltages, and therefore reduced noise margins, together with reduced internal capacitances increase the susceptibility and sensitivity of circuits to radiation, thereby making the system error prone [1], [2]. Permanent, transient and intermittent faults remain the main sources of errors in digital circuits. Permanent faults reflect irreversible physical changes caused by the wear-out of hardware components. Transient faults arise from external environmental conditions such as cosmic rays and electromagnetic interference, whereas intermittent faults arise from unstable or marginal hardware and manufacturing residues [3]-[6].
The philosophy of fault tolerance gains crucial significance because it helps to achieve reliable hardware performance: it makes the system insensitive to faults so that it continues to perform its tasks effectively even in fault prone operating conditions [7], [8]. Fault tolerant techniques include hardware, software, information and time redundancies. Dependability refers to the science of failure and is characterized by the ability of an entity to satisfy one or more vital system functions even under faulty operating conditions [9]. Self-healing systems have the ability to modify their own behaviour in response to changes in their environment that lead to system faults [10]. Fault injection serves as the validation technique for fault tolerant systems [11], [12] and requires the designer to study the system's behaviour in the presence of faults introduced deliberately into the system. The three basic categories of fault injection are hardware implemented fault injection, software implemented fault injection and simulated fault injection. The advantages of simulated fault injection include cost reduction in the design process due to early diagnosis, avoidance of redesign in case of error, and thus a short time-to-market [13], [14]. Fault injection based on the Very High Speed Integrated Circuit Hardware Description Language (VHDL) is widely recognized for its flexibility combined with a high degree of controllability and observability over all the components of the simulated model [12].
ISSN: 2088-8708. IJECE Vol. 7, No. 5, October 2017: 2451-2458
A cost effective, non-intrusive technique similar to duplication with comparison, in which a duplicated function module and a comparator together act as a function checker to detect any erroneous response of the original function module, has been presented in [15]. Based on VHDL descriptions, the implementation of separable codes for concurrent error detection within VLSI ICs has been described in [16]. An approach for generating two level combinational circuits with concurrent error detection capability based on three different techniques has been proposed in [17].
An in-depth review of the literature related to self healing, along with a detailed survey and synthesis based on a purpose-built taxonomy, has been presented in [18]. A self healing architecture based on the human immune system, suitable for tolerating soft errors occurring in VLSI based digital systems, has been proposed in [19]. There, the system treats an error in the digital circuit as an antigen and evolves a distributed defence mechanism to heal itself from the effect of the error.
Three different architectures that use online checkers for error detection, which in turn initiate the reconfiguration of the faulty unit, are presented in [20]. That work also demonstrates the conversion of fault tolerant architectures into partially reconfigurable modules and the significant advantages of partial dynamic reconfiguration in fault tolerant system design. A hybrid fault tolerant architecture has been proposed in [21] to improve the robustness of logic CMOS circuits. The architecture combines different types of redundancy to tolerate transient as well as permanent faults. A new hybrid fault-tolerant architecture to improve the robustness of digital CMOS circuits and systems has been presented in [22]. This architecture also employs information redundancy for error detection, timing redundancy for transient error correction and hardware redundancy for permanent error correction.
Despite all these attempts, there is still a need for a paradigm shift in the design of fault tolerant digital systems. The proposed scheme offers a healing strategy that counteracts stuck at faults in multi level combinational circuits with a view to nullifying their influence on the system. The scheme employs simulated fault injection because of its ability to validate the dependability of systems during the design phase. The scheme distinguishes itself from the current literature in that it introduces a novel black box modelling approach to form the healing mechanism, which tolerates single as well as multiple bit faults and makes the system truly fault tolerant.
The rest of the paper is organized in three sections: design methodology, results and discussion, and conclusion.

DESIGN METHODOLOGY
The primary aim is to evolve a self healing strategy that survives stuck at faults occurring in the combinational logic of any digital circuit. The procedure seeks to create a reliable system that provides fault free output even when faults occur. It uses simulated fault injection to inject faults at the interconnect levels of the system and lays down measures to identify their occurrence in order to formulate the healing sequence. The Modelsim platform is used to verify the functional behaviour of the proposed architecture, and an implementation on a Xilinx FPGA confirms its practical suitability.
The central theme of the scheme is to negate the effect of stuck-at faults present at the interconnect levels of combinational circuits and to bring out the expected behaviour. The procedure strives to ensure reliability in the flow of signals and to deliver the desired performance. The scheme incorporates a built-in healing procedure that eliminates the impact of stuck at faults present at the interconnect levels and produces true values on the primary output lines of the system, thereby making it a self healing system by its very nature.
Internally, the ROM is a combinational circuit that can be implemented with AND gates connected as a decoder and a number of OR gates equal to the number of outputs. The ROM is a two level implementation in sum of minterms form, so that each of its outputs provides a sum of minterms of the "n" inputs. The Boolean functions of the CUT are expressed in sum of minterms form. The chosen CUT is a ROM of size 8X8 with three inputs and eight outputs. Figures 1 and 2 show the logic diagram of the CUT implemented using the 8X8 ROM and the block diagram of the self healing mechanism respectively.
In general, a model represents the behaviour of a physical system. A model can be a black box model, a white box model, or a combination of both called a grey box model. In black box modelling, the system is simply considered as a black box: the tester has no prior knowledge of the internal structure and functions of the system but knows that a particular input should return a certain invariable output. In other words, the tester is aware of what the system is supposed to do but not of how it does it. The proposed scheme uses this fact to build the healing architecture, which assumes the role of the tester and brings out the desired output based on the already established input-output relationship of the system, regardless of the presence of faults at the interconnect levels of the system.
From this perspective, the scheme houses, as an integral part of the system, a healing circuit consisting of eight EX-OR gates, equal to the number of outputs in the first stage of the 8X8 ROM. The desired output and the corresponding actual outcome on each output line of the decoder together form the inputs to each EX-OR gate. The approach monitors the outputs of the EX-OR gates and senses a fault when the output of any EX-OR gate goes high. It then toggles the logic state of the faulty interconnect line and reverts it to its fault free state. The continuous monitoring of signal flow and the ability to take corrective action on the fly make the system a truly self healing one. The algorithm below enumerates the steps involved in harmonizing the fault injection mechanism and the healing part with the rest of the system.
Algorithm
1. Determine the outputs of the system stage by stage for the given set of primary inputs
2. Check the status of the control signal
3. If "control" is not enabled then
4. Get the fault free outputs of the system without fault injection
5. Else
6. Choose any of the interconnect line(s) in the first stage of the system randomly
7. Inject fault on the chosen line(s)
8. Heal the system with the built in self healing facility
9. Obtain the fault free primary outputs of the system
10. End if
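The healing mechanism and the algorithm above can be illustrated with a small behavioural sketch in Python. It is not the authors' VHDL code. The function names (`decode`, `inject`, `heal`, `or_plane`, `run`) and the contents of `ROM_WORDS` are hypothetical; only the word at address 4 follows the example discussed in the results (input "100" giving "11110000"), and the input is assumed to be read MSB first so that "100" selects decoder line 4.

```python
# Behavioural sketch only. ROM contents are hypothetical except address 4,
# which follows the paper's example (input "100" -> output "11110000").
ROM_WORDS = [
    "00000001", "00000011", "00000111", "00001111",
    "11110000", "11100000", "11000000", "10000000",
]

def decode(address, n_lines=8):
    """3-to-8 decoder: line i is high only when address == i."""
    return [1 if i == address else 0 for i in range(n_lines)]

def inject(lines, faults):
    """Force stuck-at values; faults maps line index -> stuck value (0 or 1)."""
    return [faults.get(i, v) for i, v in enumerate(lines)]

def heal(actual, expected):
    """XOR each line with its expected value and toggle the line if the XOR is high."""
    flags = [a ^ e for a, e in zip(actual, expected)]
    healed = [a ^ 1 if f else a for a, f in zip(actual, flags)]
    return healed, flags

def or_plane(lines):
    """Each output bit is the OR of the stored bits of every active word line."""
    out = [0] * 8
    for line, word in zip(lines, ROM_WORDS):
        if line:
            out = [o | int(b) for o, b in zip(out, word)]
    return "".join(map(str, out))

def run(address, control, faults=None):
    """Follow the algorithm: inject and heal only when the control signal is enabled."""
    expected = decode(address)  # black box model: known input-output relation
    lines = expected
    if control:
        lines = inject(lines, faults or {})
        lines, _ = heal(lines, expected)
    return or_plane(lines)

if __name__ == "__main__":
    faults = {1: 1, 4: 0}  # int(1) stuck-at-1, int(4) stuck-at-0
    print("fault free:", run(4, control=False))
    print("unhealed  :", or_plane(inject(decode(4), faults)))
    print("healed    :", run(4, control=True, faults=faults))
```

Under these assumptions the unhealed circuit reads the wrong ROM word, while the healed circuit returns the same output as the fault free one, because every line whose EX-OR flag is high is toggled back to its expected state.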

RESULTS AND DISCUSSION
When the CUT operates in the fault free state, it produces the output "11110000" for the input combination "100" in accordance with its input-output relationship. On the other hand, the appearance of stuck at faults on the intermediate lines causes the circuit to generate a faulty output. Figure 3 depicts one such faulty state of the CUT, in which the interconnect lines int(0) and int(4) are stuck at 1 and stuck at 0 respectively.

Figure 3. The CUT in the presence of stuck at faults
Table 1 displays the different outputs generated by the circuit for the same input combination as a consequence of stuck at faults occurring at the interconnect levels. The Modelsim based simulation results displayed in Figures 4 to 7 show the ability of the proposed healing strategy to keep the CUT in its fault tolerant state. The response seen in Figure 4 shows that as long as the control signal is at logic 0 it inhibits the fault injection campaign and the system operates in an error free environment.
Figure 4. Output of the CUT when the control signal "con" is at logic 0
However, enabling the control signal, as seen in Figures 5 and 6, permits the introduction of stuck-at faults based on the logic values of the 4 bit fault generator "fg": at 500 ns the 2nd interconnect line, int(1), is made stuck-at logic 1, and at 1000 ns the 5th interconnect line, int(4), is made stuck-at logic 0. Despite the introduction of faults, the circuit continues to generate the correct information on its primary output lines thanks to the healing mechanism built into it. Figure 7 shows the return of the system to its fault free state as the control signal reverts to logic 0 at 1500 ns. Table 2 highlights the logic states of the signals associated with the CUT for the sequence of events occurring at different simulation instants. The simulation results show that the proposed self healing architecture remains resilient and meets the design specifications even in the presence of injected faults. The ability of the FPGA to reconfigure itself allows the designer to incorporate fault tolerant features and helps to realize the logical implementation of the specified design. The real time implementation of the proposed self healing scheme using the XC3S500E FPGA on the Xilinx Foundation series ISE 9.2i platform validates the simulated performance of the VHDL code developed for the CUT and endorses its practical suitability.
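The sequence of simulation events described above can be mimicked with a short Python sketch. This is an illustration under stated assumptions, not a reproduction of the Modelsim waveforms: `SCHEDULE`, `state_at` and `lines_at` are hypothetical names, the encoding of the fault generator "fg" is not modelled, and the fault injected at 1000 ns is assumed to be cumulative with the one injected at 500 ns.

```python
def decode(address, n_lines=8):
    """3-to-8 decoder: line i is high only when address == i."""
    return [1 if i == address else 0 for i in range(n_lines)]

# Hypothetical event list: (start time in ns, control signal, stuck-at faults).
SCHEDULE = [
    (0,    0, {}),
    (500,  1, {1: 1}),          # int(1) stuck-at-1
    (1000, 1, {1: 1, 4: 0}),    # int(4) stuck-at-0, assumed cumulative
    (1500, 0, {}),              # control returns to logic 0
]

def state_at(t_ns):
    """Return (control, faults) of the latest event that started at or before t_ns."""
    current = SCHEDULE[0]
    for event in SCHEDULE:
        if event[0] <= t_ns:
            current = event
    return current[1], current[2]

def lines_at(t_ns, address):
    """Return the raw (possibly faulty) and healed decoder lines at time t_ns."""
    expected = decode(address)
    control, faults = state_at(t_ns)
    if control:
        raw = [faults.get(i, v) for i, v in enumerate(expected)]
    else:
        raw = expected
    # EX-OR flag is raw ^ expected; toggling a flagged line restores it.
    healed = [r ^ (r ^ e) for r, e in zip(raw, expected)]
    return raw, healed

if __name__ == "__main__":
    for t in (250, 750, 1250, 1750):
        raw, healed = lines_at(t, address=4)
        print(t, "ns raw:", raw, "healed:", healed)
```

At every sampled instant the healed lines equal the fault free decoder output for input "100", which matches the behaviour reported for the primary outputs in the simulation.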

Performance Analysis
Table 3 compares the fault coverage and the required area overhead of the proposed scheme with those of TMR for the CUT. The fault coverage expresses the ability of the system to tolerate faults at a given point of time, while the overhead indicates the amount of additional chip area needed to implement the design. The percentage overhead is calculated relative to the area of the original CUT. The proposed self healing architecture enjoys a clear edge over TMR because TMR does not yield a cost effective solution. Furthermore, TMR does not guarantee a fault free output: if two or all three of the inputs to the voter turn out to be faulty, it permits the faulty signal to proceed further in the system. The other important issue concerning TMR is the area overhead, which goes as high as 368.42% since it requires two identical copies of the CUT and the voter along with the original CUT. On the other hand, the proposed scheme requires only as many XOR gates and NOT gates as the number of intermediate lines of the CUT, which results in a significantly lower area overhead of 84.21% and provides hundred per cent fault coverage against single as well as multiple bit errors.
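The reported overhead figures can be checked with a short worked calculation in Python. The gate counts used here are assumptions, since the text does not state them: the 8X8 ROM is taken as 3 inverters, 8 AND gates and 8 OR gates, each bitwise majority voter as 3 AND gates and 1 OR gate, and the overhead as additional gates divided by the gates of the original CUT. Under these assumptions the calculation reproduces 84.21% and 368.42%. The `majority` function is likewise an illustrative sketch of why a voter cannot mask two faulty replicas.

```python
# Assumed gate counts (not stated in the paper).
CUT_GATES = 3 + 8 + 8                      # inverters + decoder ANDs + output ORs
HEALING_EXTRA = 8 + 8                      # 8 XOR + 8 NOT, one pair per intermediate line
TMR_EXTRA = 2 * CUT_GATES + 8 * (3 + 1)    # two extra copies + eight 1-bit voters

def overhead_percent(extra_gates, base_gates):
    """Additional gates as a percentage of the original circuit's gates."""
    return round(100.0 * extra_gates / base_gates, 2)

def majority(a, b, c):
    """Bitwise 2-out-of-3 majority voter."""
    return (a & b) | (b & c) | (a & c)

if __name__ == "__main__":
    print("proposed scheme:", overhead_percent(HEALING_EXTRA, CUT_GATES), "%")
    print("TMR            :", overhead_percent(TMR_EXTRA, CUT_GATES), "%")
    # Correct bit is 1; one faulty replica is masked, two are not.
    print("one faulty copy :", majority(1, 1, 0))
    print("two faulty copies:", majority(1, 0, 0))
```

The voter example shows the limitation noted for TMR: when two of the three replicas deliver the wrong bit, the majority output is wrong as well.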

CONCLUSION
A black box model based self healing scheme has been formulated to detect and correct stuck at faults occurring at the interconnect levels of multi level combinational circuits. The merits of simulated fault injection have been used to attain a very high degree of controllability and observability in the process of generating faults. Reliability in the system performance has been ensured through the self healing ability of the proposed methodology. The Modelsim based simulation results obtained for the chosen combinational circuit implemented using a ROM support the simplicity and validity of the proposed approach.
The VHDL code developed for the proposed self healing architecture has been validated on an XC3S500E FPGA using Xilinx Foundation series ISE 9.2i to demonstrate its suitability for use in practice. The comparative analysis with the traditional TMR based healing approach brings out the benefits of the scheme, as it outperforms TMR in terms of both fault coverage and area overhead.

Figure 1. Logic diagram of the CUT implemented using 8X8 ROM

Table 1. Faulty operating condition of the CUT
* int(1) is stuck at
** int(1) and int(4) are s-a-1 & s-a-0 respectively

Table 2. Logic states of signals at different simulation instants

Table 3. Comparison Summary