TL;DR: An asynchronous analog-to-digital converter based on successive approximation is used to provide a high-speed (600-MS/s) and medium-resolution (6-bit) conversion which allows its use in RF subsampling applications.
Abstract: An asynchronous analog-to-digital converter (ADC) based on successive approximation is used to provide a high-speed (600-MS/s) and medium-resolution (6-bit) conversion. A high input bandwidth (>4 GHz) was achieved, which allows its use in RF subsampling applications. By using asynchronous processing techniques, it avoids clocks at higher than the sample rate and speeds up a nonbinary successive approximation algorithm utilizing a series nonbinary capacitive ladder with digital radix calibration. The sample rate of 600 MS/s was achieved by time-interleaving two single ADCs, which were fabricated in a 0.13-μm standard digital CMOS process. The ADC achieves a peak SNDR of 34 dB, while occupying an active area of only 0.12 mm² and consuming 5.3 mW.
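Illustrative sketch: the nonbinary successive approximation mentioned in the abstract above can be modeled behaviorally as a greedy search over weights that shrink by a radix below 2. This is not the authors' circuit. The function name, the radix of 1.8, the step count, and the `force_first_zero` switch are all assumptions chosen for illustration. The sketch shows why a sub-radix-2 search carries redundancy: a wrong early decision can still be absorbed by the remaining, overlapping weights.

```python
def nonbinary_sar(vin, radix=1.8, steps=8, force_first_zero=False):
    """Behavioral model of a sub-radix-2 successive approximation search.

    All parameters are hypothetical. `force_first_zero` injects a wrong
    first comparator decision to show the effect of redundancy.
    """
    # Weights shrink by `radix` each step; with radix < 2 they overlap.
    weights = [radix ** -(k + 1) for k in range(steps)]
    acc = 0.0
    bits = []
    for k, w in enumerate(weights):
        bit = 1 if vin >= acc + w else 0
        if k == 0 and force_first_zero:
            bit = 0  # deliberately wrong decision
        acc += bit * w
        bits.append(bit)
    # The digital estimate is the sum of the selected weights, so the
    # actual weights (the radix) must be known to interpret the bits.
    return bits, acc, weights


bits, estimate, weights = nonbinary_sar(0.6)
print(bits, estimate, "residual bound:", weights[-1])
```

With these assumed values the residual stays below the smallest weight even when the first decision is forced to 0, because the later weights sum to more than the first one.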
TL;DR: A new method for digital true random number generation based on asynchronous logic circuits with feedback based on the so-called Galois and Fibonacci ring oscillators is introduced and a concrete technique using a self-clock-controlled linear feedback shift register is proposed.
Abstract: A new method for digital true random number generation based on asynchronous logic circuits with feedback is introduced. In particular, a concrete technique using the so-called Galois and Fibonacci ring oscillators is developed and analyzed both theoretically and experimentally. The generated random binary sequences may have a very high speed and a higher and more robust entropy rate in comparison with previous proposals for digital random number generators. A new method for digital postprocessing of random data based on irregularly clocked nonautonomous synchronous logic circuits with feedback is also introduced and a concrete technique using a self-clock-controlled linear feedback shift register is proposed. The postprocessing can provide both randomness extraction and computationally secure speed increase of input random data.
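Illustrative sketch: the postprocessing idea in the abstract above (a nonautonomous, irregularly clocked linear feedback shift register) can be approximated in software as follows. This is not the paper's construction. The 16-bit register, the tap positions (the widely used maximal-length polynomial x^16 + x^14 + x^13 + x^11 + 1), the seed, and the clocking rule are assumptions chosen for illustration only.

```python
def lfsr16_step(state, in_bit=0):
    """One step of a 16-bit Fibonacci LFSR; `in_bit` is XORed into the
    feedback, which makes the register nonautonomous."""
    fb = ((state >> 0) ^ (state >> 2) ^ (state >> 3) ^ (state >> 5) ^ in_bit) & 1
    return (state >> 1) | (fb << 15)


def lfsr16_period(seed=0xACE1):
    """Count steps until the autonomous register returns to its seed."""
    state, count = seed, 0
    while True:
        state = lfsr16_step(state)
        count += 1
        if state == seed:
            return count


def postprocess(raw_bits, seed=0xACE1):
    """Mix raw bits into the register. The self-clocking rule here is
    hypothetical: take one extra step whenever the low state bit is 1."""
    state, out = seed, []
    for b in raw_bits:
        state = lfsr16_step(state, b)
        if state & 1:
            state = lfsr16_step(state)
        out.append(state & 1)
    return out


print(lfsr16_period())
print(postprocess([1, 0, 1, 1, 0, 0, 1, 0]))
```

A software model like this shows only the data path. The entropy in the scheme described above comes from the physical ring oscillators, which the sketch does not model.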
TL;DR: In this paper, a configurable integrated circuit (IC) is described, which includes a logic circuit for receiving input data sets and configuration data sets, and a connection circuit for supplying sets of the configuration data to the logic circuit at a particular rate for at least a particular time period.
Abstract: Some embodiments of the invention provide a configurable integrated circuit (IC). The IC includes a logic circuit for receiving input data sets and configuration data sets and performing several functions on the input data sets. Each configuration data set specifies a particular function that the logic circuit has to perform on the input data set. The IC also includes a connection circuit for supplying sets of the configuration data to the logic circuit at a particular rate for at least a particular time period. At least two supplied configuration data sets are different and configure the logic circuit to perform two different functions on the input data.
TL;DR: In this paper, the authors present a dynamic performance adjustment control circuit that adjusts the clock frequency and voltage at which the logic circuit operates to a relatively higher frequency or voltage for tasks required to be performed in a shorter duration of time and a relatively lower frequency/voltage for tasks with longer timing tolerances.
Abstract: A dynamic performance circuit adjustment system and method that flexibly adjusts the performance of a logic circuit. The dynamic performance circuit adjustment system and method facilitates flexible power conservation. In one exemplary implementation, a dynamic performance adjustment control circuit controls performance adjustments to a logic circuit (e.g., a processor) and adjusts support functions for the logic circuit. The logic circuit performs operational functions (e.g., processing) or tasks that have different performance requirements. For example, some tasks performed by the logic circuit are required to be performed in a relatively short duration of time and other tasks performed by the logic circuit have relatively longer time limitations. The dynamic performance adjustment control circuit adjusts the clock frequency and voltage at which the logic circuit operates to a relatively greater frequency and voltage for tasks required to be performed in a shorter duration of time and adjusts the frequency and voltage at which the logic circuit operates to a relatively lower frequency and voltage for tasks with longer timing tolerances. The dynamic performance adjustment system and method includes provisions to manage a transition in performance and support functions in a manner that reduces the risk of spurious signals or “glitches.”
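Illustrative sketch: the frequency and voltage selection described in the abstract above can be pictured as choosing the slowest operating point that still meets a task's deadline. The operating-point table, the cycle-count model, and the function name are assumptions made for illustration. The abstract does not specify a selection policy, and the sketch does not model glitch-free transitions.

```python
# Hypothetical (frequency in Hz, voltage in V) pairs, slowest first.
OPERATING_POINTS = [(200e6, 0.9), (400e6, 1.0), (800e6, 1.2)]


def select_operating_point(cycles, deadline_s, points=OPERATING_POINTS):
    """Return the lowest frequency/voltage pair that finishes `cycles`
    clock cycles within `deadline_s` seconds.

    Falls back to the fastest point if no point meets the deadline.
    """
    for freq, volt in points:
        if cycles / freq <= deadline_s:
            return freq, volt
    return points[-1]


# A task with a loose deadline runs slow and at low voltage;
# the same task with a tight deadline is moved to a faster point.
print(select_operating_point(1_000_000, 0.010))
print(select_operating_point(1_000_000, 0.002))
```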
TL;DR: A behavior diagnosis enables us to propose hardening techniques that improve fault tolerance and resistance and aim at exploiting quasi-delay insensitive (QDI) circuit properties to significantly harden the architecture with a very low area overhead and a reasonable performance penalty.
Abstract: This paper presents hardening techniques against fault attacks and the practical evaluation of their efficiency. The circuit technology investigated to improve the resistance against fault attacks is asynchronous logic. Specific properties of asynchronous circuits make them inherently resistant against a large class of faults. An analysis of their behavior in the presence of faults shows that they are an interesting alternative to design robust systems. A behavior diagnosis enables us to propose hardening techniques that improve fault tolerance and resistance. They are applied at design time and aim at exploiting quasi-delay insensitive (QDI) circuit properties to significantly harden the architecture with a very low area overhead and a reasonable performance penalty. To validate these techniques, a hardened DES crypto-processor is presented. The countermeasures are evaluated using laser beam fault injection.
TL;DR: In this paper, an integrated circuit for pattern detection including an arithmetic logic unit coupled to a comparison circuit, where the arithmetic unit is programmed by an opcode, and a selected pattern of a plurality of patterns selected by a first multiplexer is coupled to the comparison circuit.
Abstract: An integrated circuit for pattern detection including: an arithmetic logic unit coupled to a comparison circuit, where the arithmetic logic unit is programmed by an opcode; a selected pattern of a plurality of patterns selected by a first multiplexer, where the first multiplexer is coupled to the comparison circuit; and a register coupled to the comparison circuit for storing at least a partial comparison between an output of the arithmetic logic unit and the selected pattern.
TL;DR: This paper presents a new method for at-speed structural test of ASICs, having no tight restrictions on the circuit design, and describes a method to test asynchronous clock domains simultaneously.
Abstract: At-speed test of integrated circuits is becoming critical to detect subtle delay defects. Existing structural at-speed test methods are inadequate because they are unable to supply sufficiently-varied functional clock sequences to test complex sequential logic. Moreover, they require tight restrictions on the circuit design. In this paper, we present a new method for at-speed structural test of ASICs, having no tight restrictions on the circuit design. In the present implementation, any complex at-speed functional clock waveform for 16 cycles can be applied. We present DFT structures that can generate high-speed launch-off-capture as well as launch-off-scan clocking without the need to switch a scan enable at-speed. We also describe a method to test asynchronous clock domains simultaneously. Experimental results on fault coverage and hardware measurements for three multi-million gate ASICs demonstrate the feasibility of the proposed approach.
TL;DR: This work has developed a reconfigurable dataflow architecture that exploits some of the unique features of asynchronous logic, and attains a performance that significantly exceeds previous asynchronous FPGAs.
Abstract: Challenges in mapping asynchronous logic to a flexible substrate include developing a balance between circuit-level flexibility, mapping complexity, and logic overhead. We have developed a reconfigurable dataflow architecture that addresses these challenges, and have also created the necessary synthesis flow required to map designs to the architecture. The architecture exploits some of the unique features of asynchronous logic, and attains a performance that significantly exceeds previous asynchronous FPGAs.
TL;DR: In this article, the duty ratio of a delay line circuit is rendered variable by independently selecting the rising edge of the input signal and a propagation path of the falling edge, and the output signal of a preceding stage is sent to a following stage.
Abstract: A delay circuit includes a first delay line circuit having a plurality of stages of delay units, a second delay line circuit having a plurality of stages of delay units, and a plurality of transfer circuits provided in association with respective stages of the delay units of the first delay line circuit, the transfer circuits controlling the transfer of the outputs of the delay units of the first delay line circuit to associated stages of the delay units of the second delay line circuit. The delay units of respective stages of the first delay line circuit invert input signals. Each stage delay unit of the second delay line circuit includes a logic circuit receiving an output signal of the transfer circuit associated with the delay unit in question and an output signal of a preceding stage to send an output signal to a following stage. The duty ratio is rendered variable by independently selecting the rising edge of the input signal and a propagation path of the falling edge.
TL;DR: The paper first shows that the basic pipeline optimization problem for asynchronous circuits is NP-complete, then it presents an efficient branch and bound algorithm that finds the optimal pipeline configuration.
Abstract: This paper addresses the problem of identifying the minimum pipelining needed in an asynchronous circuit (e.g., number/size of pipeline stages/latches required) to satisfy a given performance constraint, thereby implicitly minimizing area and power for a given performance. The paper first shows that the basic pipeline optimization problem for asynchronous circuits is NP-complete. Then, it presents an efficient branch and bound algorithm that finds the optimal pipeline configuration. The experimental results on a few scalable system models demonstrate that this algorithm is computationally feasible for moderately sized models.
TL;DR: The design demonstrates that the STFB template can yield three times higher throughput with approximately half of the area of comparable quasi-delay-insensitive (QDI) templates, requires fewer timing assumptions than ultra-high-speed GasP bundled-data circuits, and can be designed with an automated place and route flow.
Abstract: This paper presents a high-performance asynchronous template, single-track full-buffer (STFB), which achieves close to full-custom performance using a standard cell design flow and industry standard CAD tools to perform schematic capture, simulation, cell layout, and automatic placement and routing. This template and flow are demonstrated and evaluated with the implementation of a 64-bit asynchronous prefix adder, and its test circuitry, using the TSMC 0.25-μm process. The 64-bit asynchronous prefix adder layout requires 0.96 mm² and the entire 260-k transistor test chip reaches a measured throughput of 1.45 GHz. The design demonstrates that the STFB template can yield three times higher throughput with approximately half of the area of comparable quasi-delay-insensitive (QDI) templates, requires fewer timing assumptions than ultra-high-speed GasP bundled-data circuits, and can be designed with an automated place and route flow.
TL;DR: In this article, an integrated circuit and method of reviewing values of one or more signals occurring within that integrated circuit, is provided, and the integrated circuit comprises processing logic for executing a program, and monitoring logic for reviewing values.
Abstract: An integrated circuit, and method of reviewing values of one or more signals occurring within that integrated circuit, are provided. The integrated circuit comprises processing logic for executing a program, and monitoring logic for reviewing values of one or more signals occurring within the integrated circuit as a result of execution of the program. The monitoring logic stores configuration data, which can be software programmed in relation to the signals to be monitored. Further, the monitoring logic makes use of a Bloom filter which, for a value to be reviewed, performs a hash operation on that value in order to reference the configuration data to determine whether that value is either definitely not a value within the range or is potentially a value within the range of values. If the value is determined to be within the set of values, then a trigger signal is generated which can be used to trigger a further monitoring process.
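Illustrative sketch: the Bloom filter lookup described in the abstract above answers "definitely not in the set" or "possibly in the set" for each monitored value. The class name, the filter size, the number of hash functions, and the use of SHA-256 are assumptions for illustration. A hardware monitor would use far cheaper hash functions, and the abstract does not name any.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter; sizes and hashing scheme are hypothetical."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit array packed into one integer

    def _positions(self, value):
        # Derive several bit positions by salting one hash with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        """Program a value of interest into the configuration bits."""
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        """False means definitely absent; True means possibly present."""
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


watch = BloomFilter()
for address in (0x1000, 0x2004, 0xFFFF):
    watch.add(address)

# A True result would raise the trigger for a further, exact check.
print(watch.might_contain(0x2004))
```

Because a positive result can be a false positive, it suits the role described above: a cheap first-stage trigger that starts a further monitoring process.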
TL;DR: In this article, a digital signal processing circuit with a pre-adder circuit coupled to a multiplier circuit and a set of multiplexers is defined, where the set of multiplexers is controlled by an opcode.
Abstract: A digital signal processing circuit having a pre-adder circuit includes: a first register block and a pre-adder circuit coupled to a multiplier circuit and to a set of multiplexers, where the set of multiplexers is controlled by an opcode, and where the pre-adder circuit has a first adder circuit; and an arithmetic logic unit (ALU) having a second adder circuit and coupled to the set of multiplexers.
TL;DR: In this paper, a DLL circuit is described in which the first phase comparison circuit detects a phase difference between the first clock signal and an output signal of the first delay line circuit, and a phase difference between a test clock signal, whose frequency is lower than that of the first clock signal, and the output signal of the first delay line circuit or a signal obtained by dividing that output signal.
Abstract: A DLL circuit includes a first delay line circuit, a first phase comparison circuit, a control circuit, and a first selecting circuit. The first delay line circuit can change a delay amount and provide a delay to a first clock signal. The first phase comparison circuit can detect a phase difference between the first clock signal and an output signal of the first delay line circuit, and a phase difference between a test clock signal, whose frequency is lower than that of the first clock signal, and an output signal of the first delay line circuit or a signal obtained by dividing the output signal. The control circuit controls a delay amount of the first delay line circuit according to the detection result of the first phase comparison circuit. The first selecting circuit selectively inputs one of the output signal of the first delay line circuit or an inverted signal thereof and the first clock signal to the first delay line circuit.
TL;DR: In this article, a delay test circuit was proposed to generate output clock pulses by removing an optional one from three or more continuing clock pulses of an input clock signal, and supplying the output clock pulses to the input side flip-flop and the output side flip-flop.
Abstract: A semiconductor integrated circuit includes an input side flip-flop; a combinational circuit having an input connected with the input side flip-flop; an output side flip-flop connected with an output of the combinational circuit; and a delay test circuit. The delay test circuit generates output clock pulses by removing an optional one from three or more continuing clock pulses of an input clock signal, and supplies the output clock pulses to the input side flip-flop and the output side flip-flop.
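Illustrative sketch: the pulse removal described in the abstract above can be modeled as deleting one selectable pulse from a train of three or more equally spaced pulses. The function name, the ideal-edge timing model, and the default period are assumptions for illustration, not details from the abstract.

```python
def pulse_times_after_removal(num_pulses, removed_index, period=1.0):
    """Return the rising-edge times of the pulses that remain after one
    pulse is removed from `num_pulses` consecutive clock pulses."""
    if num_pulses < 3:
        raise ValueError("the scheme assumes at least three pulses")
    if not 0 <= removed_index < num_pulses:
        raise ValueError("removed_index out of range")
    return [i * period for i in range(num_pulses) if i != removed_index]


# Removing the middle of three pulses leaves two edges two periods apart;
# removing the first or last leaves two edges one period apart.
print(pulse_times_after_removal(3, 1))
print(pulse_times_after_removal(3, 0))
```

In this simplified model, the choice of which pulse to remove sets the spacing between the remaining edges seen by the two flip-flops.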
TL;DR: The new algorithms are the first systematic and general mapping approach for asynchronous threshold networks, targeting delay or area, which preserve the timing-robustness properties of the initial unoptimized circuits.
Abstract: A key challenge in using robust asynchronous circuit styles is the lack of powerful automated optimization techniques. In this paper, optimal technology mapping and cell merger algorithms for robust asynchronous threshold networks are introduced. The technology mapping algorithm is the first to systematically target either delay or area, without destroying the hazard-freedom properties of the initial unoptimized circuits. Both algorithms were implemented and experiments were performed on a near-complete industrial DES circuit provided by Theseus Logic, using a particular asynchronous threshold circuit style called NCL (null convention logic), which had been already optimized in a commercial asynchronous synthesis flow based on constrained use of synchronous CAD tools. The average delay improvements for the three largest subcircuits (with over 400 inputs and outputs each) ranged from 20.0-26.7% for technology mapping and 12.6-16.4% for cell merger. When only the single longest path delay of the largest subcircuits is considered, the worst-case delay improvements ranged from 26.0-26.4% for technology mapping and 24.3-26.4% for cell merger. Though the proposed methods are applied in the NCL design flow, the contribution is general enough to be used for other robust asynchronous threshold circuit styles.
TL;DR: An adaptive predictive clock synchronizer for systems on chip incorporating multiple clock domains is presented, taking advantage of the periodic nature of clocks in order to predict potential conflicts in advance and to conditionally employ an input sampling delay to avoid such conflicts.
Abstract: An adaptive predictive clock synchronizer for systems on chip incorporating multiple clock domains is presented. The synchronizer takes advantage of the periodic nature of clocks in order to predict potential conflicts in advance, and to conditionally employ an input sampling delay to avoid such conflicts. The result is conflict-free synchronization with maximal throughput and minimal latency. The adaptive predictive synchronizer adjusts automatically to a wide range of clock frequencies, regardless of whether the transmitter is faster or slower than the receiver. The synchronizer also avoids sampling duplicate data or missing any input. A novel method is presented for formal treatment of synchronizers and metastability. Correct operation of the synchronizer is formally proven and verified.
TL;DR: Two novel circuits, a clock synchronizer and a reduced swing inverter, are proposed to design dynamic and static level converters for sub-threshold logic.
Abstract: The large supply voltage difference between sub-threshold core logic and I/O makes it extremely challenging to convert signals from the core circuit to the I/O circuit. In this paper, we propose two novel circuits, a clock synchronizer and a reduced swing inverter, to design dynamic and static level converters for sub-threshold logic. Circuit simulations show that our level converters work at frequencies > 500 kHz between 20 °C and 40 °C with a supply voltage of 0.25 V.
TL;DR: In this article, a pulse latch circuit that operates in sync with a pulsed clock signal, including a first operation mode in which shifting test pattern scan data is performed and a second operation mode where shifting the test pattern scans data is not performed, is presented.
Abstract: The disclosed invention is intended to decrease the power consumption of a pulse latch circuit. A pulse latch circuit that operates in sync with a pulsed clock signal, including a first operation mode in which shifting test pattern scan data is performed and a second operation mode in which shifting the test pattern scan data is not performed, comprises the following circuits: a first latch circuit that is able to latch input data in sync with the clock signal; a second latch circuit that is connected to the first latch circuit and is able to latch the test pattern scan data to be shifted in sync with the clock signal; and a control circuit that stops supply of the clock signal to the second latch circuit during the second operation mode. By thus stopping the supply of the clock signal to the second latch circuit, a decrease in power consumption is achieved.
TL;DR: The design and simulation of a digital fuzzy logic controller applicable for nonlinear systems based on a new strategy in which analog advantages such as low die area, high speed, and simplicity are added to the total digital system advantages, with unchanged digital system properties is considered.
Abstract: The design and simulation of a digital fuzzy logic controller applicable for nonlinear systems based on a new strategy in which analog advantages such as low die area, high speed, and simplicity are added to the total digital system advantages, with unchanged digital system properties, is considered in this paper. For implementing this idea, a new programmable fuzzifier circuit has been designed for a 5-bit digital input signal and membership degree as an analog current with 5-bit resolution in the range of 58 μA. A new high-accuracy, simple current-mode circuit for the Max block is also presented. The controller circuit was implemented in an area less than 0.11 mm² in 0.35-μm CMOS technology. A controller of two inputs, nine rules, and one output was simulated systematically with MATLAB, the total controller circuit was simulated with HSPICE, and the layouts were extracted with Magic. The inference speed of the controller is about 8.85 MFLIPS.
TL;DR: This paper is the first attempt at the high-level synthesis of non-zero clock skew circuits and formulates the problem of register binding for clock period minimization to find a minimum-period register binding solution.
Abstract: In modern high-speed circuit design, the clock skew has been widely utilized as a manageable resource to improve the circuit performance. However, in the high-level synthesis stage, the circuit is never optimized for the utilization of clock skew. This paper is the first attempt at the high-level synthesis of non-zero clock skew circuits. First, we show that the register binding in the high-level synthesis stage has a significant impact on the clocking constraints between registers. As a result, different register binding solutions lead to different smallest feasible clock periods. Then, based on that observation, we formulate the problem of register binding for clock period minimization. Given a constraint on the number of registers, our objective is to find a minimum-period register binding solution. Experimental data show that, in most benchmark circuits, the lower bound of the clock period can be achieved without any extra overhead on the number of registers.
TL;DR: In this article, a synchronous semiconductor circuit with two or more clock sources and a power management controller is presented, where the power management controllers are operable to apply power to one of the clock sources, and then select another of the sources for synchronization of the circuit.
Abstract: Various systems and methods for power management are disclosed herein. For example, a synchronous semiconductor circuit is disclosed that includes two or more clock sources and a power management controller. The power management controller is operable to apply power to one of the clock sources and to select another of the clock sources for synchronization of the circuit. Then, upon stabilization of the first clock source, it is selected by the power management controller to synchronize the circuit.
TL;DR: A random number generation circuit has a ring oscillator which has odd number of inverting amplifiers connected in ring shape, a delay control circuit which generates a predetermined clock signal by delaying a reference clock signal, a first sampling circuit which samples an oscillation signal generated by the ring oscillator with the predetermined clock signal, a logical equalization circuit which equalizes occurrence frequency of a sampling signal sampled by the first sampling circuit, a linear feedback shift register (LFSR), and a serial-parallel converter which generates random parallel data used for controlling a delay amount of the delay
Abstract: A random number generation circuit has a ring oscillator which has odd number of inverting amplifiers connected in ring shape, a delay control circuit which generates a predetermined clock signal by delaying a reference clock signal, a first sampling circuit which samples an oscillation signal generated by the ring oscillator with the predetermined clock signal, a first logical equalization circuit which equalizes occurrence frequency of “0” and “1” of a sampling signal sampled by the first sampling circuit, a linear feedback shift register (LFSR) which generates random serial data based on an output signal of the first logical equalization circuit, and a serial-parallel converter which generates random parallel data used for controlling a delay amount of the delay control circuit by converting the random serial data from serial to parallel.
TL;DR: A GHz-class dynamic charge-recovery logic is implemented with an on-chip clock generator and integrated inductor in a 0.13-μm CMOS process and recovers 60% of total circuit energy every cycle.
Abstract: A GHz-class dynamic charge-recovery logic is implemented with an on-chip clock generator and integrated inductor in a 0.13-μm CMOS process. The chip operation is verified at clock frequencies up to 1.3 GHz. At its natural frequency, the design recovers 60% of total circuit energy every cycle.
TL;DR: In this paper, a circuit design system, methodology, and software are disclosed for generating a circuit capable of consuming less dynamic power by using pulsed latches driven by pulse generators in place of edge-triggered flip-flops.
Abstract: A circuit design system, methodology, and software are disclosed for generating a circuit capable of consuming less dynamic power. In particular, the circuit design methodology entails modifying an initial circuit design including a clock network coupled to a plurality of edge-triggered flip-flops to generate a modified circuit design that uses pulsed latches driven by pulse generators in place of at least some of the flip-flops. Since pulsed latches use less dynamic power than edge-triggered flip-flops, the modified circuit may consume less dynamic power. The circuit design methodology may further entail adding delay cells for balancing the clock network to compensate for timing effects caused by the insertion of pulse generators. Additionally, the methodology may further include cloning of forbidden clock paths to make more flip-flops eligible for pulsed latch replacement.
TL;DR: A novel asynchronous fine-grain pipeline synthesis methodology that allows synthesis of asynchronous quasi delay insensitive circuits from standard high-level hardware description language (HDL) specifications.
Abstract: Balanced dynamic dual-rail gates and asynchronous circuits have been shown, if implemented correctly, to have natural and efficient resistance to side-channel attacks. Despite their benefits for security applications they have not been adapted to current mainstream designs due to the lack of electronic design automation support and their non-standard or proprietary design methodologies. We present a novel asynchronous fine-grain pipeline synthesis methodology that addresses these limitations. It allows synthesis of asynchronous quasi delay insensitive circuits from standard high-level hardware description language (HDL) specifications. We briefly present a proof of concept differential dynamic power balanced micropipeline library cells that are approximately 6 times more balanced than the best (differential dynamic) cells designed using previous balancing methods. An implementation of the Advanced Encryption Standard based on these balanced cells and synthesized using our tool flow shows a 6.6 times throughput improvement over the synchronous automatically pipelined implementation using the same TSMC 0.18μm technology synthesized from the same HDL specification.
TL;DR: The authors present a novel circuit implementation of the advanced encryption standard using self-timed dual-rail technology that reduces leakage of internal information through balanced power consumption, which is achieved by avoidance of glitches and by data-independent switching behaviour.
Abstract: The authors present a novel circuit implementation of the advanced encryption standard using self-timed dual-rail technology. The design reduces leakage of internal information through balanced power consumption, which is achieved by avoidance of glitches and by data-independent switching behaviour. The design utilises a pipeline structure with built-in controllers and novel, highly balanced security latches.
TL;DR: The results show that the circuit can correct the duty-cycle of an 8-GHz clock with ±0.8% accuracy for an input range of 25% to 75%.
Abstract: We present a circuit to control duty-cycle of high-frequency clocks with very fine resolution. The proposed duty-cycle detection and correction circuits are digital and do not require external references and matching devices. The circuits are designed to compensate for duty-cycle uncertainties in a floating point unit implemented using limited switch dynamic logic (LSDL) (Belloumini, 2005). The results show that the circuit can correct the duty-cycle of an 8-GHz clock with ±0.8% accuracy for an input range of 25% to 75%.
TL;DR: In this paper, a test circuit consisting of a delay circuit 11 with controllable delay, a phase comparator circuit 12 for comparing the phases between the clock signal S 0 and a delay signal S 1 delayed from the clock signal S 0 by the delay circuit, a meas counter 13 for counting the number of outputs of the prescribed comparison result from the phase comparator circuit 12, a signal switching circuit 14 for switching an input signal to the delay circuit from a clock signal S 0 to a signal satisfying an oscillation condition where the delay signal is received from the delay circuit 11
Abstract: A test circuit comprises a delay circuit 11 with controllable delay, a phase comparator circuit 12 for comparing the phases between the clock signal S 0 and a delay clock signal S 1 delayed from the clock signal S 0 by the delay circuit 11, a meas counter 13 for counting the number of outputs of the prescribed comparison result from the phase comparator circuit 12, a signal switching circuit 14 for switching an input signal to the delay circuit 11 from the clock signal S 0 to a delay signal satisfying an oscillation condition where the delay signal is received from the delay circuit 11 and developing a ring oscillator, and a frequency measuring circuit 15 for measuring an oscillation frequency when the ring oscillator is developed. The delay circuit 11 includes a variable delay circuit 17 with variable delay units connected so that the delay in each variable delay unit is controlled independently.
TL;DR: In this article, a method and system for debugging using replicated logic and trigger logic is described, where a representation of a circuit is compiled and a portion of the circuit is selected for replication.
Abstract: A method and system for debugging using replicated logic and trigger logic is described. A representation of a circuit is compiled. One or more signals are selected for triggering and trigger logic is inserted into the circuit. A portion of the circuit is selected for replication. The selected portion of the circuit is replicated and delay logic is inserted to delay the inputs into the replicated portion of the circuit. The representation of the circuit is recompiled and programmed into a hardware device. A debugger may then be invoked. One or more of the triggering signals are selected. For each selected triggering signal, one or more states are selected to setup a trigger condition. The hardware device may then be run. The replicated portion of the circuit will be paused when the trigger condition occurs. The states of registers in the replicated portion of the circuit and the sequence of steps that led to the trigger condition may then be recorded.