TL;DR: This entry is the table of contents of a textbook on digital integrated circuits, organized in three parts: the fabrics (manufacturing process, devices, and wires), a circuit perspective, and a system perspective.
Abstract: (NOTE: Each chapter begins with an Introduction and concludes with a Summary, To Probe Further, and Exercises and Design Problems.) I. THE FABRICS. 1. Introduction. A Historical Perspective. Issues in Digital Integrated Circuit Design. Quality Metrics of a Digital Design. 2. The Manufacturing Process. The CMOS Manufacturing Process. Design Rules: The Contract between Designer and Process Engineer. Packaging Integrated Circuits. Perspective: Trends in Process Technology. 3. The Devices. The Diode. The MOS(FET) Transistor. A Word on Process Variations. Perspective: Technology Scaling. 4. The Wire. A First Glance. Interconnect Parameters: Capacitance, Resistance, and Inductance. Electrical Wire Models. SPICE Wire Models. Perspective: A Look into the Future. II. A CIRCUIT PERSPECTIVE. 5. The CMOS Inverter. The Static CMOS Inverter: An Intuitive Perspective. Evaluating the Robustness of the CMOS Inverter: The Static Behavior. Performance of CMOS Inverter: The Dynamic Behavior. Power, Energy, and Energy-Delay. Perspective: Technology Scaling and Its Impact on the Inverter Metrics. 6. Designing Combinational Logic Gates in CMOS. Static CMOS Design. Dynamic CMOS Design. How to Choose a Logic Style? Perspective: Gate Design in the Ultra Deep-Submicron Era. 7. Designing Sequential Logic Circuits. Timing Metrics for Sequential Circuits. Classification of Memory Elements. Static Latches and Registers. Dynamic Latches and Registers. Pulse Registers. Sense-Amplifier Based Registers. Pipelining: An Approach to Optimize Sequential Circuits. Non-Bistable Sequential Circuits. Perspective: Choosing a Clocking Strategy. III. A SYSTEM PERSPECTIVE. 8. Implementation Strategies for Digital ICs. From Custom to Semicustom and Structured-Array Design Approaches. Custom Circuit Design. Cell-Based Design Methodology. Array-Based Implementation Approaches. Perspective: The Implementation Platform of the Future. 9. Coping with Interconnect. Capacitive Parasitics. Resistive Parasitics. Inductive Parasitics. Advanced Interconnect Techniques. Perspective: Networks-on-a-Chip. 10. Timing Issues in Digital Circuits. Timing Classification of Digital Systems. Synchronous Design: An In-Depth Perspective. Self-Timed Circuit Design. Synchronizers and Arbiters. Clock Synthesis and Synchronization Using a Phase-Locked Loop. Future Directions and Perspectives. 11. Designing Arithmetic Building Blocks. Datapaths in Digital Processor Architectures. The Adder. The Multiplier. The Shifter. Other Arithmetic Operators. Power and Speed Trade-Offs in Datapath Structures. Perspective: Design as a Trade-off. 12. Designing Memory and Array Structures. The Memory Core. Memory Peripheral Circuitry. Memory Reliability and Yield. Power Dissipation in Memories. Case Studies in Memory Design. Perspective: Semiconductor Memory Trends and Evolutions. Problem Solutions. Index.
TL;DR: In this article, a reconfigurable image processing system with a toroidal topology, distributed memory, and wide bandwidth I/O is described, which is capable of solving real applications at real-time speeds.
Abstract: A powerful, scalable, and reconfigurable image processing system and a method of processing data therein are described. This general-purpose, reconfigurable engine with toroidal topology, distributed memory, and wide-bandwidth I/O is capable of solving real applications at real-time speeds. The reconfigurable image processing system can be optimized to efficiently perform specialized computations, such as real-time video and audio processing. This reconfigurable image processing system provides high performance via high computational density, high memory bandwidth, and high I/O bandwidth. Generally, the reconfigurable image processing system and its control structure include a homogeneous array of 16 field programmable gate arrays (FPGA) and 16 static random access memories (SRAM) arranged in a partial torus configuration. The reconfigurable image processing system also includes a PCI bus interface chip, a clock control chip, and a datapath chip. It can be implemented on a single board. It receives data from its external environment, computes correspondence, and uses the results of the correspondence computations for various post-processing industrial applications. The reconfigurable image processing system determines correspondence by using non-parametric local transforms followed by correlation. These non-parametric local transforms include the census and rank transforms. Other embodiments involve a combination of correspondence, rectification, a left-right consistency check, and the application of an interest operator.
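As a rough illustration of the non-parametric local transforms named above, the following Python sketch computes a census transform, derives a rank transform from it, and compares census codes with a Hamming distance as a stand-in for the correlation step. The function names, the 3x3 window, the border clamping, and the "darker than the centre" comparison are assumptions made for illustration; the abstract does not specify them, and this is not the FPGA implementation it describes.

```python
def census_transform(image, radius=1):
    """Encode, per pixel, which window neighbours are darker than the centre.

    `image` is a list of rows of intensities. Borders are handled by clamping
    coordinates (an assumption; other border policies are possible).
    """
    height, width = len(image), len(image[0])
    codes = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            bits = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    if dy == 0 and dx == 0:
                        continue  # the centre is not compared with itself
                    ny = min(max(y + dy, 0), height - 1)
                    nx = min(max(x + dx, 0), width - 1)
                    darker = 1 if image[ny][nx] < image[y][x] else 0
                    bits = (bits << 1) | darker
            codes[y][x] = bits
    return codes


def rank_transform(image, radius=1):
    """Count, per pixel, how many window neighbours are darker than the centre."""
    return [[bin(code).count("1") for code in row]
            for row in census_transform(image, radius)]


def hamming(a, b):
    """Number of differing bits between two census codes."""
    return bin(a ^ b).count("1")
```

Because both transforms depend only on the ordering of intensities, applying a gain and offset to the image (for example, mapping every value `v` to `2 * v + 10`) leaves the census codes unchanged, which is one reason such transforms are attractive for matching images taken by different cameras.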
TL;DR: The results indicate that more than an order of magnitude reduction in power can be achieved over current-day design methodologies while maintaining the system throughput; in some cases this can be accomplished while preserving or reducing the implementation area.
Abstract: The increasing demand for portable computing has elevated power consumption to be one of the most critical design parameters. A high-level synthesis system, HYPER-LP, is presented for minimizing power consumption in application-specific, datapath-intensive CMOS circuits using a variety of architectural and computational transformations. The synthesis environment consists of high-level estimation of power consumption, a library of transformation primitives, and heuristic/probabilistic optimization search mechanisms for fast and efficient scanning of the design space. Examples with varying degrees of computational complexity and structures are optimized and synthesized using the HYPER-LP system. The results indicate that more than an order of magnitude reduction in power can be achieved over current-day design methodologies while maintaining the system throughput; in some cases this can be accomplished while preserving or reducing the implementation area.
TL;DR: The concept of stochastic logic is applied to a reconfigurable architecture that implements processing operations on a datapath and it is found to be much more tolerant of soft errors than conventional hardware implementations.
Abstract: Mounting concerns over variability, defects, and noise motivate a new approach for digital circuitry: stochastic logic, that is to say, logic that operates on probabilistic signals and so can cope with errors and uncertainty. Techniques for probabilistic analysis of circuits and systems are well established. We advocate a strategy for synthesis. In prior work, we described a methodology for synthesizing stochastic logic, that is to say logic that operates on probabilistic bit streams. In this paper, we apply the concept of stochastic logic to a reconfigurable architecture that implements processing operations on a datapath. We analyze cost as well as the sources of error: approximation, quantization, and random fluctuations. We study the effectiveness of the architecture on a collection of benchmarks for image processing. The stochastic architecture requires less area than conventional hardware implementations. Moreover, it is much more tolerant of soft errors (bit flips) than these deterministic implementations. This fault tolerance scales gracefully to very large numbers of errors.
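To make the idea of logic operating on probabilistic bit streams concrete, here is a minimal Python sketch of one well-known stochastic-computing operation: multiplying two values in [0, 1] by ANDing two independent random bit streams. All function names, the stream length, and the seeds are illustrative assumptions; this is not the reconfigurable architecture or the synthesis method of the paper.

```python
import random


def to_stream(p, n, rng):
    """Encode probability p as n random bits, each 1 with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(n)]


def from_stream(bits):
    """Decode a bit stream back to a value: the fraction of ones."""
    return sum(bits) / len(bits)


def stochastic_multiply(a, b, n=100_000, seed=0):
    """Estimate a * b by ANDing two independent bit streams."""
    rng = random.Random(seed)
    stream_a = to_stream(a, n, rng)
    stream_b = to_stream(b, n, rng)
    return from_stream([x & y for x, y in zip(stream_a, stream_b)])


def flip_bits(bits, rate, rng):
    """Model soft errors: flip each bit independently with probability `rate`."""
    return [b ^ 1 if rng.random() < rate else b for b in bits]
```

In this encoding every bit carries the same weight, so a single flipped bit changes the decoded value by only 1/n. In a conventional binary word, by contrast, a flip in the most significant bit changes the value by half the full range, which gives some intuition for the soft-error tolerance reported above. The estimate itself is approximate, with random fluctuation that shrinks as the stream gets longer.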
TL;DR: A novel strategy for generating accurate black-box models of datapath power consumption at the architecture level by recognizing that power consumption in digital circuits is affected by activity, as well as physical capacitance.
Abstract: This paper describes a novel strategy for generating accurate black-box models of datapath power consumption at the architecture level. This is achieved by recognizing that power consumption in digital circuits is affected by activity, as well as physical capacitance. Since existing strategies characterize modules for purely random inputs, they fail to account for the effect of signal statistics on switching activity. The dual bit type (DBT) model, however, accounts not only for the random activity of the least significant bits (LSB's), but also for the correlated activity of the most significant bits (MSB's), which contain two's-complement sign information. The resulting model is parameterizable in terms of complexity factors such as word length and can be applied to a wide variety of modules ranging from adders, shifters, and multipliers to register files and memories. Since the model operates at the register transfer level (RTL), it is orders of magnitude faster than gate- or circuit-level tools, but while other architecture-level techniques often err by 50-100% or more, the DBT method offers error rates on the order of 10-15%.
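The distinction between random LSB activity and correlated MSB activity can be observed with a small experiment. The Python sketch below measures the per-bit transition probability of a slowly varying signal stored as 16-bit two's-complement words. The first-order autoregressive signal model, its parameters, and the function names are assumptions chosen for illustration; the code measures activity only and does not reproduce the DBT power model.

```python
import random


def bit_activity(samples, word_length=16):
    """Per-bit transition probability for a stream of two's-complement words.

    Index 0 is the LSB; index word_length - 1 is the sign bit.
    """
    mask = (1 << word_length) - 1
    toggles = [0] * word_length
    for prev, curr in zip(samples, samples[1:]):
        changed = (prev ^ curr) & mask  # bits that differ between consecutive words
        for bit in range(word_length):
            toggles[bit] += (changed >> bit) & 1
    transitions = len(samples) - 1
    return [count / transitions for count in toggles]


def correlated_signal(n, rho=0.99, sigma=200.0, seed=0):
    """Hypothetical test input: a first-order autoregressive process, rounded to integers."""
    rng = random.Random(seed)
    x = 0.0
    samples = []
    for _ in range(n):
        x = rho * x + rng.gauss(0.0, sigma)
        samples.append(int(round(x)))
    return samples
```

Calling `bit_activity(correlated_signal(20000))` should show the low-order bits toggling on roughly half of all transitions, as purely random data would, while the sign-extension bits toggle far less often because they change only when the signal crosses zero. A model characterized with purely random inputs would miss that difference.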