Load Value Approximation
Joshua San Miguel,Mario Badr,Natalie Enright Jerger +2 more
- 13 Dec 2014
- pp 127-139
TL;DR: This work explores load value approximation, a microarchitectural technique that learns value patterns and generates approximations for the data, and observes up to 28.6% speedup and 44.1% energy savings on a range of PARSEC workloads while maintaining low output error.
Abstract: Approximate computing explores opportunities that emerge when applications can tolerate error or inexactness. These applications, which range from multimedia processing to machine learning, operate on inherently noisy and imprecise data. We can trade off some loss in output value integrity for improved processor performance and energy efficiency. As memory accesses consume substantial latency and energy, we explore load value approximation, a microarchitectural technique to learn value patterns and generate approximations for the data. The processor uses these approximate data values to continue executing without incurring the high cost of accessing memory, removing load instructions from the critical path. Load value approximation can also inhibit approximated loads from accessing memory, resulting in energy savings. On a range of PARSEC workloads, we observe up to 28.6% speedup (8.5% on average) and 44.1% energy savings (12.6% on average), while maintaining low output error. By exploiting the approximate nature of applications, we draw closer to the ideal latency and energy of accessing memory.
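The mechanism the abstract describes can be illustrated with a small software model. The sketch below is not the authors' hardware design: the class name, the table indexing by PC modulo table size, the averaging of recent values, the confidence counter, and the `degree` and `tolerance` parameters are all assumptions chosen for illustration. It shows the two effects named in the abstract: a confident entry supplies an approximate value so the load need not wait for memory, and some misses skip the memory fetch entirely.

```python
from collections import deque


class LoadValueApproximator:
    """Toy model of a load value approximator (illustrative assumptions only)."""

    def __init__(self, table_size=64, history_len=4, degree=0,
                 tolerance=0.10, threshold=1, max_conf=3):
        self.table_size = table_size
        self.history_len = history_len
        self.degree = degree          # extra misses served per real fetch
        self.tolerance = tolerance    # relative error counted as "close enough"
        self.threshold = threshold    # minimum confidence needed to approximate
        self.max_conf = max_conf      # saturation limit of the confidence counter
        self.table = {}               # table index -> entry

    def _entry(self, pc):
        index = pc % self.table_size  # hypothetical index function
        return self.table.setdefault(index, {
            "history": deque(maxlen=self.history_len),
            "confidence": 0,
            "skips_left": 0,
        })

    def on_miss(self, pc):
        """Return (approx_value, must_fetch) for an approximable load that missed.

        approx_value is None when the entry is not yet confident; the load
        then waits for memory exactly as in precise execution.
        """
        e = self._entry(pc)
        if not e["history"] or e["confidence"] < self.threshold:
            return None, True
        approx = sum(e["history"]) / len(e["history"])
        if e["skips_left"] > 0:
            e["skips_left"] -= 1
            return approx, False      # memory access suppressed (saves energy)
        e["skips_left"] = self.degree
        return approx, True           # fetch still issued, but only to train

    def train(self, pc, actual):
        """Update the entry once the real value arrives from memory."""
        e = self._entry(pc)
        if e["history"]:
            approx = sum(e["history"]) / len(e["history"])
            close = abs(approx - actual) <= self.tolerance * abs(actual)
            if close:
                e["confidence"] = min(e["confidence"] + 1, self.max_conf)
            else:
                e["confidence"] = max(e["confidence"] - 1, 0)
        e["history"].append(actual)


lva = LoadValueApproximator(degree=1)
pc = 0x400
first = lva.on_miss(pc)     # untrained: no approximation, must fetch
lva.train(pc, 10.0)
lva.train(pc, 10.5)
second = lva.on_miss(pc)    # confident: approximate, fetch only to train
lva.train(pc, 11.0)
third = lva.on_miss(pc)     # within the approximation degree: no fetch
```

In this model, a larger `degree` suppresses more memory accesses per fetch at the cost of training the table less often, which is the trade-off a design of this kind has to tune.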
Figures

Figure 1: Output of bodytrack. 
Figure 2: Load value approximation overview. 
Table I: Precise L1 MPKI and variation in dynamic instruction count when employing load value approximation. 
Figure 11: L1 miss EDP of LVA for different approximation degrees. 
Figure 12: Number of static (distinct) PC values for approximate loads. 
Figure 13: MPKI (normalized to precise execution) for varying floating-point (single) precision loss in fluidanimate.
Citations
A Survey of Techniques for Approximate Computing
TL;DR: A survey of techniques for approximate computing (AC), which discusses strategies for finding approximable program portions and monitoring output quality, techniques for using AC in different processing units, processor components, memory technologies, and so forth, as well as programming frameworks for AC.
Aergia: Exploiting Packet Latency Slack in On-Chip Networks
Reetuparna Das,Onur Mutlu,Thomas Moscibroda,Chita R. Das +3 more
- 01 Feb 2011
TL;DR: Aergia introduces new router prioritization policies that exploit interfering packets' available slack to improve overall system performance and fairness, and defines slack as a key measure for characterizing a packet's relative importance.
Approximate Communication: Techniques for Reducing Communication Bottlenecks in Large-Scale Parallel Systems
TL;DR: Compression and approximate value prediction show great promise for reducing the communication bottleneck in bandwidth-constrained applications, while relaxed synchronization is found to provide large speedups for select error-tolerant applications, but suffers from limited general applicability and unreliable output degradation guarantees.
Doppelgänger: a cache for approximate computing
Joshua San Miguel,Jorge Albericio,Andreas Moshovos,Natalie Enright Jerger +3 more
- 05 Dec 2015
TL;DR: The Doppelgänger cache associates the tags of multiple similar blocks with a single data array entry to reduce the amount of data stored, and achieves reductions in LLC area, dynamic energy and leakage energy without harming performance or incurring high application error.
References
Pin: building customized program analysis tools with dynamic instrumentation
Chi-Keung Luk,Robert Cohn,Robert Muth,Harish Patil,Artur Klauser,Geoff Lowney,Steven Wallace,Vijay Janapa Reddi,Kim Hazelwood +8 more
- 12 Jun 2005
TL;DR: Pin's goals are to provide easy-to-use, portable, transparent, and efficient instrumentation. To illustrate Pin's versatility, two Pintools in daily use to analyze production software are described.
Fact and Fantasy in the Use of Options
TL;DR: The authors present a survey of options and their use in financial markets.
Benchmarking modern multiprocessors
Kai Li,Christian Bienia +1 more
- 01 Jan 2011
TL;DR: A methodology to design effective benchmark suites is developed and its effectiveness is demonstrated by developing and deploying a benchmark suite for evaluating multiprocessors called PARSEC, which has been adopted by many architecture groups in both research and industry.
Minimizing power consumption in digital CMOS circuits
Anantha P. Chandrakasan,Robert W. Brodersen +1 more
- 01 Apr 1995
TL;DR: An approach is presented for minimizing power consumption for digital systems implemented in CMOS which involves optimization at all levels of the design and has been applied to the design of a chipset for a portable multimedia terminal that supports pen input, speech I/O and full-motion video.
A detailed and flexible cycle-accurate Network-on-Chip simulator
Nan Jiang,Daniel U. Becker,George Michelogiannakis,James Balfour,Brian Towles,David E. Shaw,John Kim,William J. Dally +7 more
- 21 Apr 2013
TL;DR: The simulator, BookSim, is designed for simulation flexibility and accurate modeling of network components and offers a large set of configurable network parameters in terms of topology, routing algorithm, flow control, and router microarchitecture, including buffer management and allocation schemes.