Open Access
Low-Cost Embedded Program Loop Caching - Revisited
Lea Hwang Lee,Bill Moyer,John H. Arends +2 more
- 01 Jan 1999
TL;DR: The modified loop caching scheme proposed in this paper is capable of capturing only part of the program loop without having any cache conflict problem, and can reduce instruction fetch energy more than other loop cache schemes previously proposed.
Abstract: Many portable and embedded applications spend a large fraction of their execution time in small program loops. In these applications, instruction fetch energy can be reduced by using a small instruction cache when executing these tight loops. Recent work has shown that it is possible to use a small instruction cache without incurring any performance penalty [4, 6]. In this paper, we extend the work done in [6]. In the modified loop caching scheme proposed here, when a program loop is larger than the loop cache, the loop cache can capture part of the loop without any cache conflict problem. For a given loop cache size, our scheme can reduce instruction fetch energy more than previously proposed loop cache schemes. We present quantitative results on how much power can be saved on an integrated embedded design using this scheme.
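The effect of capturing only part of a loop can be illustrated with a toy fetch-count model. This is a sketch under assumptions that are not taken from the abstract: the function name `count_fetches`, the `partial` flag, and the simplification that the first two iterations (loop detection, then a fill pass) are always fetched from main instruction memory are all hypothetical. It compares a loop cache that only serves loops fitting entirely with one that serves the first `cache_size` instructions of a larger loop.

```python
def count_fetches(loop_len, iterations, cache_size, partial=True):
    """Toy model: return (main_fetches, loop_cache_fetches) for one loop.

    Assumptions (illustrative only, not from the paper):
    - iteration 0 detects the loop, iteration 1 fills the loop cache,
      so both are fetched entirely from main instruction memory;
    - with partial=False, a loop larger than the cache is never served
      from the loop cache at all.
    """
    if loop_len <= cache_size:
        captured = loop_len            # whole loop fits
    elif partial:
        captured = cache_size          # only the first part is captured
    else:
        captured = 0                   # loop too large, nothing is served

    main = 0
    cached = 0
    for it in range(iterations):
        if it < 2:
            main += loop_len           # detection pass and fill pass
        else:
            cached += captured         # served by the loop cache
            main += loop_len - captured
    return main, cached


if __name__ == "__main__":
    # A 10-instruction loop, 100 iterations, 4-entry loop cache.
    print(count_fetches(10, 100, 4, partial=True))
    print(count_fetches(10, 100, 4, partial=False))
```

In this model, fewer main-memory fetches stand in for lower instruction fetch energy; actual savings depend on the relative access energies, which the model does not attempt to capture.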
Citations
Exploiting Fixed Programs in Embedded Systems: A Loop Cache Example
TL;DR: This work designs a loop cache specifically with tuning in mind, showing a 70% reduction in instruction memory access, for MIPS and 8051 processors – representing twice the reduction from a regular loop cache, translating to good power savings.
Energy benefits of a configurable line size cache for embedded systems
Chuanjun Zhang,Frank Vahid,Walid Najjar +2 more
- 20 Feb 2003
TL;DR: This work analyzes the energy impact of different line sizes, for 19 embedded system benchmarks, and shows that tuning the line size to a particular program can reduce memory access energy by 50% in some examples.
Tiny instruction caches for low power embedded systems
TL;DR: It is shown that, on average, filter caching achieves the best instruction fetch energy reductions of 60-80%, but at the cost of about 20% performance degradation, which could also affect overall energy savings.
A distributed control path architecture for VLIW processors
Hongtao Zhong,Kevin Fan,Scott Mahlke,Michael S. Schlansker +3 more
- 17 Sep 2005
TL;DR: Simulation results show that DVLIW processors reduce the number of cross-chip control signals by approximately two orders of magnitude while incurring a small performance overhead to explicitly manage the instruction streams.
Comprehensive frequency-dependent substrate noise analysis using boundary element methods
Hongmei Li,Jorge Carballido,Harry H. Yu,Vladimir Okhmatovski,Elyse Rosenbaum,Andreas C. Cangellaris +5 more
- 10 Nov 2002
TL;DR: In this article, a new and efficient method is introduced for the calculation of the Green's function that can accommodate arbitrary substrate doping profiles and thus facilitate substrate noise analysis using boundary element methods.
References
Cache design trade-offs for power and performance optimization: a case study
Ching-Long Su,Alvin M. Despain +1 more
- 23 Apr 1995
TL;DR: This paper examines performance and power trade-offs in cache designs and the effectiveness of energy reduction for several novel cache design techniques targeted for low power.
Reducing power in superscalar processor caches using subbanking, multiple line buffers and bit-line segmentation
Kanad Ghose,Milind B. Kamble +1 more
- 17 Aug 1999
TL;DR: It is shown that a combination of subbanking, multiple line buffers and bit-line segmentation can reduce the on-chip cache power dissipation by as much as 75% in a technology-independent manner.
Patent
Data processing system having a cache and method therefor
William C. Moyer,John Arends,Lea Hwang Lee +2 more
- 14 Nov 1996
TL;DR: In this paper, a look-ahead feature for the valid bit array is provided, such that during a read of the cache, the valid bit for the next instruction is checked with the same index used to read the current instruction, so that the program can remain active as long as the program is in a loop which can be contained entirely within the cache.
Patent
Distributed tag cache memory system and method for storing data in the same
William C. Moyer,Lea Hwang Lee,John H. Arends +2 more
- 14 Nov 1996
TL;DR: In this paper, a distributed TAG associated with the instruction address computed by the CPU is used to determine whether instructions stored in the loop cache can be supplied to the CPU, by comparing the GTAG portion of the instruction address to the stored GTAG value.
Instruction fetch energy reduction using loop caches for embedded applications with small tight loops
Lea Hwang Lee,Bill Moyer,John H. Arends +2 more
- 17 Aug 1999
TL;DR: This paper proposes using a small instruction buffer, also called a loop cache, to save power in caches, which has no address tag store and knows precisely whether the next instruction request will hit in the loop cache well ahead of time.