Proceedings Article, DOI: 10.1145/1450095.1450127
Multi-granularity sampling for simulating concurrent heterogeneous applications
Melhem Tawk,Khaled Z. Ibrahim,Smail Niar +2 more
- 19 Oct 2008
- pp 217-226
TL;DR: An approach that uses the sampling technique to speed up the design flow of Multiprocessor System-on-Chip (MPSoC) systems and shows that the simulation of concurrent heterogeneous applications can be accelerated by a factor of up to 60x, while maintaining an average performance estimation error lower than 5%.
Abstract: Detailed or cycle-accurate/bit-accurate (CABA) simulation is a critical phase in the design flow of embedded systems. However, with increasing system complexity, full detailed simulation is prohibitively slower than the hardware being simulated. In this paper, we present an approach that uses the sampling technique to speed up the design flow of Multiprocessor System-on-Chip (MPSoC) systems. Based on the dynamic behavior of the applications running concurrently, our method dynamically chooses between multiple granularities of the sampling phase. The similarities of the execution phases for all possible granularities are first analyzed, then transitions between phase overlaps are discretized. To facilitate the detection of repetitions, one phase, with an appropriate granularity, is chosen per process. Unlike most other proposals, the associated performance is usually accurate enough not to need repeated resampling. The use of checkpointing in conjunction with our approach is simplified because the amount of the needed disk space is significantly reduced. Experimental results show that the simulation of concurrent heterogeneous applications can be accelerated by a factor of up to 60x, while maintaining an average performance estimation error lower than 5%.
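The abstract gives no algorithmic detail, so the following is only a rough sketch of the general idea of choosing a sampling granularity by how often execution phases repeat. It is not the authors' method. It assumes each process is represented by a trace of basic-block IDs, compares intervals by a normalized frequency signature, and uses a hypothetical threshold of 0.1. All function names and parameters are illustrative.

```python
from collections import Counter

def signature(interval):
    """Normalized frequency vector of the basic-block IDs in one interval."""
    counts = Counter(interval)
    total = len(interval)
    return {bb: n / total for bb, n in counts.items()}

def distance(a, b):
    """Manhattan distance between two sparse vectors (0 = identical, 2 = disjoint)."""
    keys = set(a) | set(b)
    return sum(abs(a.get(k, 0.0) - b.get(k, 0.0)) for k in keys)

def split(trace, granularity):
    """Cut a trace into consecutive full intervals of `granularity` entries."""
    last_start = len(trace) - granularity
    return [trace[i:i + granularity] for i in range(0, last_start + 1, granularity)]

def repeated_fraction(trace, granularity, threshold=0.1):
    """Fraction of intervals whose signature matches an earlier interval."""
    seen = []      # signatures of phases that would need detailed simulation
    repeats = 0    # intervals that could reuse an earlier measurement
    intervals = split(trace, granularity)
    for interval in intervals:
        sig = signature(interval)
        if any(distance(sig, s) <= threshold for s in seen):
            repeats += 1
        else:
            seen.append(sig)
    return repeats / len(intervals) if intervals else 0.0

def choose_granularity(trace, candidates, threshold=0.1):
    """Pick the candidate with the highest repeated fraction (ties: larger one)."""
    return max(candidates, key=lambda g: (repeated_fraction(trace, g, threshold), g))
```

In this toy setting, a trace that loops over four basic blocks repeats most cleanly at a granularity of 4, so `choose_granularity([1, 2, 3, 4] * 6, [3, 4, 8])` selects 4. A real simulator would apply such a choice per process and fast-forward the intervals counted as repeats.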
Citations
A fast and effective dynamic trace-based method for analyzing architectural performance
Yi-Siou Chen,Lih-Yih Chiou,Hsun-Hsiang Chang +2 more
- 25 Jan 2011
TL;DR: This work presents an architectural performance analysis using a dynamic trace-based method (APDT) to reduce the effort required for system remodeling and the time required to estimate performance during architecture exploration, thereby improving the effectiveness of that exploration.
Parallel application sampling for accelerating MPSoC simulation
TL;DR: A new application sampling technique is proposed to accelerate MPSoC simulation for design space exploration (DSE) by dynamically combining simultaneously executed phases, thus generating a sampling unit.
Parallel Application Sampling in MPSoC Simulation
Melhem Tawk,Khaled Z. Ibrahim,Smail Niar +2 more
- 09 Aug 2010
TL;DR: A new application sampling technique is proposed to accelerate MPSoC simulation for design space exploration (DSE) by dynamically combining simultaneously executed phases, thus generating a sampling unit.
Concurrent phase classification for accelerating MPSoC simulation
Melhem Tawk,Khaled Z. Ibrahim,Smail Niar +2 more
- 21 Jun 2012
TL;DR: Experimental results demonstrate that the technique can reduce both simulation time and the memory required to store checkpoints, accelerating MPSoC simulation and allowing exploration of a large space of alternative designs during DSE.