Proceedings Article, DOI: 10.1109/HPCSIM.2011.5999912
Improving basic thread operations with batches of threads
Ioannis E. Venetis
- 04 Jul 2011
- pp 808-813
TL;DR: The use of Batches of Threads is extended to improve two significant aspects of threading run-time systems: scheduling large numbers of threads to processors and recycling the data structures of threads that have finished execution.
Abstract: Multi-core architectures provide the means to efficiently handle larger numbers of more fine-grained parallel tasks. However, software still does not take advantage of these new possibilities, retaining the high cost associated with managing large numbers of threads. Batches of Threads have been introduced to reduce this cost and allow applications to express their inherent parallelism in a more fine-grained manner. In this paper, their use is extended to improve two significant aspects of threading run-time systems: first, scheduling large numbers of threads to processors; second, recycling the data structures of threads that have finished execution. Both improvements can be implemented internally in threading run-time systems and are thus transparent to the programmer. The experimental evaluation demonstrates that basic thread operations improve significantly.
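The abstract gives no implementation details, so the following is only a toy sketch of the general idea, not the paper's design. It assumes a batch is simply a list of thread descriptors, that a whole batch is enqueued with one ready-queue operation, and that a finished batch is returned to a free list in one step so its descriptors can be reused. All names (`Descriptor`, `BatchRuntime`, `create_batch`, `run_all`) are hypothetical, and the "threads" are run sequentially on a single simulated processor.

```python
from collections import deque


class Descriptor:
    """Hypothetical per-thread record (stands in for stack, context, etc.)."""

    def __init__(self):
        self.func = None
        self.arg = None


class BatchRuntime:
    """Toy runtime that schedules and recycles descriptors a batch at a time."""

    def __init__(self):
        self.free_batches = []  # recycled batches, each a list of descriptors
        self.ready = deque()    # ready queue holds whole batches, not threads
        self.allocated = 0      # number of fresh Descriptor allocations so far

    def create_batch(self, func, args):
        args = list(args)
        # Reuse a recycled batch if one exists; otherwise start from scratch.
        descs = self.free_batches.pop() if self.free_batches else []
        # Allocate new descriptors only if the reused batch is too small.
        while len(descs) < len(args):
            descs.append(Descriptor())
            self.allocated += 1
        for desc, arg in zip(descs, args):
            desc.func, desc.arg = func, arg
        # One queue operation schedules every thread in the batch.
        self.ready.append((descs, len(args)))

    def run_all(self):
        results = []
        while self.ready:
            descs, count = self.ready.popleft()
            for desc in descs[:count]:
                results.append(desc.func(desc.arg))
            # One operation recycles the descriptors of the whole batch.
            self.free_batches.append(descs)
        return results
```

In this sketch the cost of a queue or free-list operation is paid once per batch rather than once per thread, which is the kind of saving the abstract attributes to batching. A real run-time system would also have to handle synchronization between processors, which is omitted here.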