Implicit parallelism

Papers published on a yearly basis

Papers

Book Chapter•10.1016/B978-0-08-094832-4.50018-0•

Real-Coded Genetic Algorithms and Interval-Schemata

Larry Eshelman¹, J. David Schaffer¹•Institutions (1)

Philips¹

1 Jan 1993

TL;DR: Interval-schemata are shown to be analogous to Holland's symbol-schemata, to provide a key to understanding the implicit parallelism of real-valued GAs, and to support the intuition that real-coded GAs should have an advantage over binary-coded GAs in exploiting local continuities in function optimization.

Abstract: In this paper we introduce interval-schemata as a tool for analyzing real-coded genetic algorithms (GAs). We show how interval-schemata are analogous to Holland's symbol-schemata and provide a key to understanding the implicit parallelism of real-valued GAs. We also show how they support the intuition that real-coded GAs should have an advantage over binary coded GAs in exploiting local continuities in function optimization. On the basis of our analysis we predict some failure modes for real-coded GAs using several different crossover operators and present some experimental results that support these predictions. We also introduce a crossover operator for real-coded GAs that is able to avoid some of these failure modes.
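
The abstract does not name the crossover operator it introduces. The sketch below assumes it is blend crossover (BLX-α), the operator usually associated with this paper. Each child gene is drawn uniformly from the interval spanned by the two parent genes, widened on both sides by a fraction `alpha` of its width. The function name, the default `alpha=0.5`, and the `rng` parameter are illustrative choices, not taken from the source.

```python
import random

def blx_alpha(parent_a, parent_b, alpha=0.5, rng=random):
    """Blend crossover sketch: sample each child gene from the parents'
    interval, extended by alpha * (interval width) on each side."""
    child = []
    for x, y in zip(parent_a, parent_b):
        lo, hi = min(x, y), max(x, y)
        span = hi - lo
        # With alpha = 0 the child stays inside [lo, hi]; larger alpha
        # lets the search step outside the interval the parents define.
        child.append(rng.uniform(lo - alpha * span, hi + alpha * span))
    return child

# Usage: two real-coded parents with two genes each.
child = blx_alpha([0.0, 1.0], [1.0, 3.0], alpha=0.5, rng=random.Random(0))
```

Sampling from an interval rather than swapping symbols is what ties the operator to the interval-schemata view: a child is always a member of an interval that contains both parents.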

1,760 citations

Proceedings Article•10.1145/106972.106991•

Limits of instruction-level parallelism

David W. Wall

1 Apr 1991

TL;DR: The results of simulations of 18 different test programs under 375 different models of available parallelism analysis are presented, showing how simulations based on instruction traces can model techniques at the limits of feasibility and even beyond.

Abstract: Growing interest in ambitious multiple-issue machines and heavily pipelined machines requires a careful examination of how much instruction-level parallelism exists in typical programs. Such an examination is complicated by the wide variety of hardware and software techniques for increasing the parallelism that can be exploited, including branch prediction, register renaming, and alias analysis. By performing simulations based on instruction traces, we can model techniques at the limits of feasibility and even beyond. This paper presents the results of simulations of 18 different test programs under 375 different models of available parallelism analysis. This paper replaces Technical Note TN-15, an earlier version of the same material.
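
As a rough illustration of trace-based measurement, the sketch below schedules each instruction of a trace in the earliest cycle allowed by its true data dependences and reports instructions per cycle. It is not the paper's simulator: the trace format (a list of `(dest, sources)` register pairs) is hypothetical, and the model assumes unit latency, perfect branch prediction, perfect register renaming, and no memory dependences, which corresponds to an idealized upper bound.

```python
def available_parallelism(trace):
    """Return instructions / critical-path length for a register trace.

    trace: list of (dest, sources) pairs, e.g. ("r3", ["r1", "r2"]).
    Only true (read-after-write) dependences constrain the schedule.
    """
    ready = {}  # register -> cycle in which its latest value was produced
    depth = 0   # length of the longest dependence chain seen so far
    for dest, sources in trace:
        # An instruction issues one cycle after its last operand is produced.
        cycle = 1 + max((ready.get(s, 0) for s in sources), default=0)
        ready[dest] = cycle
        depth = max(depth, cycle)
    return len(trace) / depth if depth else 0.0

# Usage: five instructions whose longest dependence chain has length three.
trace = [("r1", []), ("r2", []), ("r3", ["r1", "r2"]),
         ("r4", []), ("r5", ["r3", "r4"])]
ilp = available_parallelism(trace)
```

Weakening any of the idealized assumptions (for example, adding a finite window or imperfect branch prediction) adds constraints to the schedule and can only lower the figure.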

740 citations

Proceedings Article•10.1145/121132.121151•

Scheduler activations: effective kernel support for the user-level management of parallelism

Thomas Anderson¹, Brian N. Bershad¹, Edward D. Lazowska¹, Henry M. Levy¹•Institutions (1)

University of Washington¹

1 Sep 1991

TL;DR: It is argued that the performance of kernel threads is inherently worse than that of user-level threads, rather than this being an artifact of existing implementations, and that managing parallelism at the user level is essential to high-performance parallel computing.

Abstract: Threads are the vehicle for concurrency in many approaches to parallel programming. Threads separate the notion of a sequential execution stream from the other aspects of traditional UNIX-like processes, such as address spaces and I/O descriptors. The objective of this separation is to make the expression and control of parallelism sufficiently cheap that the programmer or compiler can exploit even fine-grained parallelism with acceptable overhead. Threads can be supported either by the operating system kernel or by user-level library code in the application address space, but neither approach has been fully satisfactory. This paper addresses this dilemma. First, we argue that the performance of kernel threads is inherently worse than that of user-level threads, rather than this being an artifact of existing implementations; we thus argue that managing parallelism at the user level is essential to high-performance parallel computing. Next, we argue that the lack of system integration exhibited by user-level threads is a consequence of the lack of kernel support for user-level threads provided by contemporary multiprocessor operating systems; we thus argue that kernel threads or processes, as currently conceived, are the wrong abstraction on which to support user-level management of parallelism. Finally, we describe the design, implementation, and performance of a new kernel interface and user-level thread package that together provide the same functionality as kernel threads without compromising the performance and flexibility advantages of user-level management of parallelism.

629 citations

Journal Article•10.1109/12.48862•

Executing a program on the MIT tagged-token dataflow architecture

Arvind¹, Rishiyur S. Nikhil¹•Institutions (1)

Massachusetts Institute of Technology¹

1 Mar 1990, IEEE Transactions on Computers

TL;DR: An overview of current thinking on dataflow architecture is provided by describing example Id programs, their compilation to dataflow graphs, and their execution on the TTDA, a multiprocessor architecture.

Abstract: The MIT Tagged-Token Dataflow Project has an unconventional, but integrated approach to general-purpose high-performance parallel computing. Rather than extending conventional sequential languages, Id, a high-level language with fine-grained parallelism and determinacy implicit in its operational semantics, is used. Id programs are compiled to dynamic dataflow graphs, which constitute a parallel machine language. Dataflow graphs are directly executed on the MIT tagged-token dataflow architecture (TTDA), a multiprocessor architecture. An overview of current thinking on dataflow architecture is provided by describing example Id programs, their compilation to dataflow graphs, and their execution on the TTDA. Related work and the status of the project are described.

508 citations

Proceedings Article•10.1145/349299.349320•

Exploiting superword level parallelism with multimedia instruction sets

Samuel Larsen¹, Saman Amarasinghe¹•Institutions (1)

Massachusetts Institute of Technology¹

1 May 2000

TL;DR: This paper presents a simple and robust compiler for detecting SLP that targets basic blocks rather than loop nests and can exploit parallelism both across loop iterations and within basic blocks.

Abstract: Increasing focus on multimedia applications has prompted the addition of multimedia extensions to most existing general purpose microprocessors. This added functionality comes primarily with the addition of short SIMD instructions. Unfortunately, access to these instructions is limited to in-line assembly and library calls. Generally, it has been assumed that vector compilers provide the most promising means of exploiting multimedia instructions. Although vectorization technology is well understood, it is inherently complex and fragile. In addition, it is incapable of locating SIMD-style parallelism within a basic block. In this paper we introduce the concept of Superword Level Parallelism (SLP), a novel way of viewing parallelism in multimedia and scientific applications. We believe SLP is fundamentally different from the loop level parallelism exploited by traditional vector processing, and therefore demands a new method of extracting it. We have developed a simple and robust compiler for detecting SLP that targets basic blocks rather than loop nests. As with techniques designed to extract ILP, ours is able to exploit parallelism both across loop iterations and within basic blocks. The result is an algorithm that provides excellent performance in several application domains. In our experiments, dynamic instruction counts were reduced by 46%. Speedups ranged from 1.24 to 6.70.
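
To give a feel for what packing statements within a basic block means, here is a toy sketch. It is not the paper's algorithm. Statements are restricted to the hypothetical form `dest[i] = lhs[i] op rhs[i]`, and consecutive statements that use the same operator and arrays on adjacent indices are greedily grouped into packs of at most `width` lanes. The `Stmt` type, the `pack_isomorphic` name, and the default `width=4` are all assumptions made for illustration.

```python
from collections import namedtuple

# One scalar statement of the restricted form: dest[index] = lhs[index] op rhs[index]
Stmt = namedtuple("Stmt", "dest index op lhs rhs")

def pack_isomorphic(block, width=4):
    """Greedily group consecutive isomorphic statements on adjacent indices.

    In this restricted form, statements with different indices touch
    different elements, so the members of a pack are independent.
    """
    packs = []
    for stmt in block:
        last = packs[-1] if packs else None
        if (last and len(last) < width
                and (stmt.dest, stmt.op, stmt.lhs, stmt.rhs)
                    == (last[-1].dest, last[-1].op, last[-1].lhs, last[-1].rhs)
                and stmt.index == last[-1].index + 1):
            last.append(stmt)      # extend the current pack by one lane
        else:
            packs.append([stmt])   # start a new pack
    return packs

# Usage: four adjacent additions followed by an unrelated multiplication.
block = [Stmt("a", i, "+", "b", "c") for i in range(4)] + [Stmt("d", 0, "*", "b", "c")]
packs = pack_isomorphic(block)
```

Each multi-statement pack stands for one SIMD instruction replacing several scalar ones, which is the kind of saving behind the reduction in dynamic instruction count that the abstract reports.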

458 citations

Year	Papers
2025	1
2023	2
2022	7
2021	5
2020	5
2019	6

Related Topics (5)

Performance Metrics