TL;DR: For concurrent programming to become mainstream, threads must be discarded as a programming model, and nondeterminism should be judiciously and carefully introduced where needed, and it should be explicit in programs.
Abstract: For concurrent programming to become mainstream, we must discard threads as a programming model. Nondeterminism should be judiciously and carefully introduced where needed, and it should be explicit in programs. In general-purpose software engineering practice, we have reached a point where one approach to concurrent programming dominates all others namely, threads, sequential processes that share memory. They represent a key concurrency model supported by modern computers, programming languages, and operating systems. In scientific computing, where performance requirements have long demanded concurrent programming, data-parallel language extensions and message-passing libraries such as PVM, MPI, and OpenMP dominate over threads for concurrent programming. Computer architectures intended for scientific computing often differ significantly from so-called general-purpose architectures.
TL;DR: This book offers an in-depth description of the IEEE operating system interface standard, POSIXAE (Portable Operating System Interface) threads, commonly called Pthreads, and explains basic concepts such as asynchronous programming, the lifecycle of a thread, and synchronization.
Abstract: With this practical book, you will attain a solid understanding of threads and will discover how to put this powerful mode of programming to work in real-world applications.The primary advantage of threaded programming is that it enables your applications to accomplish more than one task at the same time by using the number-crunching power of multiprocessor parallelism and by automatically exploiting I/O concurrency in your code, even on a single processor machine. The result: applications that are faster, more responsive to users, and often easier to maintain. Threaded programming is particularly well suited to network programming where it helps alleviate the bottleneck of slow network I/O.This book offers an in-depth description of the IEEE operating system interface standard, POSIXAE (Portable Operating System Interface) threads, commonly called Pthreads. Written for experienced C programmers, but assuming no previous knowledge of threads, the book explains basic concepts such as asynchronous programming, the lifecycle of a thread, and synchronization. You then move to more advanced topics such as attributes objects, thread-specific data, and realtime scheduling. An entire chapter is devoted to "real code," with a look at barriers, read/write locks, the work queue manager, and how to utilize existing libraries. In addition, the book tackles one of the thorniest problems faced by thread programmers-debugging-with valuable suggestions on how to avoid code errors and performance problems from the outset.Numerous annotated examples are used to illustrate real-world concepts. A Pthreads mini-reference and a look at future standardization are also included.
TL;DR: It is argued that the performance of kernel threads is inherently worse than that of user-level threads, rather than this being an artifact of existing implementations, and that managing parallelism at the user level is essential to high-performance parallel computing.
Abstract: Threads are the vehicle for concurrency in many approaches to parallel programming. Threads separate the notion of a sequential execution stream from the other aspects of traditional UNIX-like processes, such as address spaces and I/O descriptors. The objective of this separation is to make the expression and control of parallelism sufficiently cheap that the programmer or compiler can exploit even fine-grained parallelism with acceptable overhead.Threads can be supported either by the operating system kernel or by user-level library code in the application address space, but neither approach has been fully satisfactory. This paper addresses this dilemma. First, we argue that the performance of kernel threads is inherently worse than that of user-level threads, rather than this being an artifact of existing implementations; we thus argue that managing parallelism at the user level is essential to high-performance parallel computing. Next, we argue that the lack of system integration exhibited by user-level threads is a consequence of the lack of kernel support for user-level threads provided by contemporary multiprocessor operating systems; we thus argue that kernel threads or processes, as currently conceived, are the wrong abstraction on which to support user-level management of parallelism. Finally, we describe the design, implementation, and performance of a new kernel interface and user-level thread package that together provide the same functionality as kernel threads without compromising the performance and flexibility advantages of user-level management of parallelism.
TL;DR: In this paper, the authors argue that the performance of kernel threads is inherently worse than that of user-level threads, rather than this being an artifact of existing implementations; managing parallelism at the user level is essential to high-performance parallel computing.
Abstract: Threads are the vehicle for concurrency in many approaches to parallel programming. Threads can be supported either by the operating system kernel or by user-level library code in the application address space, but neither approach has been fully satisfactory.This paper addresses this dilemma. First, we argue that the performance of kernel threads is inherently worse than that of user-level threads, rather than this being an artifact of existing implementations; managing parallelism at the user level is essential to high-performance parallel computing. Next, we argue that the problems encountered in integrating user-level threads with other system services is a consequence of the lack of kernel support for user-level threads provided by contemporary multiprocessor operating systems; kernel threads are the wrong abstraction on which to support user-level management of parallelism. Finally, we describe the design, implementation, and performance of a new kernel interface and user-level thread package that together provide the same functionality as kernel threads without compromising the performance and flexibility advantages of user-level management of parallelism.
TL;DR: A model to evaluate the performance of different combinations of synchronization mechanisms and scheduling policies leads to the conclusion that gang scheduling is required for efficient fine-grain synchronization on multiprogrammed multiprocessors.