High-throughput computing

Papers

Proceedings Article•10.1109/GRID.2004.14•

BOINC: A System for Public-Resource Computing and Storage

David P. Anderson¹•Institutions (1)

University of California, Berkeley¹

8 Nov 2004

TL;DR: This paper describes the goals of BOINC, the design issues its developers confronted, and their solutions to these problems.

Abstract: BOINC (Berkeley Open Infrastructure for Network Computing) is a software system that makes it easy for scientists to create and operate public-resource computing projects. It supports diverse applications, including those with large storage or communication requirements. PC owners can participate in multiple BOINC projects, and can specify how their resources are allocated among these projects. We describe the goals of BOINC, the design issues that we confronted, and our solutions to these problems.

2,221 citations

Journal Article•10.1002/CPE.938•

Distributed computing in practice: the Condor experience

Douglas Thain¹, Todd Tannenbaum¹, Miron Livny¹•Institutions (1)

University of Wisconsin-Madison¹

01 Feb 2005-Concurrency and Computation: Practice and Experience

TL;DR: This paper provides the history and philosophy of the Condor project and describes how it has interacted with other projects and evolved along with the field of distributed computing.

Abstract: Since 1984, the Condor project has enabled ordinary users to do extraordinary computing. Today, the project continues to explore the social and technical problems of cooperative computing on scales ranging from the desktop to the world-wide computational Grid. In this paper, we provide the history and philosophy of the Condor project and describe how it has interacted with other projects and evolved along with the field of distributed computing. We outline the core components of the Condor system and describe how the technology of computing must correspond to social structures. Throughout, we reflect on the lessons of experience and chart the course travelled by research ideas as they grow into production systems. Copyright © 2005 John Wiley & Sons, Ltd.

2,170 citations

Journal Article•10.1002/CPE.3505•

FireWorks: a dynamic workflow system designed for high-throughput applications

Anubhav Jain¹, Shyue Ping Ong², Wei Chen¹, Bharat Medasani¹, Xiaohui Qu¹, Michael Kocher¹, Miriam Brafman¹, Guido Petretto³, Gian-Marco Rignanese³, Geoffroy Hautier³, Dan Gunter¹, Kristin A. Persson¹•Institutions (3)

Lawrence Berkeley National Laboratory¹, University of California, San Diego², Université catholique de Louvain³

10 Dec 2015-Concurrency and Computation: Practice and Experience

TL;DR: FireWorks has been used to complete over 50 million CPU‐hours worth of computational chemistry and materials science calculations at the National Energy Research Supercomputing Center, and its implementation strategy that rests on Python and NoSQL databases (MongoDB) is discussed.

Abstract: This paper introduces FireWorks, a workflow software for running high-throughput calculation workflows at supercomputing centers. FireWorks has been used to complete over 50 million CPU-hours worth of computational chemistry and materials science calculations at the National Energy Research Supercomputing Center. It has been designed to serve the demanding high-throughput computing needs of these applications, with extensive support for (i) concurrent execution through job packing, (ii) failure detection and correction, (iii) provenance and reporting for long-running projects, (iv) automated duplicate detection, and (v) dynamic workflows (i.e., modifying the workflow graph during runtime). We have found that these features are highly relevant to enabling modern data-driven and high-throughput science applications, and we discuss our implementation strategy that rests on Python and NoSQL databases (MongoDB). Finally, we present performance data and limitations of our approach along with planned future work. Copyright © 2015 John Wiley & Sons, Ltd.

558 citations

Journal Article•10.3389/FMOLB.2021.729513•

Structural Biology in the Clouds: The WeNMR-EOSC Ecosystem

Rodrigo V. Honorato¹, Panagiotis I. Koukos¹, Brian Jiménez-García¹, Andrei Tsaregorodtsev², M. Verlato, Andrea Giachetti³, Antonio Rosato³, Alexandre M. J. J. Bonvin¹•Institutions (3)

Utrecht University¹, Aix-Marseille University², University of Florence³

28 Jul 2021-Frontiers in Molecular Biosciences

TL;DR: Since 2010, the WeNMR project has implemented numerous web-based services to facilitate the use of advanced computational tools by structural biology researchers; these services now operate as Thematic Services in the European Open Science Cloud (EOSC) portal.

Abstract: Structural biology aims at characterizing the structural and dynamic properties of biological macromolecules in atomic detail. Gaining insight into three-dimensional structures of biomolecules and their interactions is critical for understanding the vast majority of cellular processes, with direct applications in health and food sciences. Since 2010, the WeNMR project (www.wenmr.eu) has implemented numerous web-based services to facilitate the use of advanced computational tools by researchers in the field, using the high throughput computing infrastructure provided by EGI. These services have been further developed in subsequent initiatives under H2020 projects and are now operating as Thematic Services in the European Open Science Cloud (EOSC) portal (www.eosc-portal.eu), sending >12 million jobs and using around 4000 CPU-years per year. Here we review 10 years of successful e-infrastructure solutions serving a large worldwide community of over 23,000 users to date, providing them with user-friendly, web-based solutions that run complex workflows in structural biology. The current set of active WeNMR portals is described, together with the complex backend machinery that allows distributed computing resources to be harvested efficiently.

493 citations

Proceedings Article•10.1109/MTAGS.2008.4777912•

Many-task computing for grids and supercomputers

Ioan Raicu¹, Ian Foster¹, Yong Zhao²•Institutions (2)

University of Chicago¹, Microsoft²

1 Nov 2008

TL;DR: Many-task computing aims to bridge the gap between two computing paradigms, high throughput computing and high performance computing, drawing attention to the many computations that are heterogeneous but not “happily” parallel.

Abstract: Many-task computing aims to bridge the gap between two computing paradigms, high throughput computing and high performance computing. Many-task computing differs from high throughput computing in its emphasis on using a large number of computing resources over short periods of time to accomplish many computational tasks (i.e. including both dependent and independent tasks), where primary metrics are measured in seconds (e.g. FLOPS, tasks/sec, MB/s I/O rates), as opposed to operations (e.g. jobs) per month. Many-task computing denotes high-performance computations comprising multiple distinct activities, coupled via file system operations. Tasks may be small or large, uniprocessor or multiprocessor, compute-intensive or data-intensive. The set of tasks may be static or dynamic, homogeneous or heterogeneous, loosely coupled or tightly coupled. The aggregate number of tasks, quantity of computing, and volumes of data may be extremely large. Many-task computing includes loosely coupled applications that are generally communication-intensive but not naturally expressed using the standard message passing interface commonly found in high performance computing, drawing attention to the many computations that are heterogeneous but not “happily” parallel.

320 citations

Performance Metrics

No. of papers in the topic in previous years
Year	Papers
2021	15
2020	6
2019	14
2018	17
2017	21
2016	15