Proceedings Article, DOI: 10.1109/ESCIENCE.2014.44
Community Resources for Enabling Research in Distributed Scientific Workflows
Rafael Ferreira da Silva, Weiwei Chen, Gideon Juve, Karan Vahi, Ewa Deelman +4 more
- 20 Oct 2014
- Vol. 1, pp 177-184
90
TL;DR: This paper describes a collection of tools and data that have enabled research in new techniques, algorithms, and systems for scientific workflows, and shows how these resources have been used to investigate efficient and robust workflow execution and to improve the Pegasus Workflow Management System and other workflow tools.
Abstract: A significant amount of recent research in scientific workflows aims to develop new techniques, algorithms and systems that can overcome the challenges of efficient and robust execution of ever larger workflows on increasingly complex distributed infrastructures. Since the infrastructures, systems and applications are complex, and their behavior is difficult to reproduce using physical experiments, much of this research is based on simulation. However, there exists a shortage of realistic datasets and tools that can be used for such studies. In this paper we describe a collection of tools and data that have enabled research in new techniques, algorithms, and systems for scientific workflows. These resources include: 1) execution traces of real workflow applications from which workflow and system characteristics such as resource usage and failure profiles can be extracted, 2) a synthetic workflow generator that can produce realistic synthetic workflows based on profiles extracted from execution traces, and 3) a simulator framework that can simulate the execution of synthetic workflows on realistic distributed infrastructures. This paper describes how we have used these resources to investigate new techniques for efficient and robust workflow execution, as well as to provide improvements to the Pegasus Workflow Management System or other workflow tools. Our goal in describing these resources is to share them with other researchers in the workflow research community. All of the tools and data are freely available online for the community at http://www.workflowarchive.org. These data have already been leveraged for a number of studies.
Citations
Cost- and deadline-constrained provisioning for scientific workflow ensembles in IaaS clouds
Maciej Malawski, Gideon Juve, Ewa Deelman, Jarek Nabrzyski +3 more
- 10 Nov 2012
TL;DR: It is found that the key factor determining the performance of an algorithm is its ability to decide which workflows in an ensemble to admit or reject for execution, and an admission procedure based on workflow structure and estimates of task runtimes can significantly improve the quality of solutions.
312
Algorithms for cost- and deadline-constrained provisioning for scientific workflow ensembles in IaaS clouds
TL;DR: It is found that the key factor determining the performance of an algorithm is its ability to decide which workflows in an ensemble to admit or reject for execution, and an admission procedure based on workflow structure and estimates of task runtimes can significantly improve the quality of solutions.
237
A scheduling scheme in the cloud computing environment using deep Q-learning
TL;DR: A novel artificial intelligence algorithm, called deep Q-learning task scheduling (DQTS), that combines the advantages of the Q-learning algorithm and a deep neural network is proposed, aimed at solving the problem of handling directed acyclic graph tasks in a cloud computing environment.
217
Scheduling dynamic workloads in multi-tenant scientific workflow as a service platforms
TL;DR: This is the first approach that explicitly addresses VM sharing in the context of WaaS by modeling the use of containers in the resource provisioning and scheduling heuristics. Results demonstrate its responsiveness to environmental uncertainties, its ability to meet deadlines, and its cost-efficiency when compared to a state-of-the-art algorithm.
104
Developing accurate and scalable simulators of production workflow management systems with WRENCH
Henri Casanova, Rafael Ferreira da Silva, Ryan Tanaka, Suraj Pandey, Gautam Jethwani, William Koch, Spencer Albrecht, James Oeth, Frédéric Suter +9 more
TL;DR: This paper presents WRENCH, a WMS simulation framework whose objectives are accurate and scalable simulations and easy simulation software development, and determines to what extent WRENCH achieves these objectives.
66
References
CloudSim: a toolkit for modeling and simulation of cloud computing environments and evaluation of resource provisioning algorithms
TL;DR: The result of this case study proves that the federated Cloud computing model significantly improves the application QoS requirements under fluctuating resource and service demand patterns.
5.3K
Performance-effective and low-complexity task scheduling for heterogeneous computing
TL;DR: Two novel scheduling algorithms for a bounded number of heterogeneous processors with an objective to simultaneously meet high performance and fast scheduling time are presented, called the Heterogeneous Earliest-Finish-Time (HEFT) algorithm and the Critical-Path-on-a-Processor (CPOP) algorithm.
Pegasus: A framework for mapping complex scientific workflows onto distributed systems
Ewa Deelman, Gurmeet Singh, Mei-Hui Su, Jim Blythe, Yolanda Gil, Carl Kesselman, Gaurang Mehta, Karan Vahi, G. Bruce Berriman, John C. Good, Anastasia C. Laity, Joseph C. Jacob, Daniel S. Katz +12 more
TL;DR: This paper presents results on improving application performance through workflow restructuring, which clusters multiple tasks in a workflow into single entities.
Workflows and e-Science: An overview of workflow system features and capabilities
TL;DR: The taxonomy gives end users a way to assess the suitability of workflow systems in general and to use the listed features to make an informed choice about which workflow system best fits their particular application.
999
Pegasus, a workflow management system for science automation
Ewa Deelman, Karan Vahi, Gideon Juve, Mats Rynge, S. Callaghan, Philip Maechling, Rajiv Mayani, Weiwei Chen, Rafael Ferreira da Silva, Miron Livny, Kent Wenger +10 more
TL;DR: An integrated view of the Pegasus system is provided, showing its capabilities that have been developed over time in response to application needs and to the evolution of the scientific computing platforms.
898