Open Access Proceedings Article
Proceedings of the 2nd workshop on Workflows in support of large-scale science
Ewa Deelman, Ian Taylor +1 more
- 25 Jun 2007
15
TL;DR: With the drastic increase in raw data volume in many domains, workflows play an important role in helping scientists organize and process their data and leverage high-performance or high-throughput computing resources.
Abstract: Welcome to the 2nd Workshop on Workflows in Support of Large-Scale Science (WORKS'07). Since the 1st WORKS workshop, interest in workflow technologies has continued to grow, and workflows are still considered a key technology for enabling large-scale science applications. Workflows enable scientists to design complex applications that are composed of individual application components or services. Often these components and services are designed, developed, and tested collaboratively. Because of the size of the data and the complexity of the analysis, large amounts of shared resources such as clusters and storage systems are used to store the data sets and execute the workflows. The process of workflow design and execution in a distributed environment can be very complex: it can involve mapping high-level workflow descriptions onto the available resources, as well as monitoring and debugging the subsequent execution. Because computations and data access operations are performed on shared resources, there is increased interest in the fair allocation and management of those resources at the workflow level.
Adequate workflow descriptions are needed to support the complex workflow management process, which includes workflow creation, reuse, and modifications made to the workflow over time, for example modifications to the individual components. Additional annotations may provide guidelines and requirements for resource mapping and execution.
Large-scale scientific applications impose requirements on workflow systems. Besides handling the magnitude of data processed by the workflow components, these systems need to annotate the resulting and intermediate data with provenance information and any other information needed to evaluate the quality of the data and support the repeatability of the analysis.
The Workshop on Workflows in Support of Large-Scale Science focuses on the entire workflow lifecycle, including workflow composition, mapping, and robust execution, as well as workflow applications.
During the 2nd WORKS meeting, papers spanning a range of workflow topics will be presented. These include real-time workflow systems, graphical workflow composition, distributed workflow caching in P2P environments, workflow automation, workflow-based applications, and semantic authoring tools.
Citations
Workflows and extensions to the Kepler scientific workflow system to support environmental sensor data access and analysis
Derik Barseghian, Ilkay Altintas, Matthew B. Jones, Daniel Crawl, Nathan Potter, James Gallagher, Peter Cornillon, Mark Schildhauer, Elizabeth T. Borer, Eric W. Seabloom, Parviez R. Hosseini +10 more
TL;DR: This work describes workflows and extensions to Kepler to stream and analyze data from observatory networks and archives, and focuses on the use of two newly integrated data sources in Kepler: DataTurbine and OPeNDAP.
113
Data curation + process curation = data integration + science
TL;DR: This article will brief the community on the current state of the art and the current challenges for process curation, both within and without the Life Sciences.
74
Performance evaluation of fault tolerance techniques in grid computing system
TL;DR: A performance evaluation of the most commonly used fault tolerance techniques (FTTs) in grid computing systems shows that workflow-level alternative task techniques perform better than task-level checkpointing techniques.
44
Solutions for data integration in functional genomics: a critical assessment and case study
Damian Smedley, Morris A. Swertz, Katy Wolstencroft, Glenn Proctor, Michael Zouberakis, Jonathan Bard, John M. Hancock, Paul N Schofield +7 more
TL;DR: An automated approach involving an in silico experimental workflow tool, Taverna, is developed using web services, BioMart and MOLGENIS technologies for data retrieval, and focuses on the current impediments to adopting such an approach in a wider context, and strategies to overcome them.
Real-time Grid monitoring based on complex event processing
TL;DR: It is concluded that real-time Grid monitoring is possible without excessive intrusiveness for resources and network and that CEP enables us to achieve two goals, otherwise difficult to combine: real-time access to monitoring data and advanced query capabilities.
35