Understanding collaborative studies through interoperable workflow provenance
Ilkay Altintas,Manish Kumar Anand,Daniel Crawl,Shawn Bowers,Adam Belloum,Paolo Missier,Bertram Ludäscher,Carole Goble,Peter M. A. Sloot +9 more
- 15 Jun 2010
- Vol. 6378, pp 42-58
TL;DR: This paper describes a new query model that captures implicit user collaborations and shows how this model maps to OPM and helps to answer collaborative queries, e.g., identifying combined workflows and contributions of users collaborating on a project based on the records of previous workflow executions.
Abstract: The provenance of a data product contains information about how the product was derived, and is crucial for enabling scientists to easily understand, reproduce, and verify scientific results. Currently, most provenance models are designed to capture the provenance of a single run, usually executed by a single user. However, a scientific discovery is often the result of methodical execution of many scientific workflows with many datasets produced at different times by one or more users. Further, to promote and facilitate exchange of information between multiple workflow systems supporting provenance, the Open Provenance Model (OPM) has been proposed by the scientific workflow community. In this paper, we describe a new query model that captures implicit user collaborations. We show how this model maps to OPM and helps to answer collaborative queries, e.g., identifying combined workflows and contributions of users collaborating on a project based on the records of previous workflow executions. We also adopt and extend the high-level Query Language for Provenance (QLP) with additional constructs, and show how these extensions allow non-expert users to express collaborative provenance queries against this model easily and concisely. Furthermore, we adopt the Provenance Challenge 3 (PC3) workflows as a collaborative and interoperable use case scenario, where different stages of the workflow are executed in three different workflow environments: Kepler, Taverna, and WSVLAM. Through this use case, we demonstrate how we can establish and understand collaborative studies through interoperable workflow provenance.
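The idea of answering collaborative queries from the records of previous executions can be illustrated with a minimal sketch. It is not the paper's query model and does not use QLP syntax; it only assumes that provenance from several runs is available as OPM-style edges (`used`, `wasGeneratedBy`, `wasControlledBy`). All run, file, and user names, and the helper functions `contributors` and `collaborations`, are hypothetical.

```python
from collections import defaultdict

# OPM-style edges, each written as (effect, relation, cause).
# All identifiers (runs, files, users) are hypothetical.
EDGES = [
    ("run1", "used", "raw.csv"),
    ("run1", "wasControlledBy", "alice"),
    ("clean.csv", "wasGeneratedBy", "run1"),
    ("run2", "used", "clean.csv"),
    ("run2", "wasControlledBy", "bob"),
    ("report.pdf", "wasGeneratedBy", "run2"),
    ("run3", "used", "other.csv"),
    ("run3", "wasControlledBy", "carol"),
    ("other.pdf", "wasGeneratedBy", "run3"),
]


def build_index(edges):
    """Index edges by effect node so lineage can be walked backward."""
    index = defaultdict(list)
    for effect, relation, cause in edges:
        index[effect].append((relation, cause))
    return index


def contributors(artifact, edges):
    """Return the users who controlled any run in the artifact's lineage."""
    index = build_index(edges)
    users, seen, stack = set(), set(), [artifact]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        for relation, cause in index[node]:
            if relation == "wasControlledBy":
                users.add(cause)      # agent: record it, do not traverse
            else:
                stack.append(cause)   # artifact or process: keep walking
    return users


def collaborations(edges):
    """Return (consumer, producer) user pairs linked by a shared artifact."""
    controller = {e: c for e, r, c in edges if r == "wasControlledBy"}
    producer = {e: c for e, r, c in edges if r == "wasGeneratedBy"}
    pairs = set()
    for run, relation, artifact in edges:
        if relation == "used" and artifact in producer:
            a, b = controller.get(run), controller.get(producer[artifact])
            if a and b and a != b:
                pairs.add((a, b))
    return pairs
```

Here the collaboration between the two users is implicit: neither run records it, and it only becomes visible when one user's run consumes an artifact that another user's run generated.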
Citations
Achieving reproducibility by combining provenance with service and workflow versioning
Simon Woodman,Hugo Hiden,Paul Watson,Paolo Missier +3 more
- 14 Nov 2011
TL;DR: This paper shows how the range of useful questions that provenance can answer is greatly increased when it is encapsulated into a system that can store and execute both current and old versions of workflows and services.
•Dissertation
Trusting Crowdsourced Information on Cultural Artefacts
Archana Nottamkandath
- 29 Mar 2016
Diagnosing Machine Learning Pipelines with Fine-grained Lineage
Zhao Zhang,Evan R. Sparks,Michael J. Franklin +2 more
- 26 Jun 2017
TL;DR: Hippo efficiently enables common ML diagnosis operations such as code debugging, result analysis, data anomaly removal, and computation replay, and observes an O(10^3)x total improvement in lineage storage efficiency vs. the baseline of cell-wise mapping recording.
Network analysis on provenance graphs from a crowdsourcing application
Mark Ebden,Trung Dong Huynh,Luc Moreau,Sarvapali D. Ramchurn,Stephen J. Roberts +4 more
- 19 Jun 2012
TL;DR: The results indicate that provenance graphs represent a suitable area of exploitation for existing network analysis tools concerned with modelling, prediction, and the inference of missing nodes and edges.
•Proceedings Article
Interpretation of Crowdsourced Activities Using Provenance Network Analysis
Trung Dong Huynh,Mark Ebden,Matteo Venanzi,Sarvapali D. Ramchurn,Stephen J. Roberts,Luc Moreau +5 more
- 03 Nov 2013
TL;DR: This paper presents an application-independent methodology for analysing provenance graphs, constructed from provenance records, to learn about such patterns and to use them for assessing some key properties of crowdsourced data, such as their quality, in an automated manner.
References
Scientific Workflow Management and the Kepler System
Bertram Ludäscher,Ilkay Altintas,Chad Berkley,Dan Higgins,Efrat Jaeger,Matthew B. Jones,Edward A. Lee,Jing Tao,Yang Zhao +9 more
TL;DR: Kepler is a scientific workflow system under development across a number of scientific data management projects; it is a community-driven, open-source project that welcomes related projects and new contributors.
A survey of data provenance in e-science
Yogesh Simmhan,Beth Plale,Dennis Gannon +2 more
- 01 Sep 2005
TL;DR: The main aspect of the taxonomy categorizes provenance systems based on why they record provenance, what they describe, how they represent and store provenance, and ways to disseminate it.
Scientific workflow management and the Kepler system: Research Articles
Bertram Ludäscher,Ilkay Altintas,Chad Berkley,Dan Higgins,Efrat Jaeger,Matthew B. Jones,Edward A. Lee,Jing Tao,Yang Zhao +8 more
TL;DR: Characteristics of and requirements for scientific workflows as identified in a number of application projects are described, and some key features of Kepler and its underlying Ptolemy II system, planned extensions, and areas of future research are described.
Workflows and e-Science: An overview of workflow system features and capabilities
TL;DR: The taxonomy provides end users with a mechanism by which they can assess the suitability of workflow in general and how they might use these features to make an informed choice about which workflow system would be a good choice for their particular application.
Workflows for e-Science
Ian Taylor,Ewa Deelman,Dennis Gannon,Matthew Shields +3 more
- 01 Jan 2007
TL;DR: In this article, the authors present an overview of the current state-of-the-art within established projects, presenting many different aspects of workflow from users to tool builders, from a number of different perspectives.