TL;DR: Progress in managing the growth of this public cyberinfrastructure resource is described and the domain science that it has enabled is reviewed.
Abstract: The CIPRES Science Gateway (CSG) provides researchers and educators with browser-based access to community codes for inference of phylogenetic relationships from DNA and protein sequence data. The CSG allows users to deploy jobs on the high-performance computers of the TeraGrid without requiring detailed knowledge of their complexities. Use of the CSG has grown rapidly; through March 2011 it had more than 2,200 users and enabled more than 180 peer-reviewed publications. The rapid growth in resource consumption was accommodated by deploying codes on Trestles, a new TeraGrid computer. Tools and policies were developed to ensure efficient and effective resource use. This paper describes progress in managing the growth of this public cyberinfrastructure resource and reviews the domain science that it has enabled.
TL;DR: The results indicate that the CSG is a critical and cost-effective enabler of science for phylogenetic researchers with limited resources, and is meeting an important need for computational resources in the Systematics/Evolutionary Biology community.
Abstract: The CIPRES Science Gateway (CSG) provides browser-based access to computationally demanding phylogenetic codes run on large HPC resources. Since its release in December 2009, there has been sustained, near-linear growth in the rate of CSG use, both in terms of number of users submitting jobs each month and number of jobs submitted. The average amount of computational time used per month by the CSG has increased more than 5-fold since its initial release. As of April 2012, more than 4,000 unique users have run parallel tree inference jobs on TeraGrid/XSEDE resources using the CSG. The steady growth in resource use suggests that the CSG is meeting an important need for computational resources in the Systematics/Evolutionary Biology community. To ensure that XSEDE resources accessed through the CSG are used effectively, policies for resource consumption were developed, and an advanced set of management tools was implemented. Studies of usage trends show that these new management tools helped in distributing XSEDE resources across a large user population that has low-to-moderate computational needs. In the first quarter of 2012, 30% of all active XSEDE users accessed computational resources through the CSG, while the analyses conducted by these users accounted for 0.7% of all allocable XSEDE computational resources. User survey results showed that the easy access to XSEDE/TeraGrid resources through the CSG had a critical and measurable scientific impact: at least 300 scholarly publications spanning all major groups within the Tree of Life have been enabled by the CSG since 2009. The same users reported that 82% of these publications would not have been possible without access to computational resources available through the CSG. The results indicate that the CSG is a critical and cost-effective enabler of science for phylogenetic researchers with limited resources.
TL;DR: This paper describes the design of a grid-enabled version of Montage, an astronomical image mosaic service suitable for large-scale processing of the sky, in which re-projection jobs are added to a pool of tasks and performed by as many processors as are available.
Abstract: This paper describes the design of a grid-enabled version of Montage, an astronomical image mosaic service, suitable for large-scale processing of the sky. All the re-projection jobs can be added to a pool of tasks and performed by as many processors as are available, exploiting the parallelization inherent in the Montage architecture. We show how we can describe the Montage application in terms of an abstract workflow so that a planning tool such as Pegasus can derive an executable workflow that can be run in the Grid environment. The execution of the workflow is performed by the workflow manager DAGMan and the associated Condor-G. The grid processing will support tiling of images to a manageable size when the input images can no longer be held in memory. Montage will ultimately run operationally on the TeraGrid. We describe science applications of Montage, including its application to science product generation by Spitzer Legacy Program teams and large-scale, all-sky image processing projects.
TL;DR: Catlett will provide an overview of the TeraGrid architecture, key technologies including the software, clusters and optical network, and plans for ensuring that TeraGrid reinforces and supports the global grid community.
Abstract: The TeraGrid is a $50 M collaborative project involving Argonne National Laboratory, the California Institute of Technology, the National Center for Supercomputing Applications, and the San Diego Supercomputer Center, funded by the U.S. National Science Foundation. Using Linux clusters at the four sites, interconnected with a 40 Gb/s wide area optical backplane, TeraGrid will provide a unique distributed resource with over 13 Teraflops of computing capability, nearly 1 Petabyte of online storage, and dedicated Teraflops clusters for visualization and data collection analysis. TeraGrid will be integrated using a suite of grid and middleware software technologies anchored by the Globus Toolkit, with a design objective toward an open, extensible architecture that can be expanded or duplicated, based on protocols and standards to promote interoperability. Catlett will provide an overview of the TeraGrid architecture, key technologies including the software, clusters and optical network, and plans for ensuring that TeraGrid reinforces and supports the global grid community.