Using the Cloud for parameter estimation problems: comparing Spark vs MPI with a case-study

doi:10.1109/CCGRID.2017.58

Proceedings Article•10.1109/CCGRID.2017.58

Patricia González, +5 more

- 14 May 2017

- pp 797-806

16

TL;DR: Two distributed computing models are explored and compared: the MPI (message-passing interface) model, which is high-performance oriented, and the Spark model, which is throughput oriented but outperforms other cloud programming solutions by adding improved support for iterative algorithms through in-memory computing.

Abstract: Systems biology is an emerging approach focused on generating new knowledge about complex biological systems by combining experimental data with mathematical modeling and advanced computational techniques. Many problems in this field are extremely challenging and require substantial supercomputing resources to be solved. This is the case for parameter estimation in large-scale nonlinear dynamic systems biology models. Recently, Cloud Computing has emerged as a new paradigm for on-demand delivery of computing resources. However, the scientific computing community has been quite hesitant to use the Cloud, because traditional programming models do not fit well with the new paradigm, and the earliest cloud programming models do not allow most scientific computations to be run efficiently in the Cloud. In this paper we explore and compare two distributed computing models: the MPI (message-passing interface) model, which is high-performance oriented, and the Spark model, which is throughput oriented but outperforms other cloud programming solutions by adding improved support for iterative algorithms through in-memory computing. The performance of a very well-known metaheuristic, the Differential Evolution algorithm, has been thoroughly assessed using a challenging parameter estimation problem from the domain of computational systems biology. The experiments have been carried out both in a local cluster and in the Microsoft Azure public cloud, allowing performance and cost evaluation for both infrastructures.
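For readers unfamiliar with the metaheuristic named in the abstract, the following is a minimal serial sketch of Differential Evolution in its common DE/rand/1/bin form. It is not the authors' MPI or Spark implementation. The function names, the control parameter values (`pop_size`, `f`, `cr`, `generations`) and the toy sphere objective are all assumptions chosen for illustration, and the objective stands in for the systems biology parameter estimation problem, which is not reproduced here.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, f=0.8, cr=0.9,
                           generations=200, seed=0):
    """Minimise `objective` over box `bounds` with DE/rand/1/bin (illustrative)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial population inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [objective(x) for x in pop]

    for _ in range(generations):
        next_pop, next_cost = [], []
        for i in range(pop_size):
            # Mutation: pick three distinct members, all different from i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # forces at least one mutated gene
            trial = []
            for j in range(dim):
                if rng.random() < cr or j == j_rand:
                    # Binomial crossover takes the mutant component.
                    v = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    lo, hi = bounds[j]
                    v = min(max(v, lo), hi)  # clip to the search box
                else:
                    v = pop[i][j]
                trial.append(v)
            # Greedy selection: keep the trial only if it is no worse.
            trial_cost = objective(trial)
            if trial_cost <= cost[i]:
                next_pop.append(trial)
                next_cost.append(trial_cost)
            else:
                next_pop.append(pop[i])
                next_cost.append(cost[i])
        pop, cost = next_pop, next_cost

    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]


def sphere(x):
    """Toy objective (assumption): sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


best_x, best_cost = differential_evolution(sphere, [(-5.0, 5.0)] * 3)
```

Within one generation, each trial vector is built and evaluated from the previous population only, so the objective evaluations are independent of one another. That independence is what makes the algorithm a natural candidate for distribution; how the paper actually partitions the work under MPI and Spark is not described in the abstract.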

Citations

Journal Article•10.13039/501100003329

Evaluation of indexes for the quantitative and objective estimation of grapevine bunch compactness

Javier Tello, +1 more

- 01 Jan 2014

- Vitis: Journal of Grapevine Research

TL;DR: In this paper, the authors used the visual OIV descriptor No 204 to classify 110 grape bunches of different morphology, from 11 different varieties, according to a panel of 14 judges.

70

Journal Article•10.1016/J.FUTURE.2018.04.030

BDEv 3.0: energy efficiency and microarchitectural characterization of Big Data processing frameworks

Jorge Veiga, +3 more

- 01 Sep 2018

- Future Generation Computer Systems

TL;DR: This work discusses the current state of the art in evaluating distributed processing frameworks, while extending the Big Data Evaluator tool (BDEv) to extract energy efficiency and microarchitecture-level metrics from the execution of representative Big Data workloads.

26

Proceedings Article•10.1109/WEMDCD51469.2021.9425641

SuMRAS: a new SPMSM Parameter Identification in Cloud Computing Environment

Donatello Costantino, +5 more

- 08 Apr 2021

TL;DR: In this article, an innovative Supervised Model Reference Adaptive System (SuMRAS) is proposed for rotor flux linkage identification in SPMSM. The proposed SuMRAS can also be applied to operating motors without requiring manual analysis; hence, it is suitable for large-scale implementation in cloud services.

14

Proceedings Article•10.1109/CCGRID.2018.00040

Addressing the challenges of executing a massive computational cluster in the cloud

Brandon Posey, +6 more

- 01 May 2018

TL;DR: This work dynamically provisions a large-scale high-performance computing cluster of more than one million cores on Amazon Web Services (AWS) and uses it to study a parameter sweep workflow composed of message-passing parallel topic modeling jobs on multiple datasets.

9

Performance Evaluation of Big Data Analysis.

Jorge Veiga, +2 more

- 01 Jan 2019

7

References

Journal Article•10.1023/A:1008202821328

Differential Evolution – A Simple and Efficient Heuristic for Global Optimization over Continuous Spaces

Rainer Storn, +1 more

- 01 Dec 1997

- Journal of Global Optimization

TL;DR: In this article, a new heuristic approach for minimizing possibly nonlinear and non-differentiable continuous space functions is presented, which requires few control variables, is robust, easy to use, and lends itself very well to parallel computation.

28.1K

Journal Article•10.21276/IJRE.2018.5.5.4

MapReduce: simplified data processing on large clusters

Jeffrey Dean, +1 more

- 06 Dec 2004

TL;DR: This paper presents the implementation of MapReduce, a programming model and an associated implementation for processing and generating large data sets that runs on a large cluster of commodity machines and is highly scalable.

22.7K

Journal Article•10.1109/TEVC.2010.2059031

Differential Evolution: A Survey of the State-of-the-Art

Swagatam Das, +1 more

- 01 Feb 2011

- IEEE Transactions on Evolutionary Comput...

TL;DR: A detailed review of the basic concepts of DE and a survey of its major variants, its application to multiobjective, constrained, large scale, and uncertain optimization problems, and the theoretical studies conducted on DE so far are presented.

5.2K

Proceedings Article

Resilient distributed datasets: a fault-tolerant abstraction for in-memory cluster computing

Matei Zaharia, +8 more

- 25 Apr 2012

TL;DR: Resilient Distributed Datasets is presented, a distributed memory abstraction that lets programmers perform in-memory computations on large clusters in a fault-tolerant manner and is implemented in a system called Spark, which is evaluated through a variety of user applications and benchmarks.

4.6K

Journal Article•10.1016/J.SWEVO.2016.01.004

Recent advances in differential evolution – An updated survey

Swagatam Das, +2 more

- 01 Apr 2016

- Swarm and evolutionary computation

TL;DR: The authors find that it is high time to provide a critical review of the latest published literature and to point out some important future avenues of research on DE.

1.5K