A Frequency Scaling based Performance Indicator Framework for Big Data Systems

Open Access, Posted Content

- 27 Nov 2018

TL;DR: The paper proposes a novel indicator framework in which the impacts of different indicators can be compared directly, so that performance bottlenecks can be identified and analyzed efficiently, and it describes a methodology for constructing each indicator from the change in performance under CPU frequency scaling.

Abstract: It is important for big data systems to identify their performance bottlenecks. However, popular indicators such as resource utilization are often misleading and cannot be compared with each other. In this paper, a novel indicator framework that can directly compare the impact of different indicators is proposed to identify and analyze performance bottlenecks efficiently. A methodology that constructs each indicator from the change in performance under CPU frequency scaling is described. Spark is used as an example of a big data system, and two typical SQL benchmarks are used as workloads to evaluate the proposed method. Experimental results show that the proposed method is more accurate than the resource utilization method and easier to implement than white-box methods. In addition, analysis with the proposed indicators leads to some interesting findings and valuable performance optimization suggestions for big data systems.
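The abstract does not give the indicator's formula, so the following is only an illustrative sketch of the general idea of deriving an indicator from how performance changes with CPU frequency. It assumes a simple two-part runtime model, T(f) = a/f + b, where a/f is the portion of runtime that scales with frequency and b is everything else (for example I/O or network waits). The function name `cpu_bound_fraction`, its parameters, and the model itself are assumptions made for illustration and are not taken from the paper.

```python
def cpu_bound_fraction(freq_high_ghz, time_high_s, freq_low_ghz, time_low_s):
    """Estimate the share of runtime that scales with CPU frequency.

    Hypothetical model (an assumption, not the paper's definition):
        T(f) = a / f + b
    where a / f is frequency-dependent time and b is frequency-independent time.
    The same workload is timed at two CPU frequencies, and the result is the
    frequency-dependent share of the runtime measured at freq_high_ghz.
    """
    if freq_high_ghz <= 0 or freq_low_ghz <= 0:
        raise ValueError("frequencies must be positive")
    if freq_high_ghz == freq_low_ghz:
        raise ValueError("two distinct frequencies are required")
    if time_high_s <= 0 or time_low_s <= 0:
        raise ValueError("runtimes must be positive")

    # Solve the two equations T1 = a/f1 + b and T2 = a/f2 + b for a.
    a = (time_high_s - time_low_s) / (1.0 / freq_high_ghz - 1.0 / freq_low_ghz)

    # Frequency-dependent time at the higher frequency, as a share of total runtime.
    return (a / freq_high_ghz) / time_high_s


if __name__ == "__main__":
    # Made-up measurements: 30 s at 2.0 GHz and 50 s at 1.0 GHz.
    print(cpu_bound_fraction(2.0, 30.0, 1.0, 50.0))
```

With the made-up measurements above, the estimate is about 0.67, meaning that under this assumed model roughly two thirds of the runtime at 2.0 GHz would scale with CPU frequency. A value near 0 would suggest that the workload is limited by something other than the CPU.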
