PerfExplorer: A Performance Data Mining Framework For Large-Scale Parallel Computing
Kevin Huck, Allen D. Malony
- 12 Nov 2005
- pp 41-41
TL;DR: PerfExplorer's framework architecture supports the development and integration of data mining operations for large-scale parallel performance profiles. It is built on a parallel performance database (PerfDMF), which it uses to access the profiles and to store its analysis results.
Abstract: Parallel applications running on high-end computer systems exhibit complex performance phenomena. Tools that observe parallel performance attempt to capture these phenomena in measurement datasets rich with information relating multiple performance metrics to execution dynamics and to parameters specific to the application-system experiment. However, the potential size of these datasets and the need to assimilate results from multiple experiments make it a daunting challenge not only to process the information, but also to discover and understand performance insights. In this paper, we present PerfExplorer, a framework for parallel performance data mining and knowledge discovery. The framework architecture enables the development and integration of data mining operations that can be applied to large-scale parallel performance profiles. PerfExplorer operates as a client-server system and is built on a robust parallel performance database (PerfDMF), which it uses to access the parallel profiles and save its analysis results. Examples are given demonstrating these techniques for performance analysis of ASCI applications.
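As a rough illustration of what a data mining operation over parallel profiles might look like, the sketch below groups threads by the similarity of their per-event timings. The abstract does not name a specific operation, so the choice of k-means clustering is an assumption made here for illustration. The `profiles` data, the two event columns, and the `kmeans` helper are all hypothetical, and nothing below reflects the actual PerfExplorer or PerfDMF interfaces.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def kmeans(points, k, iterations=20):
    """Plain k-means over equal-length numeric vectors; returns (labels, centroids)."""
    # Deterministic initialisation: take k points evenly spaced through the input.
    step = len(points) // k
    centroids = [list(points[i * step]) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical profile: thread id -> [compute seconds, communication-wait seconds].
profiles = {
    0: [9.8, 1.1],
    1: [10.1, 0.9],
    2: [9.9, 1.0],
    3: [4.2, 6.8],
    4: [4.0, 7.1],
    5: [4.1, 6.9],
}

labels, centroids = kmeans(list(profiles.values()), k=2)
for thread_id, label in zip(profiles, labels):
    print(f"thread {thread_id} -> cluster {label}")
```

With this made-up data, the compute-dominated threads and the wait-dominated threads land in separate clusters. This is the kind of grouping that makes a large profile easier to inspect than reading every thread individually.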