BeeGFS

Papers

Proceedings Article•10.1145/3465332.3470873•

SentiLog: Anomaly Detecting on Parallel File Systems via Log-based Sentiment Analysis

Di Zhang¹, Dong Dai¹, Runzhou Han², Mai Zheng²•Institutions (2)

University of North Carolina at Charlotte¹, Iowa State University²

27 Jul 2021

TL;DR: SentiLog trains a general sentiment-based natural language model on logging-relevant source code collected from a set of PFSes, learning the information that developers embedded in that code.

Abstract: As core components of high-performance computing (HPC) platforms, parallel file systems (PFSes) grow quickly in scale and complexity, hence are subject to various failures and anomalies. Identifying their anomalies at runtime is critically helpful for HPC operators and administrators. Analyzing the runtime logs to detect the anomalies of large-scale systems has been proven effective in many recent studies. However, applying such analyses to parallel file system logs faces significant challenges due to the large volume and irregularity of PFS logs. This study proposes SentiLog, a new approach to analyzing PFS system logs for detecting anomalies. Unlike existing solutions, SentiLog works by training a general sentiment-based natural language model on the logging-relevant source code collected from a set of PFSes. In this way, SentiLog learns information embedded by developers from the source code. Our preliminary results show SentiLog is able to accurately predict anomalies and performs better than state-of-the-art log analysis solutions on two representative PFSes (Lustre and BeeGFS). This preliminary study shows sentiment analysis could be a promising method to analyze complex and irregular system logs.

21 citations

Journal Article•10.1016/J.FUTURE.2017.04.030•

Maximizing the performance of scientific data transfer by optimizing the interface between parallel file systems and advanced research networks

Nicholas Mills¹, F. Alex Feltus¹, Walter B. Ligon¹•Institutions (1)

Clemson University¹

01 Feb 2018 - Future Generation Computer Systems

TL;DR: A wide variety of tests narrow in on the optimal data transfer parameters for parallel data streaming across Internet2 and between two CloudLab clusters loading real genomics data onto a parallel file system.

15 citations

Book Chapter•10.1007/978-3-030-48842-0_1•

A BeeGFS-Based Caching File System for Data-Intensive Parallel Computing

David Abramson¹, Chao Jin¹, Justin Luong¹, Jake Carroll¹•Institutions (1)

University of Queensland¹

24 Feb 2020

TL;DR: The solution unifies data access for both the internal storage and external file systems using a uniform namespace, improves storage performance by exploiting data locality across storage tiers, and increases data sharing between compute nodes and across applications.

Abstract: Modern high-performance computing (HPC) systems are increasingly using large amounts of fast storage, such as solid-state drives (SSD), to accelerate disk access times. This approach has been exemplified in the design of “burst buffers”, but more general caching systems have also been built. This paper proposes extending an existing parallel file system to provide such a file caching layer. The solution unifies data access for both the internal storage and external file systems using a uniform namespace. It improves storage performance by exploiting data locality across storage tiers, and increases data sharing between compute nodes and across applications. Leveraging data striping and meta-data partitioning, the system supports high speed parallel I/O for data intensive parallel computing. Data consistency across tiers is maintained automatically using a cache aware access algorithm. A prototype has been built using BeeGFS to demonstrate rapid access to an underlying IBM Spectrum Scale file system. Performance evaluation demonstrates a significant improvement in the efficiency over an external parallel file system.

10 citations

Journal Article•10.1016/J.PARCO.2021.102786•

Improving the I/O of large geophysical models using PnetCDF and BeeGFS

Jared Brzenski¹, Christopher Paolini¹, Jose E. Castillo¹•Institutions (1)

San Diego State University¹

01 Jul 2021

TL;DR: Using PnetCDF and BeeGFS to parallelize output, this paper significantly decreases the time two geophysical modeling applications spend saving data to disk, and analyzes the features used for PnetCDF with BeeGFS I/O optimization.

Abstract: Large scale geophysical modeling uses high performance computing systems to expedite the solutions of very large, complex systems. High disk latencies, low IOPS, and low read/write data transfer rates are relegating many numerical simulations to I/O bound jobs, where the run time is bound not by CPU rate, but by I/O rate. In this paper we seek to improve the I/O of two geophysical modeling applications and take full advantage of the parallel nature of the programs, as well as the file management system for the large output files. Parallelizing output for these programs is achieved using PnetCDF, a parallel implementation of the netCDF format, and BeeGFS, an open-source parallel file system. Using these solutions, we have significantly decreased the amount of time spent saving data to disk, and give analysis of the features used in relation to PnetCDF with BeeGFS I/O optimization.

6 citations

Proceedings Article•10.1109/HPCS48598.2019.9188216•

Using On-Demand File Systems in HPC Environments

Mehmet Soysal¹, Marco Berghoff¹, Thorsten Zirwes¹, Marc-André Vef², Sebastian Oeste, André Brinkmann¹, Wolfgang E. Nagel¹, Achim Streit¹•Institutions (2)

Karlsruhe Institute of Technology¹, University of Mainz²

15 Jul 2019

TL;DR: This work presents a simple solution for applications with very high I/O demands: create a private parallel file system on demand for an HPC job, using the node-local storage devices, e.g., solid-state disks (SSDs).

Abstract: In modern HPC systems, parallel (distributed) file systems are used to allow fast access from and to the storage infrastructure. However, I/O performance in large-scale HPC systems has failed to keep up with the increase in computational power. As a result, the I/O subsystem, which also has to cope with a large number of demanding metadata operations, is often the bottleneck of the entire HPC system. In some cases, even a single badly behaving application can be held responsible for slowing down the entire HPC system, disrupting other applications that use the same I/O subsystem. These kinds of situations are likely to become more frequent in the future with larger and more powerful HPC systems. In this work, we present a simple solution for applications with very high I/O demands. Our proposed solution is to create a private parallel file system on demand for an HPC job and use the node-local storage devices, e.g., solid-state disks (SSDs). We show that this feature is easy to add to an existing HPC environment and requires only minimal configuration of the system. We conclude that the impact on running applications is manageable and the advantages to applications that generate a high load outweigh the disadvantages. We show that in some cases applications may run slower, but the reduction of load on the global file system prevails in these cases.

5 citations
