About: BeeGFS is a research topic. Over its lifetime, 10 publications have appeared within this topic, receiving 23 citations. The topic is also known as: FhGFS.
TL;DR: SentiLog trains a general sentimental natural language model on the logging-relevant source code collected from a set of parallel file systems (PFSes), and so learns the information that developers embedded in that source code.
Abstract: As core components of high-performance computing (HPC) platforms, parallel file systems (PFSes) grow quickly in scale and complexity, and hence are subject to various failures and anomalies. Identifying their anomalies at runtime is critically helpful for HPC operators and administrators. Analyzing runtime logs to detect the anomalies of large-scale systems has been proven effective in many recent studies. However, applying such analysis to parallel file system logs faces significant challenges due to the large volume and irregularity of PFS logs. This study proposes SentiLog, a new approach to analyzing PFS system logs for detecting anomalies. Unlike existing solutions, SentiLog works by training a general sentimental natural language model based on the logging-relevant source code collected from a set of PFSes. In this way, SentiLog learns information embedded by developers in the source code. Our preliminary results show that SentiLog is able to accurately predict anomalies and performs better than state-of-the-art log analysis solutions on two representative PFSes (Lustre and BeeGFS). This preliminary study shows that sentiment analysis could be a promising method to analyze complex and irregular system logs.
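As a rough illustration of the idea in the abstract above, the following Python sketch trains a tiny bag-of-words naive Bayes classifier on message templates taken from logging calls and then scores runtime log lines. It is not SentiLog's model. The training templates, the `positive`/`negative` labels, and the assumption that the log level of a logging statement can stand in for its sentiment are all hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical training pairs: (message template from a logging call, label).
# Assumption: the log level used in the source code stands in for the label.
TRAIN = [
    ("connection to storage target lost", "negative"),
    ("failed to write chunk file", "negative"),
    ("request timed out, giving up", "negative"),
    ("unable to open metadata entry", "negative"),
    ("mount completed successfully", "positive"),
    ("connection to storage target established", "positive"),
    ("target registered", "positive"),
    ("heartbeat received from node", "positive"),
]


def tokenize(text):
    """Lowercase the text and keep alphabetic words only."""
    return re.findall(r"[a-z]+", text.lower())


def train(pairs):
    """Count words per label and documents per label."""
    counts = {"negative": Counter(), "positive": Counter()}
    docs = Counter()
    for text, label in pairs:
        counts[label].update(tokenize(text))
        docs[label] += 1
    vocab = set(counts["negative"]) | set(counts["positive"])
    return counts, docs, vocab


def classify(model, line):
    """Return the label with the highest Laplace-smoothed log-probability."""
    counts, docs, vocab = model
    total_docs = sum(docs.values())
    best_label, best_score = None, -math.inf
    for label in sorted(counts):
        total = sum(counts[label].values())
        score = math.log(docs[label] / total_docs)
        for tok in tokenize(line):
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((counts[label][tok] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAIN)
print(classify(model, "write failed: connection lost"))
print(classify(model, "mount completed successfully"))
```

Because the model is trained on source-code templates rather than on collected logs, it can be applied to log lines from a system it has never observed at runtime, which is the property the abstract emphasizes.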
TL;DR: A wide variety of tests narrow in on the optimal data transfer parameters for parallel data streaming across Internet2 and between two CloudLab clusters loading real genomics data onto a parallel file system.
TL;DR: The solution unifies data access for both the internal storage and external file systems using a uniform namespace, improves storage performance by exploiting data locality across storage tiers, and increases data sharing between compute nodes and across applications.
Abstract: Modern high-performance computing (HPC) systems are increasingly using large amounts of fast storage, such as solid-state drives (SSDs), to accelerate disk access times. This approach has been exemplified in the design of “burst buffers”, but more general caching systems have also been built. This paper proposes extending an existing parallel file system to provide such a file caching layer. The solution unifies data access for both the internal storage and external file systems using a uniform namespace. It improves storage performance by exploiting data locality across storage tiers, and increases data sharing between compute nodes and across applications. Leveraging data striping and metadata partitioning, the system supports high-speed parallel I/O for data-intensive parallel computing. Data consistency across tiers is maintained automatically using a cache-aware access algorithm. A prototype has been built using BeeGFS to demonstrate rapid access to an underlying IBM Spectrum Scale file system. Performance evaluation demonstrates a significant improvement in efficiency over an external parallel file system.
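The data striping mentioned in the abstract above can be illustrated with a generic round-robin layout, in which consecutive fixed-size chunks of a file are placed on consecutive storage targets. The Python sketch below is only an assumption about what such a mapping looks like; the function name `locate`, the 512 KiB chunk size, and the four targets are hypothetical and are not taken from the paper or from BeeGFS.

```python
def locate(offset, chunk_size, num_targets):
    """Map a byte offset in a file to (target index, byte offset on that target).

    Generic round-robin (RAID-0 style) striping: chunk k of the file is stored
    on target k % num_targets, as that target's (k // num_targets)-th chunk.
    """
    if offset < 0 or chunk_size <= 0 or num_targets <= 0:
        raise ValueError("offset must be >= 0; chunk_size and num_targets must be > 0")
    chunk_index = offset // chunk_size
    target = chunk_index % num_targets
    local_offset = (chunk_index // num_targets) * chunk_size + offset % chunk_size
    return target, local_offset


CHUNK = 512 * 1024  # hypothetical chunk size in bytes
TARGETS = 4         # hypothetical number of storage targets

# The first four chunks land on targets 0..3; the fifth wraps back to target 0.
for k in range(5):
    print(k, locate(k * CHUNK, CHUNK, TARGETS))
```

With this layout, a large sequential read or write touches all targets in turn, so clients can issue requests to several storage servers in parallel.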
TL;DR: This paper significantly decreases the time spent saving data to disk by parallelizing output with PnetCDF on BeeGFS, and analyzes the features used in relation to PnetCDF with BeeGFS I/O optimization.
Abstract: Large-scale geophysical modeling uses high-performance computing systems to expedite the solutions of very large, complex systems. High disk latencies, low IOPS, and low read/write data transfer rates are relegating many numerical simulations to I/O-bound jobs, where the run time is bound not by CPU rate, but by I/O rate. In this paper we seek to improve the I/O of two geophysical modeling applications and take full advantage of the parallel nature of the programs, as well as the file management system for the large output files. Parallelizing output for these programs is achieved using PnetCDF, a parallel implementation of the netCDF format, and BeeGFS, an open-source parallel file system. Using these solutions, we have significantly decreased the amount of time spent saving data to disk, and we give an analysis of the features used in relation to PnetCDF with BeeGFS I/O optimization.
TL;DR: This work presents a simple solution for applications with very high I/O demands: create a private parallel file system on demand for an HPC job, using the node-local storage devices, e.g. solid-state disks (SSDs).
Abstract: In modern HPC systems, parallel (distributed) file systems are used to allow fast access from and to the storage infrastructure. However, I/O performance in large-scale HPC systems has failed to keep up with the increase in computational power. As a result, the I/O subsystem, which also has to cope with a large number of demanding metadata operations, is often the bottleneck of the entire HPC system. In some cases, even a single badly behaving application can be held responsible for slowing down the entire HPC system, disrupting other applications that use the same I/O subsystem. These kinds of situations are likely to become more frequent in the future with larger and more powerful HPC systems. In this work, we present a simple solution for applications with very high I/O demands. Our proposed solution is to create a private parallel file system on demand for an HPC job and use the node-local storage devices, e.g. solid-state disks (SSDs). We show that this feature is easy to add to an existing HPC environment and requires only minimal configuration of the system. We conclude that the impact on running applications is manageable and that the advantages to applications that generate a high load outweigh the disadvantages. We show that in some cases applications may run slower, but that the reduction of load on the global file system prevails in these cases.