TL;DR: A file server architecture is described in which a network controller unit, a file controller unit, and a storage processor unit each have their own processor and operate in parallel with a local Unix host processor.
Abstract: A file server architecture is disclosed, comprising, as separate processors, a network controller unit, a file controller unit and a storage processor unit. These units incorporate their own processors, and operate in parallel with a local Unix host processor. All networks are connected to the network controller unit, which performs all protocol processing up through the NFS layer. The virtual file system is implemented in the file controller unit, and the storage processor provides high-speed multiplexed access to an array of mass storage devices. The file controller unit controls file information caching through its own local cache buffer, and controls disk data caching through a large system memory which is accessible on a bus by any of the processors.
TL;DR: This work presents a mechanism that reclaims space from incidental file duplication in the Farsite distributed file system, making that space available for controlled file replication. It includes convergent encryption, which enables duplicate files to be coalesced into the space of a single file even if the files are encrypted with different users' keys.
Abstract: The Farsite distributed file system provides availability by replicating each file onto multiple desktop computers. Since this replication consumes significant storage space, it is important to reclaim used space where possible. Measurement of over 500 desktop file systems shows that nearly half of all consumed space is occupied by duplicate files. We present a mechanism to reclaim space from this incidental duplication to make it available for controlled file replication. Our mechanism includes: (1) convergent encryption, which enables duplicate files to be coalesced into the space of a single file, even if the files are encrypted with different users' keys; and (2) SALAD, a Self-Arranging Lossy Associative Database for aggregating file content and location information in a decentralized, scalable, fault-tolerant manner. Large-scale simulation experiments show that the duplicate-file coalescing system is scalable, highly effective, and fault-tolerant.
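The idea behind convergent encryption can be illustrated with a minimal sketch. It assumes the common formulation in which the encryption key is derived from a hash of the file's own contents and that key is then wrapped under each user's key; the abstract does not spell out these details. The function names and the SHA-256 counter-mode keystream are hypothetical stand-ins chosen to keep the example dependency-free, and they are not a vetted cipher or Farsite's actual construction.

```python
import hashlib


def _keystream(key: bytes, n: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode). Illustrative only, not secure."""
    out = bytearray()
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:n])


def _xor(data: bytes, stream: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, stream))


def convergent_encrypt(plaintext: bytes, user_key: bytes):
    """Return (ciphertext, wrapped_key).

    The content key depends only on the plaintext, so identical files yield
    identical ciphertext regardless of which user encrypts them.
    """
    content_key = hashlib.sha256(plaintext).digest()
    ciphertext = _xor(plaintext, _keystream(content_key, len(plaintext)))
    # Assumption: each user stores the content key wrapped under their own key.
    wrapped_key = _xor(content_key, _keystream(user_key, len(content_key)))
    return ciphertext, wrapped_key


def convergent_decrypt(ciphertext: bytes, wrapped_key: bytes, user_key: bytes) -> bytes:
    content_key = _xor(wrapped_key, _keystream(user_key, len(wrapped_key)))
    return _xor(ciphertext, _keystream(content_key, len(ciphertext)))
```

Because two users who encrypt the same file produce byte-identical ciphertext, a storage host can detect and coalesce the duplicates without learning either user's key; only the small wrapped keys differ per user.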
TL;DR: The authors propose a method for reducing both the storage requirement of a backup subsystem and the load on transmission bandwidth in a client/server environment, by maintaining base files on the server in a segmented compressed format and storing only delta files for modified versions.
Abstract: In a client/server environment, a method and means are disclosed for reducing the storage requirement in the backup subsystem and further reducing the load on the transmission bandwidth, where base files are maintained on the server in a segmented compressed format. When a file is modified on the client, the file is transmitted to the server and compared with the segmented compressed base version of the file utilizing a differencing function but without decompressing the entire base file. A delta file, which is the difference between the compressed base file and the modified version of the file, is created and stored on a storage medium which is part of the backup subsystem. Alternatively, copies of frequently accessed base files are maintained on the client in a compressed format. Whenever the client detects that a frequently accessed file has been modified, the modified version of the file is differenced against the base version of that file without decompressing the entire base file and a delta file is generated. The delta file is then transmitted to the server and stored on the server's storage medium, to be utilized either immediately or at a later time to update the base version of the modified file on the server.
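One way to picture differencing against a segmented compressed base is the sketch below. It assumes fixed-size segments that are each compressed independently with zlib, so that only one segment is decompressed at a time during comparison. The segment size, the function names, and the delta layout are hypothetical, and a real differencing function would likely also handle insertions that shift segment boundaries, which this sketch does not.

```python
import zlib

SEGMENT_SIZE = 4096  # hypothetical segment size, not taken from the source


def compress_segments(data: bytes, size: int = SEGMENT_SIZE) -> list:
    """Store a base file as independently compressed fixed-size segments."""
    return [zlib.compress(data[i:i + size]) for i in range(0, len(data), size)]


def make_delta(base_segments: list, modified: bytes, size: int = SEGMENT_SIZE) -> dict:
    """Compare the modified file with the base one segment at a time.

    Only the segment being compared is decompressed; the delta records the
    compressed form of each segment that differs from the base.
    """
    segment_count = (len(modified) + size - 1) // size
    changed = {}
    for idx in range(segment_count):
        new_seg = modified[idx * size:(idx + 1) * size]
        old_seg = zlib.decompress(base_segments[idx]) if idx < len(base_segments) else None
        if new_seg != old_seg:
            changed[idx] = zlib.compress(new_seg)
    return {"segment_count": segment_count, "changed": changed}


def apply_delta(base_segments: list, delta: dict) -> list:
    """Produce the updated segmented compressed base from the old base and a delta."""
    updated = []
    for idx in range(delta["segment_count"]):
        if idx in delta["changed"]:
            updated.append(delta["changed"][idx])
        else:
            updated.append(base_segments[idx])
    return updated
```

With this layout, a one-byte change in a large file yields a delta containing a single compressed segment, which is what would be stored or transmitted instead of the whole file.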
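TL;DR: A plurality of data mover computers each control access to a respective file system, and a client serviced by any data mover can access any of the file systems: the data mover that controls the file system locks the file and returns its metadata, and the data mover that received the request then accesses the file data over a bypass path.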
Abstract: A plurality of data mover computers control access to respective file systems in data storage. A network client serviced by any of the data movers can access each of the file systems. If a data mover receives a client request for access to a file in a file system to which access is controlled by another data mover, then the data mover that received the client request sends a metadata request to the data mover that controls access to the file system. The data mover that controls access to the file system responds by placing a lock on the file and returning metadata of the file. The data mover that received the client request uses the metadata to formulate a data access command that is used to access the file data in the file system over a bypass data path that bypasses the data mover computer that controls access to the file system.
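The request flow in this abstract can be modeled with a small in-memory sketch. All class and method names here are hypothetical, the metadata is reduced to a list of block identifiers, and the explicit lock release after the read is an assumption that the abstract does not describe.

```python
class SharedStorage:
    """Data storage reachable by every data mover (the bypass data path)."""

    def __init__(self):
        self.blocks = {}  # block id -> bytes


class DataMover:
    def __init__(self, name, storage):
        self.name = name
        self.storage = storage
        self.metadata = {}  # (fs, path) -> block ids, for file systems this mover controls
        self.locks = {}     # (fs, path) -> name of the mover holding the lock
        self.owners = {}    # fs -> DataMover that controls access to that file system

    def create_file(self, fs, path, data, block_size=4):
        ids = []
        for i in range(0, len(data), block_size):
            block_id = (fs, path, i)
            self.storage.blocks[block_id] = data[i:i + block_size]
            ids.append(block_id)
        self.metadata[(fs, path)] = ids

    def metadata_request(self, fs, path, requester):
        """Owner side: place a lock on the file and return its metadata."""
        key = (fs, path)
        holder = self.locks.get(key)
        if holder is not None and holder != requester:
            raise RuntimeError(f"{path} is locked by {holder}")
        self.locks[key] = requester
        return list(self.metadata[key])

    def release(self, fs, path, requester):
        # Assumption: the requester releases the lock once the access completes.
        if self.locks.get((fs, path)) == requester:
            del self.locks[(fs, path)]

    def read(self, fs, path):
        """Receiver side: get metadata from the owner, then read storage directly."""
        owner = self.owners.get(fs, self)
        block_ids = owner.metadata_request(fs, path, self.name)
        try:
            # The file data does not pass through the owning data mover.
            return b"".join(self.storage.blocks[b] for b in block_ids)
        finally:
            owner.release(fs, path, self.name)
```

For example, if mover A controls `fs1` and mover B has `owners["fs1"]` set to A, a call to `B.read("fs1", path)` sends only the metadata request to A and fetches the blocks from the shared storage itself.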
TL;DR: An archiving file system automatically archives remote files across multiple types of secondary storage media on network data servers, based on a set of hierarchically selectable archival attributes selectively assigned to each remote file.
Abstract: An archiving file system is specifically designed to support the storage of, and access to, remote files (42) stored on high speed, large capacity network data servers (14). The archiving file system automatically archives remote files (42) across multiple types of secondary storage media (46, 48) on such network data servers (14), based on a set of hierarchically selectable archival attributes selectively assigned to each remote file (42). The archiving file system is completely transparent to the user program (22) and operates on remote files (42), by providing a different file control program (40) and a different file structure on the network data server (14), without the need to modify the standard file system (24) that is native to a particular operating system program (20) executing on the user nodes (10) or the standard network file interfaces (34) executing on the distributed computer network environment (12).