Data file

Papers published on a yearly basis

Papers

Book

SPSS survival manual : a step by step guide to data analysis using SPSS

Julie Pallant

1 Jan 2010

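TL;DR: This book is a step-by-step guide to data analysis with SPSS. It covers designing a study, preparing and cleaning the data file, preliminary analyses (descriptive statistics, graphs, checking the reliability of a scale, choosing the right statistic), and statistical techniques for exploring relationships among variables and comparing groups.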

Abstract (table of contents):
Preface; Data files and website; Introduction and overview
Part One: Getting started (Designing a study; Preparing a codebook; Getting to know SPSS)
Part Two: Preparing the data file (Creating a data file and entering data; Screening and cleaning the data)
Part Three: Preliminary analyses (Descriptive statistics; Using graphs to describe and explore the data; Manipulating the data; Checking the reliability of a scale; Choosing the right statistic)
Part Four: Statistical techniques to explore relationships among variables (Correlation; Partial correlation; Multiple regression; Logistic regression; Factor analysis)
Part Five: Statistical techniques to compare groups (Non-parametric statistics; T-tests; One-way analysis of variance; Two-way between-groups ANOVA; Mixed between-within subjects analysis of variance; Multivariate analysis of variance; Analysis of covariance)
Appendix: Details of data files; Recommended reading; References; Index

6,868 citations

Proceedings Article • DOI: 10.1117/12.671760

CIAO: Chandra's data analysis system

Antonella Fruscione¹, Jonathan C. McDowell¹, Glenn E. Allen², Nancy S. Brickhouse¹, Douglas Burke¹, John E. Davis², Nick Durham¹, Martin Elvis¹, Elizabeth C. Galle¹, Daniel E. Harris¹, David P. Huenemoerder², John C. Houck², Bish Ishibashi², Margarita Karovska¹, Fabrizio Nicastro¹, Michael S. Noble², Michael A. Nowak², F. A. Primini¹, Aneta Siemiginowska¹, Randall K. Smith³, Michael W. Wise⁴

Smithsonian Astrophysical Observatory¹, Massachusetts Institute of Technology², Goddard Space Flight Center³, ASTRON⁴

30 Jun 2006 - Proceedings of SPIE

TL;DR: The CIAO (Chandra Interactive Analysis of Observations) software package was first released in 1999 following the launch of the Chandra X-ray Observatory and is used by astronomers across the world to analyze Chandra data as well as data from other telescopes.

Abstract: The CIAO (Chandra Interactive Analysis of Observations) software package was first released in 1999 following the launch of the Chandra X-ray Observatory and is used by astronomers across the world to analyze Chandra data as well as data from other telescopes. From the earliest design discussions, CIAO was planned as a general-purpose scientific data analysis system optimized for X-ray astronomy, and consists mainly of command line tools (allowing easy pipelining and scripting) with a parameter-based interface layered on a flexible data manipulation I/O library. The same code is used for the standard Chandra archive pipeline, allowing users to recalibrate their data in a consistent way. We will discuss the lessons learned from the first six years of the software's evolution. Our initial approach to documentation evolved to concentrate on recipe-based "threads" which have proved very successful. A multi-dimensional abstract approach to data analysis has allowed new capabilities to be added while retaining existing interfaces. A key requirement for our community was interoperability with other data analysis systems, leading us to adopt standard file formats and an architecture which was as robust as possible to the input of foreign data files, as well as re-using a number of external libraries. We support users who are comfortable with coding themselves via a flexible user scripting paradigm, while the availability of tightly constrained pipeline programs is of benefit to less computationally-advanced users. As with other analysis systems, we have found that infrastructure maintenance and re-engineering is a necessary and significant ongoing effort and needs to be planned into any long-lived astronomy software.

1,475 citations

Patent

Method and system for using file systems for content management

Sanjoy Chatterjee¹, George Ericsson¹, Roy E. Clark¹

EMC Corporation¹

28 Sep 2001

TL;DR: This patent describes a file system and method for creating and managing content. A repository of metadata provides information about the data in the files, and phantom files point to data in base files without specifying a path name to those base files.

Abstract: A file system and method serve to create and manage content. The file system includes at least one directory having at least one file containing data, but about which at least one file has no information. A repository of metadata provides information about the data in the files. Phantom files are created which are designated by names and associated attributes and which point to data in base files, without specifying a path name to the base files.

1,110 citations

Patent

Systems and methods for backing up data files

Christopher Midgley, John Webb

16 Dec 1999

TL;DR: This patent describes systems and methods for continuous back up of data stored on a computer network. A synchronization process replicates selected source data files to a back up server as target data files, while agents capture byte-level changes to the source files in journal files that are transferred to the back up server and written to the target files.

Abstract: The invention provides systems and methods for continuous back up of data stored on a computer network. To this end the systems of the invention include a synchronization process that replicates selected source data files stored on the network to create a corresponding set of replicated data files, called the target data files, that are stored on a back up server. This synchronization process builds a baseline data structure of target data files. In parallel to this synchronization process, the system includes a dynamic replication process that includes a plurality of agents, each of which monitors a portion of the source data files to detect and capture, at the byte-level, changes to the source data files. Each agent may record the changes to a respective journal file, and as the dynamic replication process detects that the journal files contain data, the journal files are transferred or copied to the back up server so that the captured changes can be written to the appropriate ones of the target data files.
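
The baseline-plus-journal idea in the abstract above can be illustrated with a minimal sketch. It is not the patent's implementation: in-memory `bytearray` objects stand in for the source file and for the target file on the back up server, and the names `Change`, `capture_write`, and `apply_journal` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Change:
    """One byte-level change: the bytes written and where they start."""
    offset: int
    data: bytes


def _write_at(buf: bytearray, offset: int, data: bytes) -> None:
    """Write data into buf at offset, zero-padding if the write extends the buffer."""
    end = offset + len(data)
    if end > len(buf):
        buf.extend(b"\x00" * (end - len(buf)))
    buf[offset:end] = data


def capture_write(source: bytearray, journal: List[Change],
                  offset: int, data: bytes) -> None:
    """Apply a write to the source and record it in the journal (the agent's role)."""
    _write_at(source, offset, data)
    journal.append(Change(offset, bytes(data)))


def apply_journal(target: bytearray, journal: List[Change]) -> None:
    """Replay journaled changes onto the target in order, then empty the journal."""
    for change in journal:
        _write_at(target, change.offset, change.data)
    journal.clear()


if __name__ == "__main__":
    source = bytearray(b"hello world")
    target = bytearray(source)          # baseline copy from synchronization
    journal: List[Change] = []
    capture_write(source, journal, 6, b"there")
    apply_journal(target, journal)      # journal "transferred" to the server
    print(target == source)
```

Only the changed bytes travel to the target after the baseline copy exists, which is the point of capturing changes at the byte level instead of recopying whole files.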

969 citations

Proceedings Article • DOI: 10.1109/ICDCS.2002.1022312

Reclaiming space from duplicate files in a serverless distributed file system

John R. Douceur¹, Atul Adya¹, William J. Bolosky¹, P. Simon¹, Marvin M. Theimer¹

Microsoft¹

2 Jul 2002

TL;DR: This work presents a mechanism to reclaim the space occupied by incidentally duplicated files in the Farsite distributed file system, making it available for controlled file replication. The mechanism includes convergent encryption, which enables duplicate files to be coalesced into the space of a single file even if the files are encrypted with different users' keys.

Abstract: The Farsite distributed file system provides availability by replicating each file onto multiple desktop computers. Since this replication consumes significant storage space, it is important to reclaim used space where possible. Measurement of over 500 desktop file systems shows that nearly half of all consumed space is occupied by duplicate files. We present a mechanism to reclaim space from this incidental duplication to make it available for controlled file replication. Our mechanism includes: (1) convergent encryption, which enables duplicate files to be coalesced into the space of a single file, even if the files are encrypted with different users' keys; and (2) SALAD, a Self-Arranging Lossy Associative Database for aggregating file content and location information in a decentralized, scalable, fault-tolerant manner. Large-scale simulation experiments show that the duplicate-file coalescing system is scalable, highly effective, and fault-tolerant.
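
The convergent encryption idea named in the abstract above can be sketched as follows: the encryption key is derived from the file's own content, so identical files produce identical ciphertext regardless of who stores them, and each user protects only the small content key with a personal key. This is an illustration under loud assumptions, not Farsite's code: the XOR keystream built from SHA-256 is a toy stand-in for a real cipher and is not secure, and every function name here is hypothetical.

```python
import hashlib
from typing import Tuple


def _keystream(key: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode). Illustration only, NOT secure."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def _xor(data: bytes, stream: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, stream))


def convergent_encrypt(plaintext: bytes) -> Tuple[bytes, bytes]:
    """Derive the key from the content itself, then encrypt deterministically."""
    content_key = hashlib.sha256(plaintext).digest()
    ciphertext = _xor(plaintext, _keystream(content_key, len(plaintext)))
    return content_key, ciphertext


def convergent_decrypt(content_key: bytes, ciphertext: bytes) -> bytes:
    return _xor(ciphertext, _keystream(content_key, len(ciphertext)))


def wrap_key(content_key: bytes, user_key: bytes) -> bytes:
    """Protect the content key under a user's own key (toy wrapping)."""
    return _xor(content_key, _keystream(user_key, len(content_key)))


def unwrap_key(wrapped: bytes, user_key: bytes) -> bytes:
    return _xor(wrapped, _keystream(user_key, len(wrapped)))


if __name__ == "__main__":
    store = {}  # ciphertext digest -> ciphertext, shared by all users
    for user_key in (b"alice-secret", b"bob-secret"):
        key, ct = convergent_encrypt(b"same file contents")
        store[hashlib.sha256(ct).hexdigest()] = ct
    print(len(store))  # both users' copies coalesce into one stored blob
```

Because the ciphertexts match byte for byte, a storage host can detect and coalesce duplicates without ever seeing the plaintext or either user's key.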

811 citations

Year	Papers
2026	6
2025	55
2024	83
2023	190
2022	214
2021	224
