External Data Representation

Papers published on a yearly basis

Papers

Journal Article • 10.1145/357980.358007

A relational model of data for large shared data banks

E. F. Codd¹

IBM¹

01 Jun 1970 - Communications of the ACM

TL;DR: In this article, a model based on n-ary relations, a normal form for data base relations, and the concept of a universal data sublanguage are introduced, and certain operations on relations are discussed and applied to the problems of redundancy and consistency in the user's model.

Abstract: Future users of large data banks must be protected from having to know how the data is organized in the machine (the internal representation). A prompting service which supplies such information is not a satisfactory solution. Activities of users at terminals and most application programs should remain unaffected when the internal representation of data is changed and even when some aspects of the external representation are changed. Changes in data representation will often be needed as a result of changes in query, update, and report traffic and natural growth in the types of stored information. Existing noninferential, formatted data systems provide users with tree-structured files or slightly more general network models of the data. In Section 1, inadequacies of these models are discussed. A model based on n-ary relations, a normal form for data base relations, and the concept of a universal data sublanguage are introduced. In Section 2, certain operations on relations (other than logical inference) are discussed and applied to the problems of redundancy and consistency in the user's model.

5,496 citations

A Relational Model of Data for Large Shared Data Banks

E. F. Codd

1 Jan 1970

TL;DR: In this paper, a model based on n-ary relations, a normal form for data base relations, and the concept of a universal data sublanguage are introduced, and certain operations on relations are discussed and applied to the problems of redundancy and consistency in the user's model.

Abstract: Future users of large data banks must be protected from having to know how the data is organized in the machine (the internal representation). A prompting service which supplies such information is not a satisfactory solution. Activities of users at terminals and most application programs should remain unaffected when the internal representation of data is changed and even when some aspects of the external representation are changed. Changes in data representation will often be needed as a result of changes in query, update, and report traffic and natural growth in the types of stored information. Existing noninferential, formatted data systems provide users with tree-structured files or slightly more general network models of the data. In Section 1, inadequacies of these models are discussed. A model based on n-ary relations, a normal form for data base relations, and the concept of a universal data sublanguage are introduced. In Section 2, certain operations on relations (other than logical inference) are discussed and applied to the problems of redundancy and consistency in the user's model.

4,416 citations

Journal Article • 10.1109/TPAMI.2010.231

Graph Regularized Nonnegative Matrix Factorization for Data Representation

Deng Cai¹, Xiaofei He¹, Jiawei Han², Thomas S. Huang²

Zhejiang University¹, University of Illinois at Urbana–Champaign²

01 Aug 2011 - IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: In GNMF, an affinity graph is constructed to encode the geometrical information and a matrix factorization is sought, which respects the graph structure, and the empirical study shows encouraging results of the proposed algorithm in comparison to the state-of-the-art algorithms on real-world problems.

Abstract: Matrix factorization techniques have been frequently applied in information retrieval, computer vision, and pattern recognition. Among them, Nonnegative Matrix Factorization (NMF) has received considerable attention due to its psychological and physiological interpretation of naturally occurring data whose representation may be parts based in the human brain. On the other hand, from the geometric perspective, the data is usually sampled from a low-dimensional manifold embedded in a high-dimensional ambient space. One then hopes to find a compact representation, which uncovers the hidden semantics and simultaneously respects the intrinsic geometric structure. In this paper, we propose a novel algorithm, called Graph Regularized Nonnegative Matrix Factorization (GNMF), for this purpose. In GNMF, an affinity graph is constructed to encode the geometrical information and we seek a matrix factorization, which respects the graph structure. Our empirical study shows encouraging results of the proposed algorithm in comparison to the state-of-the-art algorithms on real-world problems.

2,543 citations

Book

Principles of program analysis

Flemming Nielson, Hanne Riis Nielson, Chris Hankin

22 Oct 1999

TL;DR: This book is unique in providing an overview of the four major approaches to program analysis: data flow analysis, constraint-based analysis, abstract interpretation, and type and effect systems.

Abstract: Program analysis utilizes static techniques for computing reliable information about the dynamic behavior of programs. Applications include compilers (for code improvement), software validation (for detecting errors) and transformations between data representation (for solving problems such as Y2K). This book is unique in providing an overview of the four major approaches to program analysis: data flow analysis, constraint-based analysis, abstract interpretation, and type and effect systems. The presentation illustrates the extensive similarities between the approaches, helping readers to choose the best one to utilize.

2,199 citations

Journal Article • 10.1016/J.PARCO.2004.04.001

The ganglia distributed monitoring system: design, implementation, and experience

Matt Massie¹, Brent N. Chun², David E. Culler¹

University of California, Berkeley¹, Intel²

1 Jul 2004

TL;DR: The design, implementation, and evaluation of Ganglia are presented along with experience gained through real world deployments on systems of widely varying scale, configurations, and target application domains over the last two and a half years.

Abstract: Ganglia is a scalable distributed monitoring system for high performance computing systems such as clusters and Grids. It is based on a hierarchical design targeted at federations of clusters. It relies on a multicast-based listen/announce protocol to monitor state within clusters and uses a tree of point-to-point connections amongst representative cluster nodes to federate clusters and aggregate their state. It leverages widely used technologies such as XML for data representation, XDR for compact, portable data transport, and RRDtool for data storage and visualization. It uses carefully engineered data structures and algorithms to achieve very low per-node overheads and high concurrency. The implementation is robust, has been ported to an extensive set of operating systems and processor architectures, and is currently in use on over 500 clusters around the world. This paper presents the design, implementation, and evaluation of Ganglia along with experience gained through real world deployments on systems of widely varying scale, configurations, and target application domains over the last two and a half years.
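
As a rough illustration of the "XDR for compact, portable data transport" point in this abstract, the sketch below encodes a metric name and an integer value following the general XDR rules (4-byte big-endian integers, length-prefixed strings zero-padded to a multiple of 4 bytes). The record layout and the function names (xdr_int, xdr_string, encode_metric, decode_metric) are assumptions made for this example; they are not Ganglia's actual wire format or API.

```python
import struct


def xdr_int(value: int) -> bytes:
    """Encode a signed 32-bit integer as 4 big-endian bytes."""
    return struct.pack(">i", value)


def xdr_string(text: str) -> bytes:
    """Encode a string as a 4-byte length plus bytes, zero-padded to 4."""
    data = text.encode("ascii")
    padding = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * padding


def encode_metric(name: str, value: int) -> bytes:
    """Hypothetical record layout: metric name, then integer value."""
    return xdr_string(name) + xdr_int(value)


def decode_metric(buf: bytes) -> tuple[str, int]:
    """Inverse of encode_metric for the hypothetical layout above."""
    (length,) = struct.unpack_from(">I", buf, 0)
    name = buf[4:4 + length].decode("ascii")
    # Skip the name bytes and their padding to reach the integer.
    offset = 4 + length + (4 - length % 4) % 4
    (value,) = struct.unpack_from(">i", buf, offset)
    return name, value


if __name__ == "__main__":
    packet = encode_metric("cpu_num", 8)
    print(packet.hex())
    print(decode_metric(packet))
```

Because every field is padded to a 4-byte boundary and uses a fixed byte order, a receiver on a different processor architecture can decode the record without knowing anything about the sender's native layout.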

1,534 citations

Year	Papers
2025	3
2024	6
2023	23
2022	58
2021	201
2020	183

Related Topics (5)

Performance Metrics