Rematerialization

Papers

Register allocation & spilling via graph coloring

[...]

01 Jun 1982-Sigplan Notices

TL;DR: A previous paper reported the successful use of graph coloring techniques for global register allocation in an experimental PL/I optimizing compiler; this paper considers the case where the compiler cannot color the...

Abstract: In a previous paper we reported the successful use of graph coloring techniques for doing global register allocation in an experimental PL/I optimizing compiler. When the compiler cannot color the ...

885 citations

Journal Article•10.1145/177492.177575•

Improvements to graph coloring register allocation

Preston Briggs¹, Keith D. Cooper¹, Linda Torczon¹•Institutions (1)

Rice University¹

01 May 1994-ACM Transactions on Programming Languages and Systems

TL;DR: This paper describes two improvements to Chaitin-style graph coloring register allocators, optimistic coloring and rematerialization, and presents experimental data on the performance of several versions of the register allocator on a suite of FORTRAN programs.

Abstract: We describe two improvements to Chaitin-style graph coloring register allocators. The first, optimistic coloring, uses a stronger heuristic to find a k-coloring for the interference graph. The second extends Chaitin's treatment of rematerialization to handle a larger class of values. These techniques are complementary. Optimistic coloring decreases the number of procedures that require spill code and reduces the amount of spill code when spilling is unavoidable. Rematerialization lowers the cost of spilling some values. This paper describes both of the techniques and our experience building and using register allocators that incorporate them. It provides a detailed description of optimistic coloring and rematerialization. It presents experimental data to show the performance of several versions of the register allocator on a suite of FORTRAN programs. It discusses several insights that we discovered only after repeated implementation of these allocators.

429 citations
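
The optimistic coloring idea summarized in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `optimistic_color`, the dictionary-based graph, and the "highest remaining degree" choice of spill candidate are assumptions made for illustration (a real allocator would weigh spill costs). The point it shows is that a node with k or more neighbors is pushed on the stack anyway, and is spilled only if no color turns out to be free during selection.

```python
def optimistic_color(graph, k):
    """Color an interference graph with k colors, Briggs-style (sketch).

    graph: dict mapping each node to the set of its neighbors (undirected).
    Returns (coloring, spilled): a dict node -> color index, and the list
    of nodes that could not be colored.
    """
    degrees = {n: len(adj) for n, adj in graph.items()}
    remaining = set(graph)
    stack = []

    # Simplify phase: remove nodes one at a time, pushing them on a stack.
    while remaining:
        low = [n for n in remaining if degrees[n] < k]
        if low:
            node = min(low)  # trivially colorable; min() keeps it deterministic
        else:
            # A Chaitin-style allocator would mark a spill here. The optimistic
            # variant removes a candidate anyway and defers the decision.
            # Assumption: pick the node with the highest remaining degree.
            node = max(remaining, key=lambda n: (degrees[n], n))
        remaining.remove(node)
        stack.append(node)
        for m in graph[node]:
            if m in remaining:
                degrees[m] -= 1

    # Select phase: pop nodes and give each the first color its neighbors lack.
    coloring, spilled = {}, []
    while stack:
        node = stack.pop()
        used = {coloring[m] for m in graph[node] if m in coloring}
        free = [c for c in range(k) if c not in used]
        if free:
            coloring[node] = free[0]
        else:
            spilled.append(node)  # only now is the node actually spilled
    return coloring, spilled


# A 4-cycle: every node has degree 2, so with k = 2 no node is "trivially"
# colorable, yet the graph is 2-colorable and nothing needs to be spilled.
cycle = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
cycle_coloring, cycle_spilled = optimistic_color(cycle, 2)

# A triangle genuinely needs 3 colors, so one node is spilled with k = 2.
triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
tri_coloring, tri_spilled = optimistic_color(triangle, 2)
```

On the 4-cycle, a pessimistic simplify phase would have to spill because no node has fewer than two neighbors, while the deferred decision finds a valid 2-coloring. On the triangle, one node still ends up spilled, since no 2-coloring exists.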

Journal Article•10.1002/(SICI)1097-024X(199608)26:8<929::AID-SPE40>3.3.CO;2-K•

Optimal and near-optimal global register allocations using 0–1 integer programming

David W. Goodwin¹, Kent Wilken¹•Institutions (1)

University of California, Davis¹

01 Aug 1996-Software - Practice and Experience

TL;DR: A fundamentally new approach to global register allocation is presented that optimally allocates registers and optimally places spill code, significantly decreasing spill code overhead compared with the traditional graph-coloring approach.

Abstract: This paper presents a fundamentally new approach to global register allocation that optimally allocates registers and optimally places spill code, significantly decreasing spill code overhead compared with the traditional graph-coloring approach. The Optimal Register Allocation (ORA) approach formulates global register allocation as a 0–1 integer programming problem, incorporating all aspects of register allocation within a unified framework, including copy elimination, live range splitting, rematerialization, callee and caller register spilling, special instruction-operand requirements, and paired registers. A prototype ORA allocator is built into the GNU C Compiler (GCC). For the SPEC92 integer benchmarks, the ORA allocator actually produces a net decrease of more than 100 million cycles across the entire benchmark set, because the dynamic copies the ORA allocator removes exceed the dynamic loads and stores that are inserted. In contrast, the GCC allocator and a Chaitin-style graph-coloring allocator each cause a net increase of more than 1 billion cycles. Because global register allocation is NP-complete, optimal register allocation has been considered intractable. However, the run-time complexity of the ORA approach is shown experimentally to be O(n^3). A profile-guided hybrid allocation approach is proposed that uses the ORA allocator for the performance-critical regions in the performance-critical functions, while using a graph-coloring allocator for the non-critical functions and regions. An ORA-GCC hybrid allocator takes an average of 4.6 seconds per function to produce an allocation that is within 1% of optimal for 97% of the SPEC92 integer benchmark functions, showing that the hybrid allocator is practical as an advanced optimization for performance-critical codes.

115 citations

Proceedings Article•10.1145/115790.115833•

Function materialization in object bases

Alfons Kemper¹, Christoph Kilger¹, Guido Moerkotte¹•Institutions (1)

Karlsruhe Institute of Technology¹

1 Apr 1991

TL;DR: This paper describes function materialization as an optimization concept in object-oriented databases and concludes with a quantitative analysis of function materialization based on a sample performance benchmark obtained from the experimental object base system GOM.

Abstract: We describe function materialization as an optimization concept in object-oriented databases. Exploiting the object-oriented paradigm (namely classification, object identity, and encapsulation) facilitates a rather easy incorporation of function materialization into (existing) object-oriented systems. Furthermore, the exploitation of encapsulation (information hiding) and object identity provides for additional performance tuning measures which drastically decrease the rematerialization overhead incurred by updates in the object base. The paper concludes with a quantitative analysis of function materialization based on a sample performance benchmark obtained from our experimental object base system GOM.

59 citations

Posted Content•

Hardware Optimizations of Dense Binary Hyperdimensional Computing: Rematerialization of Hypervectors, Binarized Bundling, and Combinational Associative Memory

Manuel Schmuck¹, Luca Benini², Abbas Rahimi¹•Institutions (2)

ETH Zurich¹, University of Bologna²

20 Jul 2018-arXiv: Emerging Technologies

TL;DR: In this paper, the authors propose hardware techniques for optimizations of hyperdimensional computing, in a synthesizable VHDL library, to enable co-located implementation of both learning and classification tasks on only a small portion of Xilinx(R) UltraScale(TM) FPGAs.

Abstract: Brain-inspired hyperdimensional (HD) computing models neural activity patterns of the very size of the brain's circuits with points of a hyperdimensional space, that is, with hypervectors. Hypervectors are $D$-dimensional (pseudo)random vectors with independent and identically distributed (i.i.d.) components constituting ultra-wide holographic words: $D = 10,000$ bits, for instance. At its very core, HD computing manipulates a set of seed hypervectors to build composite hypervectors representing objects of interest. It demands memory optimizations with simple operations for an efficient hardware realization. In this paper, we propose hardware techniques for optimizations of HD computing, in a synthesizable VHDL library, to enable co-located implementation of both learning and classification tasks on only a small portion of Xilinx(R) UltraScale(TM) FPGAs: (1) We propose simple logical operations to rematerialize the hypervectors on the fly rather than loading them from memory. These operations massively reduce the memory footprint by directly computing the composite hypervectors whose individual seed hypervectors do not need to be stored in memory. (2) Bundling a series of hypervectors over time requires a multibit counter per every hypervector component. We instead propose a binarized back-to-back bundling without requiring any counters. This truly enables on-chip learning with minimal resources as every hypervector component remains binary over the course of training to avoid otherwise multibit components. (3) For every classification event, an associative memory is in charge of finding the closest match between a set of learned hypervectors and a query hypervector by using a distance metric. This operator is proportional to [...]

45 citations

Year	Papers
2021	4
2020	3
2019	4
2018	1
2015	1
2014	1
