TL;DR: This article presents simpler proofs of Landi's results that it is impossible to compute statically precise alias information (either may-alias or must-alias) in languages with if statements, loops, dynamic storage, and recursive data structures.
Abstract: Alias analysis is a prerequisite for performing most of the common program analyses such as reaching-definitions analysis or live-variables analysis. Landi [1992] recently established that it is impossible to compute statically precise alias information—either may-alias or must-alias—in languages with if statements, loops, dynamic storage, and recursive data structures: more precisely, he showed that the may-alias relation is not recursive, while the must-alias relation is not even recursively enumerable. This article presents simpler proofs of the same results.
TL;DR: Several new techniques for static branch prediction and profiling are presented, one of which combines multiple predictions of a branch's outcome into a prediction of the probability that the branch is taken.
Abstract: Program profiles identify frequently executed portions of a program, which are the places at which optimizations offer programmers and compilers the greatest benefit. Compilers, however, infrequently exploit program profiles, because profiling a program requires a programmer to instrument and run the program. An attractive alternative is for the compiler to statically estimate program profiles. This paper presents several new techniques for static branch prediction and profiling. The first technique combines multiple predictions of a branch's outcome into a prediction of the probability that the branch is taken. Another technique uses these predictions to estimate the relative execution frequency (i.e., profile) of basic blocks and control-flow edges within a procedure. A third algorithm uses local frequency estimates to predict the global frequency of calls, procedure invocations, and basic block and control-flow edge executions. Experiments on the SPEC92 integer benchmarks and Unix applications show that the frequently executed blocks, edges, and functions identified by our techniques closely match those in a dynamic profile.
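The second technique in the abstract above, turning branch probabilities into relative block and edge frequencies, can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes an acyclic control-flow graph (loops are ignored), and the function name `block_frequencies` and the edge-probability input format are hypothetical choices made for illustration.

```python
from collections import defaultdict
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def block_frequencies(edge_prob, entry):
    """Propagate frequencies through an acyclic CFG.

    edge_prob maps (src, dst) to the estimated probability that control
    leaves block src along that edge. The entry block gets frequency 1.0.
    Raises graphlib.CycleError if the graph contains a loop.
    """
    succs = defaultdict(list)
    preds = defaultdict(set)
    nodes = {entry}
    for (src, dst), prob in edge_prob.items():
        succs[src].append((dst, prob))
        preds[dst].add(src)
        nodes.update((src, dst))

    # Visit each block only after all of its predecessors.
    order = TopologicalSorter({n: preds[n] for n in nodes}).static_order()

    block_freq = {n: 0.0 for n in nodes}
    block_freq[entry] = 1.0
    edge_freq = {}
    for node in order:
        for dst, prob in succs[node]:
            edge_freq[(node, dst)] = block_freq[node] * prob
            block_freq[dst] += edge_freq[(node, dst)]
    return block_freq, edge_freq


# An if-then-else diamond whose branch is predicted taken 75% of the time.
diamond = {
    ("entry", "then"): 0.75,
    ("entry", "else"): 0.25,
    ("then", "exit"): 1.0,
    ("else", "exit"): 1.0,
}
block_freq, edge_freq = block_frequencies(diamond, "entry")
```

Handling loops would require solving for frequencies around back edges, which this sketch deliberately leaves out.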
TL;DR: A new form of program documentation that is precise, systematic and readable; it comprises a set of displays supplemented by a lexicon and an index, and each display presents a program fragment in such a way that its correctness can be examined without looking at any other display.
Abstract: Describes a new form of program documentation that is precise, systematic and readable. This documentation comprises a set of displays supplemented by a lexicon and an index. Each display presents a program fragment in such a way that its correctness can be examined without looking at any other display. Each display has three parts: (1) the specification of the program presented in the display, (2) the program itself, and (3) the specifications of programs invoked by this program. The displays are intended to be used by software engineers as a reference document during inspection and maintenance. This paper also introduces a specification technique that is a refinement of H.D. Mills's (1975) functional approach to program documentation and verification; programs are specified and described in tabular form.
TL;DR: This paper presents a generalized algorithm that finds a path cover for a given program flowgraph, designed to cover all the unconstrained arcs of a given ddgraph, and can be employed to address the problem of infeasible paths.
Abstract: Branch testing a program involves generating a set of paths that will cover every arc in the program flowgraph, called a path cover, and finding a set of program inputs that will execute every path in the path cover. This paper presents a generalized algorithm that finds a path cover for a given program flowgraph. The analysis is conducted on a reduced flowgraph, called a ddgraph, and uses graph-theoretic principles differently than previous approaches. In particular, the relations of dominance and implication, which form two trees of the arcs of the ddgraph, are exploited. These relations make it possible to identify a subset of ddgraph arcs, called unconstrained arcs, having the property that a set of paths exercising all the unconstrained arcs also covers all the arcs in the ddgraph. In fact, the algorithm has been designed to cover all the unconstrained arcs of a given ddgraph: the paths are derived one at a time, each path covering at least one as yet uncovered unconstrained arc. The greatest merits of the algorithm are its simplicity and its flexibility. It consists simply of visiting the dominator and implied trees recursively and in combination, and is flexible in the sense that it can derive a path cover to satisfy different requirements, according to the strategy adopted for the selection of the unconstrained arc to be covered at each recursive iteration. This feature of the algorithm can be employed to address the problem of infeasible paths, by adopting the most suitable selection strategy for the problem at hand. Embedding of the algorithm into a software analysis and testing tool is recommended.
TL;DR: The authors use their link-time code modification system OM to perform program transformations related to global address use on the Alpha AXP, describe the optimizations performed, and show their effects on program size and performance.
Abstract: Compilers for new machines with 64-bit addresses must generate code that works when the memory used by the program is large. Procedures and global variables are accessed indirectly via global address tables, and calling conventions include code to establish the addressability of the appropriate tables. In the common case of a program that does not require a lot of memory, all of this can be simplified considerably, with a corresponding reduction in program size and execution time. We have used our link-time code modification system OM to perform program transformations related to global address use on the Alpha AXP. Though simple, many of these are whole-program optimizations that can be done only when we can see the entire program at once, so link-time is an ideal occasion to perform them. This paper describes the optimizations performed and shows their effects on program size and performance. Relatively modest transformations, possible without moving code, improve the performance of SPEC benchmarks by an average of 1.5%. More ambitious transformations, requiring an understanding of program structure that is thorough but not difficult at link-time, can do even better, reducing program size by 10% or more, and improving performance by an average of 3.8%. Even a program compiled monolithically with interprocedural optimization can benefit nearly as much from this technique, if it contains statically-linked pre-compiled library code. When the benchmark sources were compiled in this way, we were still able to improve their performance by 1.35% with the modest transformations and 3.4% with the ambitious transformations.
TL;DR: The conclusion is that goto statements can be accommodated in generating executable static slices, even for programs with arbitrary goto statements.
Abstract: A static program slice is an extract of a program which can help our understanding of the behavior of the program; it has been proposed for use in debugging, optimization, parallelization, and integration of programs. This article considers two types of static slices: executable and nonexecutable. Efficient and well-founded methods have been developed to construct executable slices for programs without goto statements; it would be tempting to assume these methods would apply as well in programs with arbitrary goto statements. We show why previous methods do not work in this more general setting, and describe our solutions that correctly and efficiently compute executable slices for programs even with arbitrary goto statements. Our conclusion is that goto statements can be accommodated in generating executable static slices.
TL;DR: This paper provides a brief introduction to set constraints: what set constraints are, why they are interesting, the current state of the art, open problems, applications and implementations.
Abstract: Set constraints are a natural formalism for many problems that arise in program analysis. This paper provides a brief introduction to set constraints: what set constraints are, why they are interesting, the current state of the art, open problems, applications and implementations.
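To give a feel for what solving set constraints involves, here is a minimal sketch restricted to the simplest fragment: inclusions of the form "constant set ⊆ variable" and "variable ⊆ variable". Full set-constraint languages also allow constructors, intersection, complement, and projections, none of which are handled here. The function name `solve` and the constraint encoding are assumptions made for this illustration.

```python
def solve(constraints, variables):
    """Compute the least solution of simple inclusion constraints.

    Each constraint is a pair (lhs, rhs) meaning lhs ⊆ rhs, where rhs is a
    variable name (str) and lhs is either a variable name or a frozenset of
    atoms. Iterates until no variable grows any further.
    """
    solution = {v: set() for v in variables}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in constraints:
            lower = solution[lhs] if isinstance(lhs, str) else lhs
            if not lower <= solution[rhs]:
                solution[rhs] |= lower
                changed = True
    return solution


# {1} ⊆ X, X ⊆ Y, {2} ⊆ Y, Y ⊆ Z
constraints = [
    (frozenset({1}), "X"),
    ("X", "Y"),
    (frozenset({2}), "Y"),
    ("Y", "Z"),
]
least = solve(constraints, ["X", "Y", "Z"])
```

The loop terminates because every variable can only grow and is bounded by the finite set of atoms mentioned in the constraints.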
TL;DR: A number of analyses are developed, based on abstract interpretation, which succeed if a program is definitely suspension free, and it is proven that for these analyses it suffices to consider only one scheduling policy, allowing for efficient implementation.
Abstract: Concurrent logic languages specify reactive systems which consist of collections of communicating processes. The presence of unintended suspended computations is a common programming error which is difficult to detect using standard debugging and testing techniques. We develop a number of analyses, based on abstract interpretation, which succeed if a program is definitely suspension free. If an analysis fails, the program may, or may not, be suspension free. Examples demonstrate that the analyses are practically useful. They are conceptually simple and easy to justify because they are based directly on the transition system semantics of concurrent logic programs. A naive analysis must consider all scheduling policies. However, it is proven that for our analyses it suffices to consider only one scheduling policy, allowing for efficient implementation.
TL;DR: The original construction modeling computed answer substitutions, its compositional version, and various semantics modeling more concrete observables are discussed, and it is shown how the approach can be applied to several extensions of positive logic programs.
Abstract: This paper is a general overview of an approach to the semantics of logic programs whose aim is to find notions of models which really capture the operational semantics, and are, therefore, useful for defining program equivalences and for semantics-based program analysis. The approach leads to the introduction of extended interpretations which are more expressive than Herbrand interpretations. The semantics in terms of extended interpretations can be obtained as a result of both an operational (top-down) and a fixpoint (bottom-up) construction. It can also be characterized from the model-theoretic viewpoint, by defining a set of extended models which contains standard Herbrand models. We discuss the original construction modeling computed answer substitutions, its compositional version, and various semantics modeling more concrete observables. We then show how the approach can be applied to several extensions of positive logic programs. We finally consider some applications, mainly in the area of semantics-based program transformation and analysis.
TL;DR: The calculus of set constraints was presented, and its history of basic results and applications briefly described, and it was argued that set-based analysis can provide accurate and efficient program analysis.
Abstract: The calculus of set constraints was presented, and its history of basic results and applications briefly described. The approach of set-based analysis was then presented in an informal style, with a focus on the breadth of applicability of the technique. The relationship between set constraints and set-based analysis is roughly that the approximation of a program by ignoring inter-variable dependencies can be captured by set constraints. It was then argued that set-based analysis can provide accurate and efficient program analysis.
TL;DR: An interface for program slicing is presented that allows slicing at the statement, procedure, or file level and provides fast visual feedback on slice structure. Integral to the interface is a global visualization of the program that shows the extent of a slice as it crosses procedure and file boundaries and facilitates quick browsing of numerous slices.
Abstract: Program slicing is an automatic technique for determining which code in a program is relevant to a particular computation. Slicing has been applied in many areas, including program understanding, debugging, and maintenance. However, little attention has been paid to suitable interfaces for exploring program slices. We present an interface for program slicing that allows slicing at the statement, procedure, or file level, and provides fast visual feedback on slice structure. Integral to the interface is a global visualization of the program that shows the extent of a slice as it crosses procedure and file boundaries, and facilitates quick browsing of numerous slices.
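As a rough illustration of the slicing idea described above (finding the code relevant to a particular computation), a backward slice can be sketched as reverse reachability over a dependence graph. The graph below is written by hand for a small hypothetical program; building it automatically from source, and the interface itself, are outside the scope of this sketch. The name `backward_slice` is an assumption.

```python
def backward_slice(depends_on, criterion):
    """Return every statement that the criterion transitively depends on.

    depends_on maps a statement id to the ids it is data- or
    control-dependent on. The criterion itself is part of the slice.
    """
    in_slice = {criterion}
    worklist = [criterion]
    while worklist:
        stmt = worklist.pop()
        for dep in depends_on.get(stmt, ()):
            if dep not in in_slice:
                in_slice.add(dep)
                worklist.append(dep)
    return in_slice


# Hypothetical program (statement ids on the left):
#  1: n = read()      6:     s = s + i
#  2: i = 1           7:     p = p * i
#  3: s = 0           8:     i = i + 1
#  4: p = 1           9: print(s)
#  5: while i <= n:  10: print(p)
depends_on = {
    5: {1, 2, 8},
    6: {2, 3, 5, 6, 8},
    7: {2, 4, 5, 7, 8},
    8: {2, 5, 8},
    9: {3, 6},
    10: {4, 7},
}
slice_for_sum = backward_slice(depends_on, 9)
```

Slicing on statement 9 keeps the statements that compute `s` and drops those that only contribute to `p` (statements 4, 7, and 10).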
TL;DR: The results show how to engineer a compiler such that its optimization phase takes time proportional to the benefit, rather than the size, of the analysis information it consumes.
Abstract: Recent advances in languages, software design methodologies, and architecture have prompted the development of improved compile-time methods for analyzing the effects of procedure calls, pointer references, and array accesses. Such sophistication, however, generally implies that compilers and programming environments will experience a corresponding increase in the volume of analysis information, which may be difficult to use efficiently. In this paper, we consider the practical accommodation of such information. Our results show how to engineer a compiler such that its optimization phase takes time proportional to the benefit, rather than the size, of such information.
TL;DR: The Object Finder incorporates two complementary approaches, one being top-down and the other bottom-up, to assist the software engineer in understanding the structure of a program based on the object-like features found in the program.
Abstract: The maintenance or re-engineering of a program usually begins with considerable effort in understanding the program structure. In this paper, an interactive tool for understanding the structure of non-object-oriented programs, known as the Object Finder, is presented. The structure of a program is defined in terms of the groupings of routines and data into modules within the program and the hierarchical relationships among the modules. The Object Finder incorporates two complementary approaches, one being top-down and the other bottom-up, to assist the software engineer in understanding the structure of a program based on the object-like features found in the program. In the top-down approach, the user obtains an overall understanding of the entire program as a collection of hierarchically organized objects. In the bottom-up approach, the user obtains a view of the information which is closely related to the components under examination.
In both approaches, two methods are used to identify the objects in a program written in a non-object-oriented programming language. These methods identify sets of objects based on the data bindings and on the type bindings found in a program. An object is considered as a collection of routines, types, and/or data items of the program. The Object Finder combines the top-down and bottom-up approaches while using human input to guide the object identification process. Two examples of using this Object Finder are also presented to demonstrate the capabilities of the approaches to assist the software engineer in attaining a high-level understanding of the program structure.
TL;DR: The realization of parallel language systems that offer high-level programming paradigms to reduce the complexity of application development, scalable runtime mechanisms to support variable size problem sets, and portable compiler platforms to provide access to multiple parallel architectures places additional demands on the tools for program development and analysis.
Abstract: The realization of parallel language systems that offer high-level programming paradigms to reduce the complexity of application development, scalable runtime mechanisms to support variable size problem sets, and portable compiler platforms to provide access to multiple parallel architectures places additional demands on the tools for program development and analysis. The need for integration of these tools into a comprehensive programming environment is even more pronounced and will require more sophisticated use of the language system technology (i.e., compiler and runtime system). Furthermore, the environment requirements of high-level support for the programmer, large-scale applications, and portable access to diverse machines also apply to the program analysis tools.
TL;DR: This work proposes a new approach for program understanding that is data-centered: it first focuses on data and data relationships.
Abstract: Software maintainers use a variety of techniques and representations for understanding programs. Most of these representations first focus on the control structure of a program such as call graphs, control flow graphs and paths. We propose a new approach for program understanding that is data-centered: it first focuses on data and data relationships. We have experimented on both small and large Cobol programs from industry to determine if our methods are useful for program understanding and software maintenance. We have developed DPUTE (Data-centered Program Understanding Tool Environment) that is currently being evaluated and enhanced by our industrial partners.
TL;DR: It is argued that, in practice, a direct data flow analysis that relies on some amount of duplication would be as satisfactory as a CPS analysis.
Abstract: The widespread use of the continuation-passing style (CPS) transformation in compilers, optimizers, abstract interpreters, and partial evaluators reflects a common belief that the transformation has a positive effect on the analysis of programs. Investigations by Nielson [13] and Burn/Filho [5,6] support, to some degree, this belief with theoretical results. However, they do not pinpoint the source of increased abstract information and do not explain the observation of many people that continuation-passing confuses some conventional data flow analyses. To study the impact of the CPS transformation on program analysis, we derive three canonical data flow analyzers for the core of an applicative higher-order programming language. The first analyzer is based on a direct semantics of the language, the second on a continuation-semantics of the language, and the last on the direct semantics of CPS terms. All analyzers compute the control flow graph of the source program and hence our results apply to a large class of data flow analyses. A comparison of the information gathered by our analyzers establishes the following points: 1. The results of a direct analysis of a source program are incomparable to the results of an analysis of the equivalent CPS program. In other words, the translation of the source program to a CPS version may increase or decrease static information. The gain of information occurs in non-distributive analyses and is solely due to the duplication of the analysis of the continuation. The loss of information is due to the confusion of distinct procedure returns. 2. The analyzer based on the continuation semantics produces more accurate results than both direct analyzers, but again only in non-distributive analyses due to the duplication of continuations along every execution path.
However, when the analyzer explicitly accounts for looping constructs, the results of the semantic-CPS analysis are no longer computable. In view of these results, we argue that, in practice, a direct data flow analysis that relies on some amount of duplication would be as satisfactory as a CPS analysis.
TL;DR: In this article, the authors present a system and process for making use of pre-existing data-structures which represent a computer program, in a way which has the advantages of shortening the time and cost required to create a new version of the computer program.
Abstract: The present invention provides a system and process for making use of pre-existing data-structures which represent a computer program, in a way which has the advantages of shortening the time and cost required to create a new version of the computer program. The pre-existing data-structure is modified to produce a shadow data-structure which contains only shadows of those elements or nodes of the pre-existing data-structure required to perform the tasks of the new version of the computer program. The present invention includes processes to make the data-structure of the original program shadowable; processes to use data from the original program compilation process in compiling the new version of the program, including processes to create a shadow data-structure; and processes to use the new version of the computer program along with the shadow data-structure to create the desired execution. This new version of the computer program is typically a tool for checking or observing the original program's execution in some manner. Moreover, the system and processes disclosed provide mechanisms for a software manufacturer to create type-safe versions of a connected collection of objects which are dynamically extensible.
TL;DR: This paper presents a powerful description language for expressing the aliasing properties of dynamic data structures, and demonstrates how such descriptions provide the compiler with better information during alias analysis, and require only minimal effort from the programmer.
Abstract: High-performance architectures rely upon powerful optimizing and parallelizing compilers to maximize performance. Such compilers need accurate program analysis to enable their performance-enhancing transformations. In the domain of program analysis for parallelization, pointer analysis is a difficult and increasingly common problem. When faced with dynamic, pointer-based data structures, existing solutions are either too limited in the types of data structures they can analyze, or require too much effort on the part of the programmer. In this paper we present a powerful description language for expressing the aliasing properties of dynamic data structures. Such descriptions provide the compiler with better information during alias analysis, and require only minimal effort from the programmer. Ultimately, this enables a more accurate program analysis, and an increased application of performance-enhancing transformations.
TL;DR: A detailed look at a larger example of program analysis by transformation, carried out in the WSL language, a «wide spectrum language» which includes both low-level program operations and high level specifications, and which has been specifically designed to be easy to transform.
Abstract: In this paper we will take a detailed look at a larger example of program analysis by transformation. We will be considering Algorithm 2.3.3.A from Knuth's «Fundamental Algorithms» (Knuth 1968, p. 357), which is an algorithm for the addition of polynomials represented using four-directional links. Knuth (1974) describes this as having «a complicated structure with excessively unrestrained goto statements» and goes on to say «I hope someday to see the algorithm cleaned up without loss of its efficiency». Our aim is to manipulate the program, using semantics-preserving operations, into an equivalent, high-level specification. The transformations are carried out in the WSL language, a «wide spectrum language» which includes both low-level program operations and high level specifications, and which has been specifically designed to be easy to transform.
TL;DR: The aim is to derive run-time properties that can be used at compile time to specialize the target code for a program according to a given set of queries and to automatically introduce destructive assignments in a safe and transparent way so that fewer garbage cells are created.
Abstract: For the class of applicative programming languages, efficient methods for reclaiming the memory occupied by released data structures constitute an important aspect of current implementations. The present article addresses the problem of memory reuse for logic programs through program analysis rather than by run-time garbage collection. The aim is to derive run-time properties that can be used at compile time to specialize the target code for a program according to a given set of queries and to automatically introduce destructive assignments in a safe and transparent way so that fewer garbage cells are created. The dataflow analysis is constructed as an application of abstract interpretation for logic programs. An abstract domain for describing structure-sharing and liveness properties is developed as are primitive operations that guarantee a sound and terminating global analysis. We explain our motivation for the design of the abstract domain, make explicit the underlying implementation assumptions, and discuss the precision of the results obtained by a prototype analyzer.
TL;DR: A system based on the notion of a flow graph is used to specify formally and to implement a compiler for a lazy functional language, which provides a single, unified, efficient, formal framework for all the analysis and synthesis phases, including the generation of C.
Abstract: A system based on the notion of a flow graph is used to specify formally and to implement a compiler for a lazy functional language. The compiler takes a simple functional language as input and generates C. The generated C program can then be compiled, and loaded with an extensive run-time system to provide the facility to experiment with different analysis techniques. The compiler provides a single, unified, efficient, formal framework for all the analysis and synthesis phases, including the generation of C. Many of the standard techniques, such as strictness and boxing analyses, have been included.
TL;DR: Global program analyses of untyped higher-order functional programs have in the past decade been presented by Ayers, Bondorf, Consel, Jones, Sestoft, Shivers, and others, and all contain a global closure analysis that computes information about higher-order control-flow.
Abstract: Global program analyses of untyped higher-order functional programs have in the past decade been presented by Ayers, Bondorf, Consel, Jones, Sestoft, Shivers, and others. The analyses are usually defined as abstract interpretations and are used for rather different tasks such as type recovery, globalization, and binding-time analysis. The analyses all contain a global closure analysis that computes information about higher-order control-flow. Sestoft proved in 1989 and 1991 that closure analysis is correct with respect to call-by-name and call-by-value semantics, but it remained open if correctness holds for arbitrary beta-reduction.
TL;DR: The paper reports on a project at Caltech which is exploring the question: can a library of parallel program archetypes be used to reduce the effort required to produce correct efficient programs.
Abstract: A program archetype is a program design strategy appropriate for a restricted class of problems, and a collection of program designs with implementations of exemplar problems in one or more programming languages and optimized for a collection of target machines. The program design strategy includes: archetype-specific information about methods of deriving a program from a specification; methods of parallelizing sequential programs; the program structure; methods of reasoning about correctness and performance; empirical data on performance measurements and tuning for different kinds of machines; and suggestions for test suites. The paper reports on a project at Caltech which is exploring the question: can a library of parallel program archetypes be used to reduce the effort required to produce correct efficient programs?
TL;DR: This paper presents an instrumentation method for efficiently counting events in a program's execution, with support for on-line queries of the event count, and guarantees that accurate event counts can be obtained efficiently at every point in the execution.
Abstract: The ability to count events in a program's execution is required by many program analysis applications. We present an instrumentation method for efficiently counting events in a program's execution, with support for on-line queries of the event count. Event counting differs from basic block profiling in that an aggregate count of events is kept rather than a set of counters. Due to this difference, solutions to basic block profiling are not well suited to event counting. Our algorithm finds a subset of points in a program to instrument, while guaranteeing that accurate event counts can be obtained efficiently at every point in the execution.
TL;DR: An optimized general-purpose algorithm for polyvariant, static analyses of higher-order applicative programs, which is parameterized over both the abstract domain and degree of polyvariance.
Abstract: This paper presents an optimized general-purpose algorithm for polyvariant, static analyses of higher-order applicative programs. A polyvariant analysis is a very accurate form of analysis that produces many more abstract descriptions for a program than does a conventional analysis. It may also compute intermediate abstract descriptions that are irrelevant to the final result of the analysis. The optimized algorithm addresses this overhead while preserving the accuracy of the analysis. The algorithm is also parameterized over both the abstract domain and degree of polyvariance. We have implemented an instance of our algorithm and evaluated its performance compared to the unoptimized algorithm. Our implementation runs significantly faster on average than the other algorithm for benchmarks reported here.
TL;DR: The problems of incomplete or unsound informal analysis are analyzed, the relationship of QDA to other analysis methods is discussed, and suggested improvements to the QDA method are described.
Abstract: Formal verification of program properties may be infeasible or impractical, and informal analysis may be sufficient. Informal analysis involves the informal acceptance, by inspection, of the validity of program properties or steps in an analysis. Informal analysis may also involve abstraction. Abstraction can be used to eliminate details and concentrate on more general properties. Abstraction will result in informal analysis if it includes the use of undefined properties. A systematic, informal method for analysis called QDA (Quick Defect Analysis) is described. QDA is a comments analysis process based on facts and hypotheses. Facts are used to create an abstract program model, and hypotheses are selected, nonobvious program properties which are identified as needing verification. Hypotheses are proved from the facts that define an abstraction. QDA is hypothesis-driven in the sense that only those parts of an abstraction that are needed to prove hypotheses are created. The QDA approach was applied to a previously well tested operational flight program (OFP). The QDA method and the results of the OFP experiment are presented. The problems of incomplete or unsound informal analysis are analyzed, the relationship of QDA to other analysis methods is discussed, and suggested improvements to the QDA method are described.
TL;DR: Describes Cachier, a tool that automatically inserts CICO annotations into shared-memory programs using both dynamic information, obtained from a program execution trace, and static information, obtained from program analysis.
Abstract: Shared memory in a parallel computer provides programmers with the valuable abstraction of a shared address space, through which any part of a computation can access any datum. Although uniform access simplifies programming, it also hides communication, which can lead to inefficient programs. The check-in, check-out (CICO) performance model for cache-coherent, shared-memory parallel computers helps a programmer identify the communication underlying memory references and account for its cost. CICO consists of annotations that a programmer can use to elucidate communication and a model that attributes costs to these annotations. The annotations can also serve as directives to a memory system to improve program performance. Inserting CICO annotations requires reasoning about the dynamic cache behavior of a program, which is not always easy. This paper describes Cachier, a tool that automatically inserts CICO annotations into shared-memory programs. A novel feature of this tool is its use of both dynamic information, obtained from a program execution trace, and static information, obtained from program analysis. We measured several benchmarks annotated by Cachier by running them on a simulation of the Dir1SW cache coherence protocol [10], which supports these directives. The results show that programs annotated by Cachier perform significantly better than both programs without CICO annotations and programs that were annotated by hand.
TL;DR: This paper defines a special class of graph rewrite systems for program analysis, edge addition rewrite systems (Ears), and shows that Ears are very well suited for generating efficient program analyzers.
Abstract: In this paper we define a special class of graph rewrite systems for program analysis: edge addition rewrite systems (Ears). Ears can be applied to distributive data-flow frameworks over finite lattices [Hec77] [RSH94], as well as many other program analysis problems. We also present some techniques for optimized evaluation of Ears. These techniques show that Ears are very well suited for generating efficient program analyzers.
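As a rough illustration of the edge-addition idea, the following Python sketch saturates a labeled graph by repeatedly adding edges until no rule produces anything new. The two rules, the labels `flow` and `reach`, and the function name `saturate` are assumptions chosen for this example; they are not the formalism or the optimized evaluation techniques of the paper.

```python
from typing import Set, Tuple

# Hypothetical edge representation: (source node, label, target node).
Edge = Tuple[str, str, str]


def saturate(edges: Set[Edge]) -> Set[Edge]:
    """Add edges until a fixpoint is reached, using two assumed rules:

    1. a -flow-> b                    adds  a -reach-> b
    2. a -reach-> b and b -flow-> c   adds  a -reach-> c

    Rules only ever add edges, and the set of possible edges over a
    finite node set is finite, so the loop terminates.
    """
    result = set(edges)
    changed = True
    while changed:
        changed = False
        new_edges: Set[Edge] = set()
        for (a, label, b) in result:
            if label == "flow":
                new_edges.add((a, "reach", b))
            elif label == "reach":
                for (b2, label2, c) in result:
                    if b2 == b and label2 == "flow":
                        new_edges.add((a, "reach", c))
        if not new_edges <= result:
            result |= new_edges
            changed = True
    return result


def demo() -> Set[Edge]:
    # A three-node straight-line control-flow graph.
    return saturate({("n1", "flow", "n2"), ("n2", "flow", "n3")})
```

Calling `demo()` on the three-node chain adds the `reach` edges implied by the two rules, including the transitive edge from `n1` to `n3`.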
TL;DR: This article proposes lazy program generation, a strategy that generates only those parts of the program that are indispensable for processing the particular data at hand.
Abstract: Current program generators usually operate in a greedy manner in the sense that a program must be generated in its entirety before it can be used. If generation time is scarce, or if the input to the generator is subject to modification, it may be better to be more cautious and to generate only those parts of the program that are indispensable for processing the particular data at hand. We call this lazy program generation. Another, closely related strategy is incremental program generation. When its input is modified, an incremental generator will try to make a corresponding modification in its output rather than generate a completely new program. It may be advantageous to use a combination of both strategies in program generators that have to operate in a highly dynamic and/or interactive environment.
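The two strategies can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the generator described in the article: the class `LazyGenerator`, its methods, and the idea of a specification that maps part names to expression strings are all hypothetical.

```python
from typing import Callable, Dict


class LazyGenerator:
    """Generates program parts on first use and regenerates only what changed."""

    def __init__(self, spec: Dict[str, str]) -> None:
        # Hypothetical input: part name -> expression over a single variable x.
        self.spec = dict(spec)
        self.generated: Dict[str, Callable[[int], int]] = {}

    def _generate(self, name: str) -> Callable[[int], int]:
        # "Generation" here means building source text and evaluating it.
        # eval is acceptable only because the spec is trusted in this sketch.
        source = f"lambda x: {self.spec[name]}"
        return eval(source)

    def get(self, name: str) -> Callable[[int], int]:
        # Lazy: a part is generated only when it is actually needed.
        if name not in self.generated:
            self.generated[name] = self._generate(name)
        return self.generated[name]

    def update(self, name: str, expr: str) -> None:
        # Incremental: a modified input invalidates only the affected part.
        self.spec[name] = expr
        self.generated.pop(name, None)


def demo() -> LazyGenerator:
    gen = LazyGenerator({"double": "x * 2", "square": "x * x"})
    gen.get("double")  # only "double" is generated; "square" is never built
    return gen
```

After `demo()`, only the `double` part exists in `gen.generated`. A later `update("double", ...)` discards that one part, and the next `get` regenerates it while leaving everything else untouched.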
TL;DR: A specific method, reachability, is presented as an example of how to accomplish context projection, and experimental results using reachability show very convincing speedups that demonstrate the practical significance of context projection.
Abstract: A chief source of inefficiency in program analysis using abstract interpretation comes from the fact that a large context (i.e., problem state) is propagated from node to node during the course of an analysis. This problem can be addressed and largely alleviated by a technique we call context projection, which projects an input context for a node to the portion that is actually relevant and determines whether the node should be reevaluated based on the projected context. This technique reduces the cost of an evaluation and eliminates unnecessary evaluations. Therefore, the efficiency of computing fixpoints over general lattices is greatly improved. A specific method, reachability, is presented as an example to accomplish context projection. Experimental results using reachability show very convincing speedups (more than eight for larger programs) that demonstrate the practical significance of context projection.
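The general idea of projecting a context and skipping unnecessary reevaluations can be sketched as follows. This is an illustrative Python sketch only: the `Node` class, the `project` helper, and the representation of a context as a dictionary of integers are assumptions, and the sketch does not implement the reachability method from the paper.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

# Hypothetical abstract state: variable name -> abstract value.
Context = Dict[str, int]
Projected = Tuple[Tuple[str, int], ...]


def project(ctx: Context, relevant: Iterable[str]) -> Projected:
    """Keep only the portion of the context that the node actually uses."""
    wanted = frozenset(relevant)
    return tuple(sorted((k, v) for k, v in ctx.items() if k in wanted))


class Node:
    def __init__(self, relevant: Iterable[str],
                 transfer: Callable[[Context], Context]) -> None:
        self.relevant = frozenset(relevant)
        self.transfer = transfer
        self.last_seen: Optional[Projected] = None
        self.evaluations = 0

    def visit(self, ctx: Context) -> Optional[Context]:
        """Reevaluate only if the projected context differs from the last one."""
        projected = project(ctx, self.relevant)
        if projected == self.last_seen:
            return None  # nothing relevant changed, so skip the evaluation
        self.last_seen = projected
        self.evaluations += 1
        # The transfer function sees the small projected context only.
        return self.transfer(dict(projected))


def demo() -> Tuple[Node, list]:
    node = Node({"x"}, lambda c: {"z": c["x"] + 1})
    results = [
        node.visit({"x": 1, "y": 1}),  # first visit: evaluated
        node.visit({"x": 1, "y": 2}),  # only y changed: skipped
        node.visit({"x": 2, "y": 2}),  # x changed: evaluated again
    ]
    return node, results
```

In `demo()`, the node is visited three times but evaluated only twice, because the second context differs from the first only in a variable the node does not read.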