Staged parser combinators for efficient data processing
Manohar Jonnalagedda, Thierry Coppey, Sandro Stucki, Tiark Rompf, Martin Odersky
- 15 Oct 2014
- Vol. 49, Iss: 10, pp 637-653
TL;DR: Staging, a form of runtime code generation, is used to dissociate input parsing from parser composition and to eliminate, at staging time, the intermediate data structures and computations associated with parser composition.
Abstract: Parsers are ubiquitous in computing, and many applications depend on their performance for decoding data efficiently. Parser combinators are an intuitive tool for writing parsers: tight integration with the host language enables grammar specifications to be interleaved with processing of parse results. Unfortunately, parser combinators are typically slow due to the high overhead of the host language abstraction mechanisms that enable composition. We present a technique for eliminating such overhead. We use staging, a form of runtime code generation, to dissociate input parsing from parser composition, and eliminate intermediate data structures and computations associated with parser composition at staging time. A key challenge is to maintain support for input-dependent grammars, which have no clear stage distinction. Our approach applies to top-down recursive-descent parsers as well as bottom-up non-deterministic parsers with key applications in dynamic programming on sequences, where we auto-generate code for parallel hardware. We achieve performance comparable to specialized, hand-written parsers.
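The idea of separating parser composition from input parsing can be illustrated with a minimal sketch. The paper works in Scala with staging; the sketch below is only an analogy in Python, using source-string generation and `exec` as a stand-in for staging. All names here (`Gen`, `accept_if`, `rep_fold`, `seq`, `compile_parser`) are hypothetical and not taken from the paper. Each combinator runs once, at "staging time", and emits straight-line code; the compiled `parse` function contains no combinator calls or closures. Input-dependent grammars are not covered.

```python
import itertools

class Gen:
    """Collects generated source lines and hands out fresh variable names."""
    def __init__(self):
        self.lines, self.depth, self.ids = [], 1, itertools.count()

    def fresh(self, prefix):
        return f"{prefix}{next(self.ids)}"

    def emit(self, line):
        self.lines.append("    " * self.depth + line)

# A staged parser is a function (g, pos) -> (ok, res, nxt). It emits code that
# reads the input `s` at the variable named `pos` and returns the NAMES of the
# variables holding the success flag, the result, and the next position.

def accept_if(pred):
    """Accept one character; `pred` is a source template with a {c} hole."""
    def stage(g, pos):
        ok, res, nxt = g.fresh("ok"), g.fresh("res"), g.fresh("pos")
        cond = pred.format(c=f"s[{pos}]")
        g.emit(f"{ok} = {pos} < len(s) and {cond}")
        g.emit(f"{res} = s[{pos}] if {ok} else None")
        g.emit(f"{nxt} = {pos} + 1 if {ok} else {pos}")
        return ok, res, nxt
    return stage

def rep_fold(p, init, step):
    """Zero or more repetitions of p, folded into an accumulator (no list)."""
    def stage(g, pos):
        acc, cur, ok_out = g.fresh("acc"), g.fresh("pos"), g.fresh("ok")
        g.emit(f"{acc} = {init}")
        g.emit(f"{cur} = {pos}")
        g.emit("while True:")
        g.depth += 1
        ok, res, nxt = p(g, cur)          # body of p is inlined into the loop
        g.emit(f"if not {ok}: break")
        g.emit(f"{acc} = {step.format(acc=acc, x=res)}")
        g.emit(f"{cur} = {nxt}")
        g.depth -= 1
        g.emit(f"{ok_out} = True")        # repetition itself never fails
        return ok_out, acc, cur
    return stage

def seq(a, b):
    """Run a, then b; succeed with a pair of results only if both succeed."""
    def stage(g, pos):
        ok, res, nxt = g.fresh("ok"), g.fresh("res"), g.fresh("pos")
        g.emit(f"{ok}, {res}, {nxt} = False, None, {pos}")
        ok_a, res_a, pos_a = a(g, pos)
        g.emit(f"if {ok_a}:")
        g.depth += 1
        ok_b, res_b, pos_b = b(g, pos_a)
        g.emit(f"if {ok_b}:")
        g.depth += 1
        g.emit(f"{ok}, {res}, {nxt} = True, ({res_a}, {res_b}), {pos_b}")
        g.depth -= 2
        return ok, res, nxt
    return stage

def compile_parser(p):
    """Run the combinators once to generate source, then compile it."""
    g = Gen()
    g.emit("start = 0")
    ok, res, end = p(g, "start")
    g.emit(f"return ({res}, {end}) if {ok} else None")
    src = "def parse(s):\n" + "\n".join(g.lines)
    namespace = {}
    exec(src, namespace)
    return namespace["parse"], src

# Grammar composition happens here, before any input is seen.
digit = accept_if("{c}.isdigit()")
number = rep_fold(digit, "0", "{acc} * 10 + int({x})")
pair = seq(number, seq(accept_if("{c} == ','"), number))

parse, src = compile_parser(pair)
print(src)                # the generated, combinator-free parser
print(parse("12,345"))    # ((12, (',', 345)), 6)
```

Printing `src` shows plain loops and conditionals over `s`. This is the sense in which the composition overhead is paid once, when the code is generated, rather than on every parse.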
Figures

Figure 2. Interface for staged parsers 
Figure 8. A backtrace for an instance of the matrix multiplication problem 
Figure 3. Using structs for parse results, and splitting 
Figure 6. An implementation of Foreach 
Figure 7. Threads progress jointly along matrix diagonal. Dependencies can be reduced to immediately preceding matrix cells. 
Figure 12. Zuker algorithm running time
Citations
Staged generic programming
Jeremy Yallop
- 29 Aug 2017
TL;DR: Applying structured multi-stage programming techniques transforms Scrap Your Boilerplate from an inefficient library into a typed optimising code generator, bringing its performance in line with hand-written code, and so combining high-level programming with uncompromised performance.
Reconciling Abstraction with High Performance: A Metaocaml Approach
Oleg Kiselyov
- 04 Jun 2018
TL;DR: This hands-on tutorial will eventually implement a simple domain-specific language (DSL) for linear algebra, with layers of optimizations for sparsity and memory layout of matrices and vectors, and their algebraic properties, and get the taste of the "Abstraction without guilt".
Parsing for agile modeling
Oscar Nierstrasz, Jan Kurs
TL;DR: This paper proposes an approach to "agile modeling" that exploits island grammars to extract initial coarse-grained models, parser combinators to enable gradual refinement of model importers, and various heuristics to recognize language structure, keywords and other language artifacts.
Staging with class: a specification for typed template Haskell
- 12 Jan 2022
TL;DR: In this paper, staged type class constraints are proposed for multi-stage programming using code quotation. But this approach is not suitable for overloading and code overloading in other languages that support overloading.
Accelerating parser combinators with macros
Eric Beguet, Manohar Jonnalagedda
- 28 Jul 2014
TL;DR: This paper uses Scala macros to analyse the grammar specification at compile-time and remove composition, leaving behind an efficient top-down, recursive-descent parser, which outperforms Scala's standard parser combinators on a set of benchmarks by an order of magnitude, and is 2x faster than code generated by LMS.
References
Vienna RNA secondary structure server
TL;DR: The Vienna RNA secondary structure server provides a web interface to the most frequently used functions of the Vienna RNA software package for the analysis of RNA secondary structures.
ANTLR: a predicated-LL(k) parser generator
Terence Parr, Russell W. Quong
TL;DR: Introduces ANTLR, a public-domain parser generator and a component of PCCTS, which combines the flexibility of hand-coded parsing with the convenience of a parser generator.
Monads for Functional Programming
Philip Wadler
- 24 May 1995
TL;DR: Three case studies are looked at in detail: how monads ease the modification of a simple evaluator; how monads act as the basis of a datatype of arrays subject to in-place update; and how monads can be used to build parsers.
The Apache HTTP Server Project
Roy T. Fielding, Gail E. Kaiser
TL;DR: This collaborative software development effort has created a robust, feature-rich HTTP server software package that currently dominates the public Internet market and is more often attributed to performance than price.
Partial Evaluation of Computation Process—An Approach to a Compiler-Compiler
Yoshihiko Futamura
- 01 Dec 1999
TL;DR: A method to automatically generate an actual compiler from a formal description which is, in some sense, the partial evaluation of a computation process is described.