Showing papers presented at "Database Programming Languages in 2019"
Proceedings Article•10.1145/3315507.3330199•
Arc: an IR for batch and stream programming

Lars Kroll¹, Klas Segeljakt¹, Paris Carbone, Christian Schulte¹, Seif Haridi¹
Royal Institute of Technology¹
23 Jun 2019
TL;DR: This work proposes Arc as the first unified Intermediate Representation (IR) for data analytics that incorporates stream semantics based on a modern specification of streams, windows and stream aggregation, to combine batch and stream computation models.
Abstract: In big data analytics, there is currently a large number of data programming models and their respective frontends such as relational tables, graphs, tensors, and streams. This has led to a plethora of runtimes that typically focus on the efficient execution of just a single frontend. This fragmentation manifests itself today in highly complex pipelines that bundle multiple runtimes to support the necessary models. Hence, joint optimization and execution of such pipelines across these frontend-bound runtimes is infeasible. We propose Arc as the first unified Intermediate Representation (IR) for data analytics that incorporates stream semantics based on a modern specification of streams, windows and stream aggregation, to combine batch and stream computation models. Arc extends Weld, an IR for batch computation, and adds support for partitioned, out-of-order stream and window operators, which are the most fundamental building blocks in contemporary data streaming.

9 citations
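As a rough illustration of the window and stream aggregation concepts named in the abstract above, the following Python sketch computes a keyed tumbling-window sum over out-of-order events. It is not Arc or Weld code; the function name `tumbling_window_sum` and the `(timestamp, key, value)` event layout are assumptions made for this example.

```python
from collections import defaultdict


def tumbling_window_sum(events, window_size):
    """Sum values per (key, window_start) for (timestamp, key, value) events.

    Events may arrive in any order: each one is assigned to a window by its
    own timestamp, not by its position in the input.
    """
    sums = defaultdict(int)
    for timestamp, key, value in events:
        # Start of the fixed-size, non-overlapping window containing the event.
        window_start = (timestamp // window_size) * window_size
        sums[(key, window_start)] += value
    return dict(sums)


# Hypothetical input: timestamps are deliberately not sorted.
events = [(1, "a", 10), (12, "a", 5), (3, "b", 7), (9, "a", 1), (11, "b", 2)]
result = tumbling_window_sum(events, window_size=10)
```

Partitioning by key and assigning windows by event timestamp are the two ideas shown here; a real streaming runtime would additionally need a policy for when a window is considered complete.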

Proceedings Article•10.1145/3315507.3330198•
Language-integrated provenance by trace analysis

Stefan Fehrenbach¹, James Cheney¹
University of Edinburgh¹
23 Jun 2019
TL;DR: This paper proposes a self-tracing transformation and trace analysis features that, together with existing techniques for type-directed generic programming, make it possible to define different forms of provenance as user code. The design is presented as TLinks, an extension to a core language for Links.
Abstract: Language-integrated provenance builds on language-integrated query techniques to make provenance information explaining query results readily available to programmers. In previous work we have explored language-integrated approaches to provenance. However, implementing a new form of provenance in a language-integrated way is still a major challenge. We propose a self-tracing transformation and trace analysis features that, together with existing techniques for type-directed generic programming, make it possible to define different forms of provenance as user code. We present our design as an extension to a core language for Links called TLinks, give examples showing its capabilities, and outline its metatheory and key correctness properties.

8 citations

Proceedings Article•10.1145/3315507.3330200•
Towards compiling graph queries in relational engines

Ruby Y. Tahboub¹, Xilun Wu¹, Grégory M. Essertel¹, Tiark Rompf¹
Purdue University¹
23 Jun 2019
TL;DR: This paper extends the LB2 main-memory query compiler with graph adjacency structures and operators and implements a subset of the Datalog logical query language evaluation to enable processing graph and recursive queries efficiently.
Abstract: The increasing demand for graph query processing has prompted the addition of support for graph workloads on top of standard relational database management systems (RDBMS). Although this appears to be a good idea (after all, graphs are just relations), performance is typically suboptimal since graph workloads are naturally iterative and rely extensively on efficient traversal of adjacency structures that are not typically implemented in an RDBMS. Adding such specialized adjacency structures is not at all straightforward due to the complexity of typical RDBMS implementations. The iterative nature of graph queries also practically requires a form of runtime compilation and native code generation, which adds another dimension of complexity to the RDBMS implementation and any potential extensions. In this paper, we demonstrate how the idea of the first Futamura projection, which links interpreted query engines and compilers through specialization, can be applied to compile graph workloads in an efficient way that simplifies the construction of relational engines which also support graph workloads. We extend the LB2 main-memory query compiler with graph adjacency structures and operators. We implement a subset of the Datalog logical query language evaluation to enable processing graph and recursive queries efficiently. The graph extension matches, and sometimes outperforms, best-of-breed low-level graph engines.

6 citations
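To make the phrase "graph and recursive queries" in the abstract above more concrete, here is a small Python sketch of semi-naive evaluation for the classic Datalog reachability program over an adjacency structure. This is an interpreted illustration only, not the LB2 compiler or its generated code; the function name `transitive_closure` and the edge-list input are assumptions for the example.

```python
def transitive_closure(edges):
    """Semi-naive evaluation of the Datalog program

        path(x, y) :- edge(x, y).
        path(x, z) :- path(x, y), edge(y, z).
    """
    # Adjacency structure: source node -> set of direct successors.
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, set()).add(dst)

    paths = set(edges)
    delta = set(edges)  # facts discovered in the previous round
    while delta:
        # Extend only the newly discovered paths by one edge.
        derived = {(x, z) for x, y in delta for z in adjacency.get(y, ())}
        delta = derived - paths
        paths |= delta
    return paths


reachable = transitive_closure([(1, 2), (2, 3), (3, 4)])
```

Each round joins only the newly derived facts against the adjacency structure, which is why the iteration stops as soon as a round produces nothing new.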

Proceedings Article•10.1145/3315507.3330197•
Fluid data structures

Darshana Balakrishnan¹, Lukasz Ziarek¹, Oliver Kennedy¹
University at Buffalo¹
23 Jun 2019
TL;DR: This paper proposes Fluid data structures, an approach to data structure design that allows limited physical changes that preserve logical equivalence, and designs a lazy-loading map that is a Fluid Cog, a lock-free data structure that incrementally organizes itself in the background by applying equivalence-preserving structural transformations.
Abstract: Functional (aka immutable) data structures are used extensively in data management systems. From distributed systems to data persistence, immutability makes complex programs significantly easier to reason about and implement. However, immutability also makes many runtime optimizations like tree rebalancing, or adaptive organizations, unreasonably expensive. In this paper, we propose Fluid data structures, an approach to data structure design that allows limited physical changes that preserve logical equivalence. As we will show, this approach retains many of the desirable properties of functional data structures, while also allowing runtime adaptation. To illustrate Fluid data structures, we work through the design of a lazy-loading map that we call a Fluid Cog. A Fluid Cog is a lock-free data structure that incrementally organizes itself in the background by applying equivalence-preserving structural transformations. Our experimental analysis shows that the resulting map structure is flexible enough to adapt to a variety of performance goals, while remaining competitive with existing structures like the C++ standard template library map.

4 citations

Proceedings Article•10.1145/3315507.3330196•
On the semantics of Cypher's implicit group-by

Filip Murlak¹, Jan Posiadała, Paweł Susicki
University of Warsaw¹
23 Jun 2019
TL;DR: This paper stems from Cypher.PL, a project aimed at creating an executable (and readable) semantics of Cypher in Prolog, and focuses on Cypher's implicit group-by feature.
Abstract: Cypher is a popular declarative query language for property graphs. Despite having been adopted by several graph database vendors, it lacks a comprehensive semantics other than the reference implementation. This paper stems from Cypher.PL, a project aimed at creating an executable (and readable) semantics of Cypher in Prolog, and focuses on Cypher's implicit group-by feature. Rather than being explicitly specified in the query, in Cypher the grouping key is derived from the return expressions. We show how this becomes problematic when a single return expression mixes unaggregated property references and aggregating functions, and discuss ways of giving this construct a proper semantics without defying common sense.

1 citation
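The implicit grouping described in the abstract above can be illustrated with a short Python sketch that emulates a query of the shape `RETURN <key expressions>, count(*)`: the non-aggregated return expressions become the grouping key without any explicit GROUP BY clause. The function `return_with_count` and the sample rows are hypothetical and are not part of Cypher.PL.

```python
from collections import defaultdict


def return_with_count(rows, key_columns):
    """Emulate `RETURN <key_columns>, count(*)` over a list of row dicts.

    The grouping key is derived from the non-aggregated return items;
    the caller never states it separately.
    """
    counts = defaultdict(int)
    for row in rows:
        group_key = tuple(row[column] for column in key_columns)
        counts[group_key] += 1
    return dict(counts)


# Hypothetical data standing in for matched nodes.
people = [
    {"name": "Ann", "city": "Warsaw"},
    {"name": "Bob", "city": "Warsaw"},
    {"name": "Eve", "city": "Krakow"},
]
by_city = return_with_count(people, ["city"])
```

The sketch covers only return items that are either fully unaggregated or a plain aggregate. It deliberately does not handle the case the paper focuses on, where a single return expression mixes property references with aggregating functions.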

Proceedings Article•10.1145/3315507.3330195•
Detecting unsatisfiable CSS rules in the presence of DTDs

Nobutaka Suzuki¹, Takuya Okada¹, Yeondae Kwon²
University of Tsukuba¹, University of Tokyo²
23 Jun 2019
TL;DR: This paper focuses on CSS fragments in which descendant, child, adjacent sibling, and general sibling combinators are allowed and shows that the problem is coNP-hard in most cases, even if only one of the four combinators is allowed.
Abstract: Cascading Style Sheets (CSS) is a popular language for describing the styles of XML documents as well as HTML documents. For a DTD D and a list R of CSS rules, due to specificity R may contain “unsatisfiable” rules under D, e.g., rules that are not applied to any element of any document valid to D. In this paper, we consider the problem of detecting unsatisfiable CSS rules under DTDs. We focus on CSS fragments in which descendant, child, adjacent sibling, and general sibling combinators are allowed. We show that the problem is coNP-hard in most cases, even if only one of the four combinators is allowed. We also show that the problem is in coNP or PSPACE depending on restrictions on DTDs and CSS. Finally, we present two conditions under which the problem can be solved in polynomial time.

1 citation

Proceedings Article•10.1145/3315507.3330201•
Streaming saturation for large RDF graphs with dynamic schema information

Mohammad Amin Farvardin¹, Dario Colazzo¹, Khalid Belhajjame¹, Carlo Sartiani²
Paris Dauphine University¹, University of Basilicata²
23 Jun 2019
TL;DR: This work presents the first solution for reasoning over large streams of RDF data using big data platforms by focusing on the saturation operation, which seeks to infer implicit RDF triples given RDF schema constraints.
Abstract: In the Big Data era, RDF data are produced in high volumes. While there exist proposals for reasoning over large RDF graphs using big data platforms, there is a dearth of solutions that do so in environments where RDF data are dynamic, and where new instance and schema triples can arrive at any time. In this work, we present the first solution for reasoning over large streams of RDF data using big data platforms. In doing so, we focus on the saturation operation, which seeks to infer implicit RDF triples given RDF schema constraints. Indeed, unlike existing solutions which saturate RDF data in bulk, our solution carefully identifies the fragment of the existing (and already saturated) RDF dataset that needs to be considered given the fresh RDF statements delivered by the stream. Thereby, it performs the saturation in an incremental manner. Experimental analysis shows that our solution outperforms existing bulk-based saturation solutions.
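The contrast between bulk and incremental saturation in the abstract above can be sketched in Python for two RDFS rules only: transitivity of `rdfs:subClassOf` and propagation of `rdf:type` along `rdfs:subClassOf`. This is an assumption-laden toy, not the paper's algorithm: the function `saturate_incrementally` is hypothetical, it runs on a single in-memory set rather than a big data platform, and it scans the whole store where a real system would use indexes to find the relevant fragment.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def saturate_incrementally(saturated, new_triples):
    """Add new_triples to an already saturated set of (s, p, o) triples.

    Only consequences that involve a newly added triple are derived, so the
    existing saturated data is never re-saturated from scratch.
    """
    worklist = [t for t in new_triples if t not in saturated]
    while worklist:
        triple = worklist.pop()
        if triple in saturated:
            continue
        saturated.add(triple)
        s, p, o = triple
        derived = []
        for s2, p2, o2 in saturated:
            if p == SUBCLASS and p2 == SUBCLASS:
                # Transitivity, with the new triple on either side.
                if o == s2:
                    derived.append((s, SUBCLASS, o2))
                if o2 == s:
                    derived.append((s2, SUBCLASS, o))
            elif p == SUBCLASS and p2 == TYPE and o2 == s:
                # New schema triple lifts existing instances.
                derived.append((s2, TYPE, o))
            elif p == TYPE and p2 == SUBCLASS and s2 == o:
                # New instance triple is lifted by existing schema.
                derived.append((s, TYPE, o2))
        worklist.extend(d for d in derived if d not in saturated)
    return saturated


store = set()
saturate_incrementally(store, [("Cat", SUBCLASS, "Mammal"), ("tom", TYPE, "Cat")])
# A schema triple arriving later only triggers the derivations it enables.
saturate_incrementally(store, [("Mammal", SUBCLASS, "Animal")])
```

After the second call the store also contains the triples implied by the late schema statement, such as `("tom", "rdf:type", "Animal")`, without the earlier derivations having been recomputed.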

Proceedings Article•10.1145/3315507.3330202•
Mixing set and bag semantics

Wilmer Ricciotti¹, James Cheney¹
University of Edinburgh¹
23 Jun 2019
TL;DR: This paper introduces a calculus with both set and multiset collection types, along with natural mappings from sets to bags and vice versa, presents a set of valid rewrite rules for normalizing such queries, and gives an inductive characterization of a set of queries whose normal forms can be translated to SQL.
Abstract: The conservativity theorem for nested relational calculus implies that query expressions can freely use nesting and unnesting, yet as long as the query result type is a flat relation, these capabilities do not lead to an increase in expressiveness over flat relational queries. Moreover, Wong showed how such queries can be translated to SQL via a constructive rewriting algorithm. While this result holds for queries over either set or multiset semantics, to the best of our knowledge, the questions of conservativity and normalization have not been studied for queries that mix set and bag collections, or provide duplicate-elimination operations such as SQL's SELECT DISTINCT. In this paper we formalize the problem, and present partial progress: specifically, we introduce a calculus with both set and multiset collection types, along with natural mappings from sets to bags and vice versa, present a set of valid rewrite rules for normalizing such queries, and give an inductive characterization of a set of queries whose normal forms can be translated to SQL. We also consider examples that do not appear straightforward to translate to SQL, illustrating that the relative expressiveness of flat and nested queries with mixed set and multiset semantics remains an open question.
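The mappings between sets and bags mentioned in the abstract above can be pictured with Python's `collections.Counter` standing in for a bag. This sketch only illustrates the two conversions and additive bag union; it is not the paper's calculus, and the helper names `set_to_bag` and `bag_to_set` are assumptions for the example.

```python
from collections import Counter


def set_to_bag(elements):
    """Promote a set to a bag in which every element has multiplicity 1."""
    return Counter({element: 1 for element in elements})


def bag_to_set(bag):
    """Eliminate duplicates, in the spirit of SQL's SELECT DISTINCT."""
    return {element for element, count in bag.items() if count > 0}


names = Counter(["ann", "bob", "ann"])  # a bag with a duplicate
distinct = bag_to_set(names)            # duplicates removed
round_trip = set_to_bag(distinct)       # back to a bag, multiplicities lost
combined = names + round_trip           # additive union, like UNION ALL
```

Converting a bag to a set and back does not restore the original multiplicities, which hints at why queries that mix the two collection types need their own rewrite rules.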
