Structure-Grounded Pretraining for Text-to-SQL

doi:10.18653/V1/2021.NAACL-MAIN.105

Open Access • Proceedings Article

Xiang Deng, +5 more

- 01 Jun 2021

- pp 1337-1350

79

TL;DR: STRUG is a weakly supervised structure-grounded pretraining framework for text-to-SQL that learns to capture text-table alignment from a parallel text-table corpus.

Citations

•Proceedings Article•10.18653/V1/2021.NAACL-MAIN.29

SmBoP: Semi-autoregressive Bottom-up Semantic Parsing

Ohad Rubin, +1 more

- 01 Jun 2021

TL;DR: This work proposes an alternative approach to semantic parsing: a Semi-autoregressive Bottom-up Parser (SmBoP) that constructs at decoding step t the top-K sub-trees of height ≤ t. It shows that SmBoP leads to a 2.2x speed-up in decoding time and a ~5x speed-up in training time, compared to a semantic parser that uses autoregressive decoding.

156

•Proceedings Article•10.18653/V1/2021.ACL-LONG.198

LGESQL: Line Graph Enhanced Text-to-SQL Model with Mixed Local and Non-Local Relations

Ruisheng Cao, +5 more

- 01 Aug 2021

TL;DR: This paper proposes a Line Graph Enhanced Text-to-SQL (LGESQL) model to mine the underlying relational features without constructing meta-paths, so that messages propagate more efficiently through not only connections between nodes, but also the topology of directed edges.

95

•Posted Content

TAPEX: Table Pre-training via Learning a Neural SQL Executor.

Qian Liu, +4 more

- 16 Jul 2021

- arXiv: Computation and Language

TL;DR: TAPEX learns a neural SQL executor over a synthetic corpus obtained by automatically synthesizing executable SQL queries. It improves performance on downstream tasks, boosting existing language models by at most 19.5%.

83

•Posted Content

Learning Contextual Representations for Semantic Parsing with Generation-Augmented Pre-Training

Peng Shi, +7 more

- 18 Dec 2020

- arXiv: Computation and Language

TL;DR: Generation-Augmented Pre-training (GAP) learns representations of natural language utterances and table schemas by leveraging generation models to generate pre-training data.

50

•Journal Article•10.1609/aaai.v37i11.26535

RESDSQL: Decoupling Schema Linking and Skeleton Parsing for Text-to-SQL

26 Jun 2023

- Proceedings of the ... AAAI Conference o...

TL;DR: The authors propose a ranking-enhanced encoding and skeleton-aware decoding framework to decouple schema linking and skeleton parsing in a seq2seq encoder-decoder model.

47

References

•Proceedings Article

Adam: A Method for Stochastic Optimization

Diederik P. Kingma, +1 more

- 01 Jan 2015

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.

138.5K

•Posted Content

Adam: A Method for Stochastic Optimization

Diederik P. Kingma, +1 more

- 22 Dec 2014

- arXiv: Learning

TL;DR: This article introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments.

82.5K

•Posted Content

RoBERTa: A Robustly Optimized BERT Pretraining Approach

Yinhan Liu, +9 more

- 26 Jul 2019

- arXiv: Computation and Language

TL;DR: This work finds that BERT was significantly undertrained and can match or exceed the performance of every model published after it; the best model achieves state-of-the-art results on GLUE, RACE and SQuAD.

26.2K

•Proceedings Article•10.18653/V1/N19-1423

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Jacob Devlin, +3 more

- 11 Oct 2018

TL;DR: BERT pre-trains deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers; the resulting model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.

24.6K

•Proceedings Article•10.18653/V1/2020.ACL-MAIN.703

BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

Michael Lewis, +7 more

- 01 Jul 2020

TL;DR: BART is presented, a denoising autoencoder for pretraining sequence-to-sequence models, which matches the performance of RoBERTa on GLUE and SQuAD, and achieves new state-of-the-art results on a range of abstractive dialogue, question answering, and summarization tasks.

11.5K

Related Papers (5)

Learning to parse database queries using inductive logic programming

Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Towards Complex Text-to-SQL in Cross-Domain Database with Intermediate Representation

Decoupled Weight Decay Regularization.