Structure-Grounded Pretraining for Text-to-SQL
Xiang Deng, Ahmed Hassan Awadallah, Christopher A. Meek, Oleksandr Polozov, Huan Sun, Matthew Richardson
- 01 Jun 2021
- pp 1337-1350
TL;DR: STRUG is a weakly supervised structure-grounded pretraining framework for text-to-SQL that learns to capture text-table alignment from a parallel text-table corpus.
Abstract: Learning to capture text-table alignment is essential for tasks like text-to-SQL. A model needs to correctly recognize natural language references to columns and values and to ground them in the given database schema. In this paper, we present a novel weakly supervised Structure-Grounded pretraining framework (STRUG) for text-to-SQL that can effectively learn to capture text-table alignment based on a parallel text-table corpus. We identify a set of novel pretraining tasks: column grounding, value grounding and column-value mapping, and leverage them to pretrain a text-table encoder. Additionally, to evaluate different methods under more realistic text-table alignment settings, we create a new evaluation set Spider-Realistic based on Spider dev set with explicit mentions of column names removed, and adopt eight existing text-to-SQL datasets for cross-database evaluation. STRUG brings significant improvement over BERT-LARGE in all settings. Compared with existing pretraining methods such as GRAPPA, STRUG achieves similar performance on Spider, and outperforms all baselines on more realistic sets. All the code and data used in this work will be open-sourced to facilitate future research.
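To make the three alignment signals named in the abstract concrete, here is a minimal Python sketch of what column grounding, value grounding, and column-value mapping labels could look like for one question. It is only an illustration: the exact-match heuristic, the helper names (`mentions`, `ground_columns`, `map_values_to_columns`), and the toy schema are all assumptions made for this example, not the weak supervision procedure used in the paper.

```python
import re
from typing import Dict, List


def mentions(phrase: str, text: str) -> bool:
    """Return True if `phrase` occurs in `text` as a whole word or phrase."""
    pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
    return re.search(pattern, text.lower()) is not None


def ground_columns(question: str, columns: List[str]) -> Dict[str, bool]:
    """Column grounding (toy version): is each column named in the question?"""
    return {col: mentions(col, question) for col in columns}


def map_values_to_columns(question: str, cells: Dict[str, List[str]]) -> Dict[str, str]:
    """Value grounding plus column-value mapping (toy version).

    Every cell value found in the question is mapped to the column it
    belongs to, even when the column name itself is never mentioned.
    """
    mapping: Dict[str, str] = {}
    for col, values in cells.items():
        for value in values:
            if mentions(value, question):
                mapping[value] = col
    return mapping


# Hypothetical question and table contents, invented for this example.
question = "Show the name of singers from France"
columns = ["name", "country", "age"]
cells = {
    "name": ["Joe Sharp"],
    "country": ["France", "Netherlands"],
    "age": ["52"],
}

column_labels = ground_columns(question, columns)
value_to_column = map_values_to_columns(question, cells)
```

In this toy case the question never says "country", yet the value "France" still points to that column. That is the kind of implicit reference the abstract's Spider-Realistic setting targets, and it is where exact name matching alone stops being enough.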
Citations
SmBoP: Semi-autoregressive Bottom-up Semantic Parsing
Ohad Rubin, Jonathan Berant
- 01 Jun 2021
TL;DR: This work proposes an alternative approach to semantic parsing: a Semi-autoregressive Bottom-up Parser (SmBoP) that constructs at decoding step t the top-K sub-trees of height ≤ t, and shows that SmBoP leads to a 2.2x speed-up in decoding time and a ~5x speed-up in training time compared to a semantic parser that uses autoregressive decoding.
LGESQL: Line Graph Enhanced Text-to-SQL Model with Mixed Local and Non-Local Relations
Ruisheng Cao, Lu Chen, Zhi Chen, Yanbin Zhao, Su Zhu, Kai Yu
- 01 Aug 2021
TL;DR: This work proposes a Line Graph Enhanced Text-to-SQL (LGESQL) model that mines the underlying relational features without constructing meta-paths, so that messages propagate more efficiently through not only connections between nodes but also the topology of directed edges.
•Posted Content
TAPEX: Table Pre-training via Learning a Neural SQL Executor.
TL;DR: TAPEX learns a neural SQL executor over a synthetic corpus obtained by automatically synthesizing executable SQL queries, and improves performance on downstream tasks, boosting existing language models by up to 19.5%.
•Posted Content
Learning Contextual Representations for Semantic Parsing with Generation-Augmented Pre-Training
Peng Shi, Patrick Ng, Zhiguo Wang, Henghui Zhu, Alexander Hanbo Li, Jun Wang, Cicero Nogueira dos Santos, Bing Xiang
TL;DR: Generation-Augmented Pre-training (GAP) learns representations of natural language utterances and table schemas by leveraging generation models to generate pre-training data.
RESDSQL: Decoupling Schema Linking and Skeleton Parsing for Text-to-SQL
TL;DR: The authors propose a ranking-enhanced encoding and skeleton-aware decoding framework to decouple schema linking and skeleton parsing in a seq2seq encoder-decoder model.
References
•Proceedings Article
Adam: A Method for Stochastic Optimization
Diederik P. Kingma, Jimmy Ba
- 01 Jan 2015
TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
•Posted Content
Adam: A Method for Stochastic Optimization
Diederik P. Kingma, Jimmy Ba
TL;DR: This work uses adaptive estimates of lower-order moments for first-order gradient-based optimization of stochastic objective functions.
•Posted Content
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Michael Lewis, Luke Zettlemoyer, Veselin Stoyanov
TL;DR: It is found that BERT was significantly undertrained, and can match or exceed the performance of every model published after it, and the best model achieves state-of-the-art results on GLUE, RACE and SQuAD.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova
- 11 Oct 2018
TL;DR: BERT pre-trains deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers; the pre-trained model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
Michael Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Veselin Stoyanov, Luke Zettlemoyer
- 01 Jul 2020
TL;DR: BART is presented, a denoising autoencoder for pretraining sequence-to-sequence models, which matches the performance of RoBERTa on GLUE and SQuAD, and achieves new state-of-the-art results on a range of abstractive dialogue, question answering, and summarization tasks.