Open Access • Posted Content
Byte-Pair Encoding for Text-to-SQL Generation.
Samuel Müller, Andreas Vlachos
TL;DR: This work presents a novel stopping criterion that prevents overfitting the BPE encoding to the training set, together with AST BPE, a version of BPE that uses the Abstract Syntax Tree of the SQL statement to guide BPE merges and thereby produce encodings that generalize better.
Abstract: Neural sequence-to-sequence models provide a competitive approach to the task of mapping a question in natural language to an SQL query, also referred to as text-to-SQL generation. The Byte-Pair Encoding algorithm (BPE) has previously been used to improve machine translation (MT) between natural languages. In this work, we adapt BPE for text-to-SQL generation. As the datasets for this task are rather small compared to MT, we present a novel stopping criterion that prevents overfitting the BPE encoding to the training set. Additionally, we present AST BPE, which is a version of BPE that uses the Abstract Syntax Tree (AST) of the SQL statement to guide BPE merges and therefore produce BPE encodings that generalize better. We improved the accuracy of a strong attentive seq2seq baseline on five out of six English text-to-SQL tasks while reducing training time by more than 50% on four of them due to the shortened targets. Finally, on two of these tasks we exceeded previously reported accuracies.
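To make the idea of applying BPE to SQL targets concrete, the following is a minimal sketch of plain BPE merge learning over tokenized SQL queries. It is not the authors' implementation: the abstract does not specify their stopping criterion, so the `min_count` threshold here is only a hypothetical stand-in, and the AST-guided merge restriction of AST BPE is not reproduced. Whitespace tokenization, the function names, and joining merged tokens with a space are all assumptions made for illustration.

```python
from collections import Counter


def count_pairs(sequences):
    """Count adjacent token pairs across all target sequences."""
    pairs = Counter()
    for seq in sequences:
        pairs.update(zip(seq, seq[1:]))
    return pairs


def merge_pair(seq, pair):
    """Replace each non-overlapping occurrence of `pair` with one joined token."""
    merged, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            # Join with a space so the merged token stays readable as SQL.
            merged.append(seq[i] + " " + seq[i + 1])
            i += 2
        else:
            merged.append(seq[i])
            i += 1
    return merged


def learn_bpe(sequences, max_merges, min_count=2):
    """Greedily merge the most frequent adjacent pair.

    Stops after `max_merges` merges, or earlier when the best pair occurs
    fewer than `min_count` times (a hypothetical stopping rule, not the
    criterion proposed in the paper).
    """
    sequences = [list(s) for s in sequences]
    merges = []
    for _ in range(max_merges):
        pairs = count_pairs(sequences)
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < min_count:
            break
        merges.append(pair)
        sequences = [merge_pair(s, pair) for s in sequences]
    return merges, sequences


if __name__ == "__main__":
    queries = [
        "SELECT name FROM city".split(),
        "SELECT name FROM state".split(),
        "SELECT area FROM city".split(),
    ]
    merges, encoded = learn_bpe(queries, max_merges=10)
    print(merges)
    print(encoded)
```

Each merge shortens the target sequences the decoder has to produce, which is consistent with the reduced training time the abstract attributes to shortened targets. A frequency threshold such as `min_count` is one simple way to avoid merging pairs that occur only rarely in a small training set.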
Citations
Re-examining the Role of Schema Linking in Text-to-SQL
Wenqiang Lei, Weixin Wang, Ma Zhixin, Tian Gan, Wei Lu, Min-Yen Kan, Tat-Seng Chua
- 01 Nov 2020
TL;DR: This work provides a schema linking corpus based on the Spider text-to-SQL dataset and builds a simple BERT-based baseline to perform a data-driven study on schema linking, finding that when schema linking is done well, SLSQL demonstrates good performance on Spider despite its structural simplicity.
Recent Advances in Text-to-SQL: A Survey of What We Have and What We Expect
Naihao Deng, Yulong Chen, Yue Zhang
- 22 Aug 2022
TL;DR: This work provides a systematic survey of recent progress on text-to-SQL covering datasets, methods, and evaluation, in the hope that it can serve as quick access to existing work and motivate future research.