Open Access, Posted Content
TURL: Table Understanding through Representation Learning
TL;DR: This paper proposes a structure-aware Transformer encoder to model the row-column structure of relational tables, and presents a new Masked Entity Recovery objective for pre-training to capture the semantics and knowledge in large-scale unlabeled data.
Abstract: Relational tables on the Web store a vast amount of knowledge. Owing to the wealth of such tables, there has been tremendous progress on a variety of tasks in the area of table understanding. However, existing work generally relies on heavily engineered, task-specific features and model architectures. In this paper, we present TURL, a novel framework that introduces the pre-training/fine-tuning paradigm to relational Web tables. During pre-training, our framework learns deep contextualized representations on relational tables in an unsupervised manner. Its universal model design with pre-trained representations can be applied to a wide range of tasks with minimal task-specific fine-tuning. Specifically, we propose a structure-aware Transformer encoder to model the row-column structure of relational tables, and present a new Masked Entity Recovery (MER) objective for pre-training to capture the semantics and knowledge in large-scale unlabeled data. We systematically evaluate TURL with a benchmark consisting of 6 different tasks for table understanding (e.g., relation extraction, cell filling). We show that TURL generalizes well to all tasks and substantially outperforms existing methods in almost all instances.
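The abstract names two mechanisms without detailing them: attention restricted by row-column structure, and masking entity cells so the model must recover them. The sketch below is only one plausible reading of those ideas, not the authors' implementation. The function names, the `[MASK_ENT]` placeholder, the rule that a cell sees cells in its own row or column, and the treatment of `None` positions as globally visible elements (for example, caption tokens) are all assumptions made for illustration.

```python
import random

# Hypothetical placeholder for a masked entity cell (not taken from the paper).
MASK = "[MASK_ENT]"


def visibility_mask(positions):
    """Build an n x n boolean attention mask from table positions.

    positions: list of (row, col) tuples, one per element; None marks a
    global element (assumed here to be visible to and from everything).
    mask[i][j] is True when element i may attend to element j.
    """
    n = len(positions)
    mask = [[False] * n for _ in range(n)]
    for i, pi in enumerate(positions):
        for j, pj in enumerate(positions):
            if pi is None or pj is None:
                # Global elements are connected to every element.
                mask[i][j] = True
            else:
                # Table cells see only cells sharing their row or column.
                mask[i][j] = pi[0] == pj[0] or pi[1] == pj[1]
    return mask


def mask_entities(cells, ratio, rng):
    """Replace a fraction of entity cells with a placeholder.

    Returns (masked_cells, targets), where targets maps each masked index
    to the original entity the model would be trained to recover.
    """
    # Mask at least one cell when any exist, never more than are available.
    n_mask = min(len(cells), max(1, round(len(cells) * ratio)))
    chosen = rng.sample(range(len(cells)), n_mask)
    masked = list(cells)
    targets = {}
    for idx in chosen:
        targets[idx] = cells[idx]
        masked[idx] = MASK
    return masked, targets


# Usage: one caption element followed by a 2 x 2 grid of entity cells.
positions = [None, (0, 0), (0, 1), (1, 0), (1, 1)]
mask = visibility_mask(positions)
cells = ["Paris", "France", "Rome", "Italy"]
masked_cells, targets = mask_entities(cells, 0.5, random.Random(0))
```

In a real encoder, a mask like this would typically be passed to the attention layers so that disallowed pairs receive no attention weight, and the `targets` mapping would supply the labels for the recovery loss.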
Citations
•Posted Content
GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing
Tao Yu, Chien-Sheng Wu, Xi Victoria Lin, Bailin Wang, Yi Chern Tan, Xinyi Yang, Dragomir R. Radev, Richard Socher, Caiming Xiong
TL;DR: GraPPa is an effective pre-training approach for table semantic parsing that learns a compositional inductive bias in the joint representations of textual and tabular data. Used as the feature representation layers, it significantly outperforms RoBERTa-large and establishes new state-of-the-art results on all of the benchmarks evaluated.
203
TABBIE: Pretrained Representations of Tabular Data
Hiroshi Iida, Dung Thai, Varun Manjunatha, Mohit Iyyer
- 06 May 2021
TL;DR: A simple pretraining objective (corrupt cell detection) is devised that learns exclusively from tabular data and reaches the state of the art on a suite of table-based prediction tasks while requiring far less compute to train.
TableFormer: Robust Transformer Modeling for Table-Text Encoding
Jingfeng Yang, Aditya Gupta, Shyam Upadhyay, Luheng He, Rahul Goel, Shachi Paul
- 01 Mar 2022
TL;DR: This work proposes a robust and structurally aware table-text encoding architecture TableFormer, where tabular structural biases are incorporated completely through learnable attention biases, and could understand tables better due to its tabular inductive biases.
TransTab: Learning Transferable Tabular Transformers Across Tables
Zifeng Wang, Jimeng Sun
- 19 May 2022
TL;DR: The goal of TransTab is to convert each sample to a generalizable embedding vector, and then apply stacked transformers for feature encoding, and one methodology insight is combining column description and table cells as the raw input to a gated transformer model.
71
•Posted Content
RPT: Relational Pre-trained Transformer Is Almost All You Need towards Democratizing Data Preparation
TL;DR: RPT, a denoising autoencoder for tuple-to-X models, is presented. It is a Transformer-based neural translation architecture consisting of a bidirectional encoder and a left-to-right autoregressive decoder, which generalizes both BERT and GPT.