TURL: Table Understanding through Representation Learning

Open Access • Posted Content

- 26 Jun 2020

136

TL;DR: This paper proposes a structure-aware Transformer encoder to model the row-column structure of relational tables, and presents a new Masked Entity Recovery objective for pre-training to capture the semantics and knowledge in large-scale unlabeled data.

Citations

•Posted Content

GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing

Tao Yu, +8 more

- 29 Sep 2020

- arXiv: Computation and Language

TL;DR: GraPPa is an effective pre-training approach for table semantic parsing that learns a compositional inductive bias in the joint representations of textual and tabular data. Used as the feature representation layers, it significantly outperforms RoBERTa-large and establishes new state-of-the-art results on all evaluated benchmarks.

203

•Proceedings Article•10.18653/V1/2021.NAACL-MAIN.270

TABBIE: Pretrained Representations of Tabular Data

Hiroshi Iida, +3 more

- 06 May 2021

TL;DR: A simple pretraining objective, corrupt cell detection, learns exclusively from tabular data, reaches the state of the art on a suite of table-based prediction tasks, and requires far less compute to train.

182

•Proceedings Article•10.48550/arXiv.2203.00274

TableFormer: Robust Transformer Modeling for Table-Text Encoding

Jingfeng Yang, +5 more

- 01 Mar 2022

TL;DR: This work proposes TableFormer, a robust and structurally aware table-text encoding architecture in which tabular structural biases are incorporated entirely through learnable attention biases; these tabular inductive biases help it understand tables better.

87

•Proceedings Article•10.48550/arXiv.2205.09328

TransTab: Learning Transferable Tabular Transformers Across Tables

Zifeng Wang, +1 more

- 19 May 2022

TL;DR: TransTab converts each sample into a generalizable embedding vector and then applies stacked transformers for feature encoding. One methodological insight is to combine column descriptions and table cells as the raw input to a gated transformer model.

71

•Posted Content

RPT: Relational Pre-trained Transformer Is Almost All You Need towards Democratizing Data Preparation

Nan Tang, +7 more

- 04 Dec 2020

- arXiv: Learning

TL;DR: RPT is a denoising autoencoder for tuple-to-X models, built on a Transformer-based neural translation architecture that consists of a bidirectional encoder and a left-to-right autoregressive decoder, generalizing both BERT and GPT.

60

References

•Proceedings Article

Adam: A Method for Stochastic Optimization

Diederik P. Kingma, +1 more

- 01 Jan 2015

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.

138.5K

•Proceedings Article

Attention is All you Need

Ashish Vaswani, +7 more

- 12 Jun 2017

TL;DR: This paper proposed a simple network architecture based solely on an attention mechanism, dispensing with recurrence and convolutions entirely and achieved state-of-the-art performance on English-to-French translation.

94.2K

•Posted Content

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Jacob Devlin, +3 more

- 11 Oct 2018

- arXiv: Computation and Language

TL;DR: A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.

81.7K

•Proceedings Article•10.3115/V1/D14-1162

GloVe: Global Vectors for Word Representation

Jeffrey Pennington, +2 more

- 01 Oct 2014

TL;DR: A new global log-bilinear regression model that combines the advantages of the two major model families in the literature, global matrix factorization and local context window methods, and produces a vector space with meaningful substructure.

41.6K

•Proceedings Article

Distributed Representations of Words and Phrases and their Compositionality

Tomas Mikolov, +4 more

- 05 Dec 2013

TL;DR: This paper presents a simple method for finding phrases in text, and shows that learning good vector representations for millions of phrases is possible and describes a simple alternative to the hierarchical softmax called negative sampling.

24.1K