TableFormer: Robust Transformer Modeling for Table-Text Encoding

doi:10.48550/arXiv.2203.00274

Proceedings Article•10.48550/arXiv.2203.00274

Jingfeng Yang, +5 more

- 01 Mar 2022

pp 528-537

87

TL;DR: This work proposes TableFormer, a robust and structurally aware table-text encoding architecture in which tabular structural biases are incorporated entirely through learnable attention biases; these tabular inductive biases allow the model to understand tables better.

Figures

Table 3: Denotation accuracy on the WTQ development and test sets. The median of 5 independent runs is reported.

Table 2: Binary classification accuracy on the TABFACT development set and 4 splits of the test set, as well as performance on test sets with our perturbation evaluation. The median of 5 independent runs is reported. Missing values are those not reported in the original paper.

Table 5: ALL questions’ cell selection accuracy of TABLEFORMER variants on the SQA development set. rc-gp represents the setting including row ids, column ids and global positional ids; c-gp represents column ids and global positional ids; gp represents global positional ids; and pcp represents per-cell positional ids. “SAT” represents masking out some attention scores. “SO” represents adding attention bias before scaling.

Figure 2: TABLEFORMER input and attention biases in the self-attention module. This example corresponds to table (a) in Figure 1 and its paired question “query”. Different colors in the attention bias matrix denote different types of task-independent biases derived from the table structure and the associated text.

Table 7: Ablation study of proposed attention biases.
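
The captions above mention learnable attention biases selected by table structure (Figure 2) and a variant that adds the bias before scaling ("SO", Table 5). The following is a minimal sketch of that general idea in plain Python, not the authors' implementation. The relation names in `relation`, the token encoding (a `(row, col)` tuple for cell tokens, `None` for text tokens), and the `before_scaling` flag are all assumptions made for illustration; the paper defines its own set of bias types.

```python
import math

def relation(tok_i, tok_j):
    """Return a hypothetical bias type for a pair of tokens.

    A token is a (row, col) tuple for a table cell, or None for a text token.
    These four categories are illustrative only.
    """
    if tok_i is None or tok_j is None:
        # Exactly one of the two is a text token.
        if (tok_i is None) != (tok_j is None):
            return "text_to_cell"
        return "other"
    if tok_i[0] == tok_j[0]:
        return "same_row"
    if tok_i[1] == tok_j[1]:
        return "same_column"
    return "other"

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def biased_attention(queries, keys, tokens, bias, before_scaling=False):
    """Attention weights with one scalar bias per relation type.

    bias maps a relation name to a scalar; in a trained model these
    scalars would be learnable parameters. With before_scaling=True the
    bias is added to the raw dot product and then scaled by sqrt(d);
    otherwise it is added after scaling.
    """
    d = len(queries[0])
    scale = math.sqrt(d)
    weights = []
    for q, tok_i in zip(queries, tokens):
        row = []
        for k, tok_j in zip(keys, tokens):
            score = sum(a * b for a, b in zip(q, k))
            b = bias[relation(tok_i, tok_j)]
            if before_scaling:
                row.append((score + b) / scale)
            else:
                row.append(score / scale + b)
        weights.append(softmax(row))
    return weights

# One question token followed by three cells: (0,0), (0,1), (1,0).
tokens = [None, (0, 0), (0, 1), (1, 0)]
vectors = [[0.0, 0.0] for _ in tokens]  # zero vectors isolate the bias effect
bias = {"same_row": 2.0, "same_column": 0.0, "text_to_cell": 0.0, "other": 0.0}
weights = biased_attention(vectors, vectors, tokens, bias)
```

Because the bias depends only on the structural relation between two tokens and not on their absolute positions, a scheme like this lets the model favor, for example, cells in the same row regardless of where that row appears in the input.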

Citations

Journal Article•10.48550/arXiv.2304.13712

Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond

Jingfeng Yang, +7 more

- 26 Apr 2023

- arXiv.org

TL;DR: This paper presents a comprehensive and practical guide for practitioners and end-users working with large language models (LLMs) in their downstream natural language processing (NLP) tasks.

302

•Proceedings Article•10.18653/v1/2022.acl-long.454

MultiHiertt: Numerical Reasoning over Multi Hierarchical Tabular and Textual Data

Yilun Zhao, +3 more

- 03 Jun 2022

TL;DR: A new large-scale benchmark, MultiHiertt, with QA pairs over Multi Hierarchical Tabular and Textual data is constructed, and a novel QA model termed MT2Net is introduced, which first applies fact retrieval to extract relevant supporting facts from both tables and text and then uses a reasoning module to perform symbolic reasoning over the retrieved facts.

96

Journal Article•10.48550/arXiv.2301.13808

Large Language Models are Versatile Decomposers: Decompose Evidence and Questions for Table-based Reasoning

Yunhu Ye, +5 more

- 31 Jan 2023

- arXiv.org

TL;DR: The authors decompose huge evidence (a huge table) into sub-evidence (a small table) to mitigate the interference of useless information for table reasoning, and decompose complex questions into simpler sub-questions for text reasoning.

76

Proceedings Article•10.48550/arXiv.2205.09328

TransTab: Learning Transferable Tabular Transformers Across Tables

Zifeng Wang, +1 more

- 19 May 2022

TL;DR: TransTab converts each sample to a generalizable embedding vector and then applies stacked transformers for feature encoding; one methodological insight is to combine column descriptions and table cells as the raw input to a gated transformer model.

71

Journal Article•10.48550/arxiv.2402.02592

Unified Training of Universal Time Series Forecasting Transformers

Gerald Woo, +5 more

- 04 Feb 2024

- arXiv.org

TL;DR: This work presents novel enhancements to the conventional time series Transformer architecture, resulting in the proposed Masked Encoder-based Universal Time Series Forecasting Transformer (Moirai), which achieves competitive or superior performance when compared to full-shot models.

56

References

•Proceedings Article

Attention is All you Need

Ashish Vaswani, +7 more

- 12 Jun 2017

TL;DR: This paper proposed a simple network architecture based solely on an attention mechanism, dispensing with recurrence and convolutions entirely and achieved state-of-the-art performance on English-to-French translation.

94.2K

•Posted Content

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Jacob Devlin, +3 more

- 11 Oct 2018

- arXiv: Computation and Language

TL;DR: BERT is a new language representation model designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers; it can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.

81.7K

•Proceedings Article•10.18653/V1/P19-1285

Transformer-XL: Attentive Language Models beyond a Fixed-Length Context.

Zihang Dai, +5 more

- 09 Jan 2019

TL;DR: This work proposes a novel neural architecture Transformer-XL that enables learning dependency beyond a fixed length without disrupting temporal coherence, which consists of a segment-level recurrence mechanism and a novel positional encoding scheme.

4.2K

•Posted Content

Longformer: The Long-Document Transformer

Iz Beltagy, +2 more

- 10 Apr 2020

- arXiv: Computation and Language

TL;DR: Following prior work on long-sequence transformers, Longformer is evaluated on character-level language modeling, where it achieves state-of-the-art results on text8 and enwik8; it is also pretrained and finetuned on a variety of downstream tasks.

3.9K

•Proceedings Article•10.18653/V1/2020.ACL-MAIN.398

TaPas: Weakly Supervised Table Parsing via Pre-training

Jonathan Herzig, +4 more

- 01 Jul 2020

TL;DR: TaPas is presented, an approach to question answering over tables without generating logical forms that outperforms or rivals semantic parsing models by improving state-of-the-art accuracy on SQA and performing on par with the state of the art on WikiSQL and WikiTQ, but with a simpler model architecture.

731
