Multi-Granular Text Encoding for Self-Explaining Categorization

doi:10.18653/V1/W19-4805

Open Access • Proceedings Article

Zhiguo Wang, +7 more

- 01 Jul 2019

- pp 41-45

12

TL;DR: This work defines multi-granular n-grams as the basic units of explanation and organizes all n-grams into a hierarchical structure, so that shorter n-grams can be reused when computing longer n-grams. The resulting model can extract intuitive multi-granular evidence to support its predictions.

Citations

•Proceedings Article•10.18653/V1/2020.ACL-MAIN.35

Evaluating Explanation Methods for Neural Machine Translation

Jierui Li, +5 more

- 04 May 2020

TL;DR: This article proposes a principled metric based on fidelity with respect to the predictive behavior of the NMT model, quantitatively evaluates several explanation methods under that metric, and reports findings about those methods.

24

•Posted Content

Evaluating Explanation Methods for Neural Machine Translation

Jierui Li, +5 more

- 04 May 2020

- arXiv: Computation and Language

TL;DR: This work makes an initial attempt to evaluate explanation methods from an alternative viewpoint and proposes a principled metric based on fidelity with respect to the predictive behavior of the NMT model.

16

•Posted Content

Local Interpretations for Explainable Natural Language Processing: A Survey.

Siwen Luo, +3 more

- 20 Mar 2021

- arXiv: Computation and Language

TL;DR: This article investigates various methods for improving the interpretability of deep neural networks for natural language processing (NLP) tasks, including machine translation and sentiment analysis, and opens with a comprehensive discussion of the definition of the term "interpretability" and its various aspects.

14

•Posted Content

Open-Retrieval Conversational Machine Reading

Yifan Gao, +3 more

- 17 Feb 2021

- arXiv: Computation and Language

TL;DR: The authors propose a Multi-passage Discourse-aware Entailment Reasoning Network (MUDERN), which extracts conditions from the rule texts through discourse segmentation and conducts multi-passage entailment reasoning to either answer user questions directly or ask clarifying follow-up questions to request more information.

12

Review•10.1145/3649450

Local Interpretations for Explainable Natural Language Processing: A Survey

Siwen Luo, +3 more

- 25 Apr 2024

- ACM Computing Surveys

TL;DR: This survey reviews local interpretation techniques for improving the interpretability of deep neural networks for NLP tasks.

5

References

•Proceedings Article

Adam: A Method for Stochastic Optimization

Diederik P. Kingma, +1 more

- 01 Jan 2015

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.

138.5K

•Proceedings Article

Attention is All you Need

Ashish Vaswani, +7 more

- 12 Jun 2017

TL;DR: This paper proposes a simple network architecture based solely on attention mechanisms, dispensing with recurrence and convolutions entirely, and achieves state-of-the-art performance on English-to-French translation.

94.2K

•Posted Content

Adam: A Method for Stochastic Optimization

Diederik P. Kingma, +1 more

- 22 Dec 2014

- arXiv: Learning

TL;DR: This article introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments.

82.5K

•Posted Content

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Jacob Devlin, +3 more

- 11 Oct 2018

- arXiv: Computation and Language

TL;DR: This work introduces BERT, a new language representation model designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. The pre-trained model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.

81.7K

Preprint•10.48550/arxiv.1706.03762

Attention Is All You Need

Ashish Vaswani, +7 more

- 01 Jan 2017

Abstract: The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles, by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.

51.8K