Open Access • Posted Content
Generating Natural Language Inference Chains.
TL;DR: The paper proposes a new task that measures how well a model can generate an entailed sentence from a source sentence. An LSTM with attention is trained on entailment pairs from the Stanford Natural Language Inference corpus and then applied recursively to input-output pairs, thereby generating natural language inference chains.
Abstract: The ability to reason with natural language is a fundamental prerequisite for many NLP tasks such as information extraction, machine translation and question answering. To quantify this ability, systems are commonly tested on whether they can recognize textual entailment, i.e., whether one sentence can be inferred from another. However, in most NLP applications only single source sentences, rather than sentence pairs, are available. Hence, we propose a new task that measures how well a model can generate an entailed sentence from a source sentence. We take entailment pairs from the Stanford Natural Language Inference corpus and train an LSTM with attention. On a manually annotated test set, we found that 82% of generated sentences are correct, an improvement of 10.3% over an LSTM baseline. A qualitative analysis shows that this model is capable not only of shortening input sentences but also of inferring new statements via paraphrasing and phrase entailment. We then apply this model recursively to input-output pairs, thereby generating natural language inference chains that can be used to automatically construct an entailment graph from source sentences. Finally, by swapping source and target sentences, we can also train a model that, given an input sentence, invents additional information to generate a new sentence.
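The recursive procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the trained LSTM is replaced by a hypothetical lookup table (`TOY_MODEL`) whose sentences were written by hand for illustration, and the names `entailment_pairs`, `inference_chain` and `entailment_graph` are invented here. The record fields `gold_label`, `sentence1` and `sentence2` are assumed to follow the SNLI JSONL release.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple


def entailment_pairs(records: Iterable[dict], swap: bool = False) -> List[Tuple[str, str]]:
    """Keep only entailment examples as (source, target) pairs.

    With swap=True the hypothesis becomes the source, which is the setup
    the abstract describes for a model that invents additional information.
    Field names are an assumption (SNLI JSONL release).
    """
    pairs = []
    for rec in records:
        if rec.get("gold_label") != "entailment":
            continue
        premise, hypothesis = rec["sentence1"], rec["sentence2"]
        pairs.append((hypothesis, premise) if swap else (premise, hypothesis))
    return pairs


def inference_chain(source: str, generate: Callable[[str], str], max_steps: int = 5) -> List[str]:
    """Feed each generated sentence back into the model as the next input."""
    chain = [source]
    seen = {source}
    for _ in range(max_steps):
        nxt = generate(chain[-1])
        if not nxt or nxt in seen:  # stop on empty output or a repeated sentence
            break
        chain.append(nxt)
        seen.add(nxt)
    return chain


def entailment_graph(sources: Iterable[str], generate: Callable[[str], str],
                     max_steps: int = 5) -> Dict[str, Set[str]]:
    """Collect the edges premise -> generated hypothesis from every chain."""
    graph: Dict[str, Set[str]] = {}
    for src in sources:
        chain = inference_chain(src, generate, max_steps)
        for premise, hypothesis in zip(chain, chain[1:]):
            graph.setdefault(premise, set()).add(hypothesis)
    return graph


# Hypothetical stand-in for the trained generator; hand-written, not model output.
TOY_MODEL = {
    "a young girl is riding a horse outdoors": "a girl is riding a horse",
    "a girl is riding a horse": "a person is on an animal",
}


def toy_generate(sentence: str) -> str:
    return TOY_MODEL.get(sentence, "")


if __name__ == "__main__":
    print(inference_chain("a young girl is riding a horse outdoors", toy_generate))
    print(entailment_graph(TOY_MODEL.keys(), toy_generate))
```

In a real setting `toy_generate` would be replaced by decoding from the trained sequence-to-sequence model. The `max_steps` cap and the repeated-sentence check are design choices of this sketch that keep a chain from looping; the abstract does not say how chains are terminated.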
Citations
•Proceedings Article
Neural Paraphrase Generation with Stacked Residual LSTM Networks
Aaditya Prakash, Sadid A. Hasan, Kathy Lee, Vivek V. Datla, Ashequl Qadir, Joey Liu, Oladimeji Farri
- 01 Dec 2016
TL;DR: The authors proposed a stacked residual LSTM network for paraphrase generation, which adds residual connections between LSTM layers for efficient training, and achieved state-of-the-art performance on three different datasets: PPDB, WikiAnswers and MSCOCO.
•Posted Content
Transforming Question Answering Datasets Into Natural Language Inference Datasets
TL;DR: This work proposes a new method for automatically deriving NLI datasets from the growing abundance of large-scale question answering datasets, and relies on learning a sentence transformation model which converts question-answer pairs into their declarative forms.
•Proceedings Article
Towards Text Generation with Adversarially Learned Neural Outlines
Sandeep Subramanian, Sai Rajeswar Mudumba, Alessandro Sordoni, Adam Trischler, Aaron Courville, Chris Pal
- 01 Dec 2018
TL;DR: This article proposed a combination of autoregressive and adversarial models with the goal of learning generative models of text, which produces a high-level sentence outline and then generates words sequentially, conditioning on both the outline and the previous outputs.
Detecting and Explaining Causes From Text For a Time Series Event.
Dongyeop Kang, Varun Gangal, Ang Lu, Zheng Chen, Eduard Hovy
- 01 Sep 2017
TL;DR: This paper proposed a method based on the Granger causality of time series between features extracted from text such as N-grams, topics, sentiments, and their composition to detect causal features from text.
•Proceedings Article
Grounded Textual Entailment.
Hoa Trong Vu, Claudio Greco, Aliia Erofeeva, Somayeh Jafaritazehjan, Guido Linders, Marc Tanti, Alberto Testoni, Raffaella Bernardi, Albert Gatt
- 01 Aug 2018
TL;DR: The authors compare blind and visual-augmented models of textual entailment and show that visual information is beneficial, but also conduct an in-depth error analysis that reveals that current multimodal models are not performing "grounding" in an optimal fashion.