Incorporating Copying Mechanism in Sequence-to-Sequence Learning
Jiatao Gu, Zhengdong Lu, Hang Li, Victor O. K. Li
- 21 Mar 2016
- Vol. 1, pp 1631-1640
TL;DR: CopyNet is an encoder-decoder model that incorporates copying into neural network-based Seq2Seq learning. It integrates the regular way of word generation in the decoder with a copying mechanism that can choose sub-sequences in the input sequence and put them at proper places in the output sequence.
Abstract: We address an important problem in sequence-to-sequence (Seq2Seq) learning referred to as copying, in which certain segments in the input sequence are selectively replicated in the output sequence. A similar phenomenon is observable in human language communication. For example, humans tend to repeat entity names or even long phrases in conversation. The challenge with regard to copying in Seq2Seq is that new machinery is needed to decide when to perform the operation. In this paper, we incorporate copying into neural network-based Seq2Seq learning and propose a new model called CopyNet with encoder-decoder structure. CopyNet can nicely integrate the regular way of word generation in the decoder with the new copying mechanism, which can choose sub-sequences in the input sequence and put them at proper places in the output sequence. Our empirical study on both synthetic data sets and real-world data sets demonstrates the efficacy of CopyNet. For example, CopyNet can outperform regular RNN-based models by remarkable margins on text summarization tasks.
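The integration of generation and copying described in the abstract can be illustrated with a small sketch. It assumes that generate-mode scores (one per vocabulary word) and copy-mode scores (one per source position) are normalized together under a single softmax, and that a word's probability is the sum of its generate-mode and copy-mode mass. The function name copynet_mixture, its arguments, and the toy scores are hypothetical and chosen only for illustration; in a real model the scores would come from the decoder state and the encoder hidden states.

```python
import math
from collections import defaultdict


def copynet_mixture(gen_scores, copy_scores, source_tokens):
    """Mix generate-mode and copy-mode scores under one shared softmax.

    gen_scores:    dict mapping each vocabulary word to an unnormalized score
    copy_scores:   list of unnormalized scores, one per source position
    source_tokens: list of source words aligned with copy_scores
    Returns a dict mapping each candidate word to its probability.
    (Hypothetical helper; the scores are assumed to be given.)
    """
    if len(copy_scores) != len(source_tokens):
        raise ValueError("need exactly one copy score per source token")

    all_scores = list(gen_scores.values()) + list(copy_scores)
    shift = max(all_scores)  # subtract the max for numerical stability
    z = sum(math.exp(s - shift) for s in all_scores)  # shared normalizer

    probs = defaultdict(float)
    # Generate mode: probability mass for words in the vocabulary.
    for word, score in gen_scores.items():
        probs[word] += math.exp(score - shift) / z
    # Copy mode: mass for each source position, summed per word, so a word
    # that occurs several times in the source accumulates all of it.
    for word, score in zip(source_tokens, copy_scores):
        probs[word] += math.exp(score - shift) / z
    return dict(probs)


# Toy example: "Tony" is out of vocabulary but can still be produced by copying.
gen_scores = {"hello": 1.0, "my": 0.5, "name": 0.2, "<unk>": 0.0}
source_tokens = ["my", "name", "is", "Tony"]
copy_scores = [0.1, 0.1, 0.0, 2.0]
probs = copynet_mixture(gen_scores, copy_scores, source_tokens)
print(max(probs, key=probs.get))
```

Because both modes share one normalizer, the model does not need a separate hard switch between generating and copying: the two modes compete directly for probability mass at each decoding step.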
Citations
•Posted Content
Break It Down: A Question Understanding Benchmark
Tomer Wolfson, Mor Geva, Ankit Gupta, Matt Gardner, Yoav Goldberg, Daniel Deutch, Jonathan Berant
TL;DR: This work introduces a Question Decomposition Meaning Representation (QDMR) for questions, and demonstrates the utility of QDMR by showing that it can be used to improve open-domain question answering on the HotpotQA dataset, and can be deterministically converted to a pseudo-SQL formal language, which can alleviate annotation in semantic parsing applications.
Neural abstractive summarization fusing by global generative topics
TL;DR: This work proposes to incorporate a neural generative topic matrix as an abstractive level of topic information into a summarization generation system that is capable of generating succinct and recapitulative words or phrases.
Cross-domain aspect/sentiment-aware abstractive review summarization by combining topic modeling and deep reinforcement learning
TL;DR: The novel model Abstractive review Summarization with Topic modeling and Reinforcement deep learning (ASTR) leverages the benefits of the supervised deep neural networks, reinforcement learning, and unsupervised probabilistic generative model to strengthen the aspect/sentiment-aware review representation learning.
Softregex: Generating regex from natural language descriptions using softened regex equivalence
Park Jun U, Sang-Ki Ko, Marco Cognetta, Yo-Sub Han
- 01 Nov 2019
TL;DR: A new regex generation model, SoftRegex, is proposed, using the EQ_Reg model, and it is empirically demonstrated that SoftRegex substantially reduces the training time and produces state-of-the-art results on three benchmark datasets.
Translate and label! An encoder-decoder approach for cross-lingual semantic role labeling
Angel Daza, Anette Frank
- 01 Nov 2019
TL;DR: This article proposes a cross-lingual encoder-decoder model that simultaneously translates and generates sentences with SRL annotations in a resource-poor target language; the model does not need parallel data at inference time.
References
Deep Residual Learning for Image Recognition
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
- 27 Jun 2016
TL;DR: This article proposes a residual learning framework to ease the training of networks that are substantially deeper than those used previously; the approach won 1st place on the ILSVRC 2015 classification task.
Long short-term memory
TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
Learning Phrase Representations using RNN Encoder--Decoder for Statistical Machine Translation
Kyunghyun Cho, Bart van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, Yoshua Bengio
- 01 Jan 2014
TL;DR: In this paper, the encoder and decoder of the RNN Encoder-Decoder model are jointly trained to maximize the conditional probability of a target sequence given a source sequence.
•Proceedings Article
Neural Machine Translation by Jointly Learning to Align and Translate
Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio
- 01 Jan 2015
TL;DR: It is conjectured that the use of a fixed-length vector is a bottleneck in improving the performance of the basic encoder-decoder architecture, and it is proposed to extend this architecture by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
•Proceedings Article
Sequence to Sequence Learning with Neural Networks
Ilya Sutskever, Oriol Vinyals, Quoc V. Le
- 08 Dec 2014
TL;DR: The authors used a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector.