Incorporating Copying Mechanism in Sequence-to-Sequence Learning

doi:10.18653/V1/P16-1154

Open Access•Proceedings Article•10.18653/V1/P16-1154

Jiatao Gu, +3 more

- 21 Mar 2016

- Vol. 1, pp 1631-1640

1.5K

TL;DR: This paper incorporates copying into neural network-based Seq2Seq learning and proposes CopyNet, an encoder-decoder model that integrates the decoder's regular word generation with a copying mechanism that can select sub-sequences of the input sequence and place them at the proper positions in the output sequence.
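
As a rough illustration of how generation and copying can be integrated in one decoder step, the sketch below combines the two kinds of scores under a single shared softmax, so that a word's probability is the sum of its generate-mode mass and the copy-mode mass of every source position holding that word. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `mix_generate_and_copy`, its arguments, and the toy scores are all hypothetical, and the scores are assumed to have already been produced by the decoder.

```python
import math


def mix_generate_and_copy(gen_scores, copy_scores, source_tokens):
    """Combine generate-mode and copy-mode scores into one distribution.

    gen_scores:    dict mapping each vocabulary word to its generate-mode score
    copy_scores:   list of copy-mode scores, one per source position
    source_tokens: list of source words, aligned with copy_scores

    All names here are hypothetical; this only illustrates the idea of
    letting both modes compete under one shared normaliser.
    """
    if len(copy_scores) != len(source_tokens):
        raise ValueError("need exactly one copy score per source token")

    all_scores = list(gen_scores.values()) + list(copy_scores)
    if not all_scores:
        raise ValueError("no scores given")

    # Subtract the maximum score for numerical stability before exponentiating.
    top = max(all_scores)
    z = sum(math.exp(s - top) for s in all_scores)

    probs = {}
    # Generate mode: probability mass for words in the fixed vocabulary.
    for word, score in gen_scores.items():
        probs[word] = probs.get(word, 0.0) + math.exp(score - top) / z
    # Copy mode: probability mass for words at each source position.
    # A word that occurs several times, or that is also in the vocabulary,
    # accumulates mass from every place it appears.
    for word, score in zip(source_tokens, copy_scores):
        probs[word] = probs.get(word, 0.0) + math.exp(score - top) / z
    return probs
```

Calling `mix_generate_and_copy({"hello": 1.0, "<unk>": 0.0}, [2.0, 0.5], ["Tony", "hello"])` returns a distribution in which the out-of-vocabulary source word "Tony" receives probability purely through the copy path, while "hello" pools mass from both paths.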

Citations

Proceedings Article•10.1109/ICRA40945.2020.9197068

Grounding Language to Landmarks in Arbitrary Outdoor Environments

Matthew Berg, +5 more

- 01 May 2020

TL;DR: This work presents a framework that parses references to landmarks, then assesses semantic similarities between the referring expression and landmarks in a predefined semantic map of the world, and ultimately translates natural language commands to motion plans for a drone.

23

•Proceedings Article•10.18653/V1/2020.ACL-MAIN.243

Posterior Control of Blackbox Generation

Xiang Lisa Li, +1 more

- 01 Jul 2020

TL;DR: The authors proposed a structured latent-variable approach to encode task-specific knowledge through a range of rich, posterior constraints that are effectively trained into the model, allowing users to ground internal model decisions based on prior knowledge, without sacrificing the representational power of neural generative models.

22

•Proceedings Article•10.18653/V1/D19-1278

Semantic graph parsing with recurrent neural network DAG grammars

Federico Fancellu, +3 more

- 04 Nov 2019

TL;DR: The authors proposed a graph-aware sequence model that generates well-formed graphs while sidestepping many difficulties in graph prediction, such as the difficulty of predicting linearized graphs in semantic parsing.

22

Journal Article•10.18653/v1/2022.mathnlp-1.2

Investigating Math Word Problems using Pretrained Multilingual Language Models

Minghuan Tan, +3 more

- 01 Jan 2022

TL;DR: Math word problem (MWP) solvers may not transfer to a different language, but they generalize better if the problem types exist in both the source and target languages.

22

Journal Article•10.1016/j.inffus.2023.101988

A survey on semantic processing techniques

Rui Mao, +6 more

- 01 Jan 2024

- Information Fusion

TL;DR: This survey of semantic processing techniques covers word sense disambiguation, anaphora resolution, named entity recognition, concept extraction, and subjectivity detection. It reviews theoretical research, advanced methods, and downstream applications, and discusses technical and application trends as well as future directions.

22

References

•Proceedings Article•10.1109/CVPR.2016.90

Deep Residual Learning for Image Recognition

Kaiming He, +3 more

- 27 Jun 2016

TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously; the resulting networks won 1st place on the ILSVRC 2015 classification task.

198.7K

Journal Article•10.1162/NECO.1997.9.8.1735

Long short-term memory

Sepp Hochreiter, +1 more

- 01 Nov 1997

- Neural Computation

TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.

99K

•Proceedings Article•10.3115/V1/D14-1179

Learning Phrase Representations using RNN Encoder--Decoder for Statistical Machine Translation

Kyunghyun Cho, +8 more

- 01 Jan 2014

TL;DR: In this paper, the encoder and decoder of the RNN Encoder-Decoder model are jointly trained to maximize the conditional probability of a target sequence given a source sequence.

28.6K

•Proceedings Article

Neural Machine Translation by Jointly Learning to Align and Translate

Dzmitry Bahdanau, +2 more

- 01 Jan 2015

TL;DR: It is conjectured that the use of a fixed-length vector is a bottleneck in improving the performance of the basic encoder-decoder architecture, and it is proposed to extend the architecture by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.

25.7K

•Proceedings Article

Sequence to Sequence Learning with Neural Networks

Ilya Sutskever, +2 more

- 08 Dec 2014

TL;DR: The authors used a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector.

20.1K

Related Papers (5)

Neural Machine Translation by Jointly Learning to Align and Translate

Bleu: a Method for Automatic Evaluation of Machine Translation

Sequence to Sequence Learning with Neural Networks

Attention is All you Need

Adam: A Method for Stochastic Optimization