Data Augmentation for Low-Resource Neural Machine Translation

doi:10.18653/V1/P17-2090

Open Access • Proceedings Article • 10.18653/V1/P17-2090

Marzieh Fadaee, +2 more

- 01 May 2017

- arXiv: Computation and Language

181

TL;DR: This article proposes a data augmentation approach that targets low-frequency words by generating new sentence pairs containing rare words in new, synthetically created contexts, which improves translation quality by up to 2.9 BLEU points over the baseline and up to 3.2 BLEU points over back-translation.

Citations

•Proceedings Article•10.18653/V1/D19-1670

EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks

Jason Wei, +1 more

- 05 Mar 2019

TL;DR: This paper proposes EDA, a set of easy data augmentation techniques for boosting performance on text classification tasks, consisting of synonym replacement, random insertion, random swap, and random deletion, and shows that EDA improves performance for both convolutional and recurrent neural networks.

1.6K

•Posted Content

Beyond English-Centric Multilingual Machine Translation

Angela Fan, +16 more

- 21 Oct 2020

- arXiv: Computation and Language

TL;DR: This work creates a true Many-to-Many multilingual translation model that can translate directly between any pair of 100 languages and explores how to effectively increase model capacity through a combination of dense scaling and language-specific sparse parameters to create high quality models.

849

•Posted Content

Data Augmentation by Pairing Samples for Images Classification

Hiroshi Inoue

- 15 Feb 2018

- arXiv: Learning

TL;DR: This paper introduces a simple but surprisingly effective data augmentation technique for image classification tasks, named SamplePairing, which significantly improved classification accuracy for all the tested datasets and is more valuable for tasks with a limited amount of training data, such as medical imaging tasks.

486

•Posted Content

Data Augmentation for Graph Neural Networks

Tong Zhao, +5 more

- 11 Jun 2020

- arXiv: Learning

TL;DR: This work shows that neural edge predictors can effectively encode class-homophilic structure to promote intra-class edges and demote inter-class edges in given graph structure, and introduces the GAug graph data augmentation framework, which leverages these insights to improve performance in GNN-based node classification via edge prediction.

421

•Book

Synthetic Data for Deep Learning

Sergey I. Nikolenko

- 26 Jun 2021

TL;DR: The synthetic-to-real domain adaptation problem that inevitably arises in applications of synthetic data is discussed, including synthetic-to-real refinement with GAN-based models and domain adaptation at the feature/model level without explicit data transformations.

391

References

•Journal Article•10.1162/NECO.1997.9.8.1735

Long short-term memory

Sepp Hochreiter, +1 more

- 01 Nov 1997

- Neural Computation

TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.

99K

•Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, +2 more

- 03 Dec 2012

TL;DR: This paper trains a large, deep convolutional neural network that consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.

88.4K

•Proceedings Article•10.3115/1073083.1073135

Bleu: a Method for Automatic Evaluation of Machine Translation

Kishore Papineni, +3 more

- 06 Jul 2002

TL;DR: This paper proposed a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.

28.9K

•Proceedings Article

Neural Machine Translation by Jointly Learning to Align and Translate

Dzmitry Bahdanau, +2 more

- 01 Jan 2015

TL;DR: It is conjectured that the use of a fixed-length vector is a bottleneck in improving the performance of the basic encoder-decoder architecture, and it is proposed to extend this architecture by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.

25.7K

•Posted Content

Neural Machine Translation by Jointly Learning to Align and Translate

Dzmitry Bahdanau, +2 more

- 01 Sep 2014

- arXiv: Computation and Language

TL;DR: In this paper, the authors propose to use a soft-searching model to find the parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.

20.9K

Related Papers (5)

Improving Neural Machine Translation Models with Monolingual Data

Bleu: a Method for Automatic Evaluation of Machine Translation

Attention Is All You Need

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Neural Machine Translation by Jointly Learning to Align and Translate