Beyond English-Centric Multilingual Machine Translation

Open Access•Posted Content

- 21 Oct 2020

846

TL;DR: This work creates a true Many-to-Many multilingual translation model that can translate directly between any pair of 100 languages and explores how to effectively increase model capacity through a combination of dense scaling and language-specific sparse parameters to create high quality models.

Citations

Journal Article•10.48550/arXiv.2208.01448

AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model

Saleh Soltan, +15 more

- 02 Aug 2022

- arXiv.org

TL;DR: It is demonstrated that multilingual large-scale sequence-to-sequence (seq2seq) models, pre-trained on a mixture of denoising and Causal Language Modeling (CLM) tasks, are more efficient few-shot learners than decoder-only models on various tasks.

Journal Article•10.48550/arXiv.2305.17406

Enhancing Translation for Indigenous Languages: Experiments with Multilingual Models

A. Tonja, +5 more

- 27 May 2023

TL;DR: This article used two multilingual models, namely M2M-100 and mBART50, and one bilingual (one-to-one) model, and experimented with different transfer learning setups.

Proceedings Article•10.1109/aikiie60097.2023.10390173

Design of Machine Automatic Translation System Based on Artificial Intelligence

Jiao Huang, +1 more

- 02 Nov 2023

TL;DR: The experimental results show that when the semantic sample set size is 10Bit, the accuracy of the translation system is 65%; accuracy increases with the sample set size and is always higher than that of other systems, and the overall performance of the translation system in this article is also superior to other systems.

Proceedings Article•10.1109/bibm58861.2023.10385480

Enhancing Depression Detection from Narrative Interviews Using Language Models

Palak Sood, +2 more

- 05 Dec 2023

TL;DR: A larger dataset, namely I-DAIC, is created for depression detection by integrating three existing datasets in the literature and the effectiveness, advantages, and significant potential of pre-trained language models for depression detection with narrative interviews are demonstrated.

Journal Article•10.18653/v1/2023.wmt-1.49

Towards Better Evaluation for Formality-Controlled English-Japanese Machine Translation

Edison Marrese-Taylor, +2 more

TL;DR: A Transformer-based classification model for Japanese is proposed, which obtains state-of-the-art results in benchmark datasets and provides empirical evidence suggesting that prompting LLMs is a viable approach to control the formality level of En->Ja MT.

References

•Proceedings Article•10.1109/CVPR.2016.90

Deep Residual Learning for Image Recognition

Kaiming He, +3 more

- 27 Jun 2016

TL;DR: This article proposes a residual learning framework to ease the training of networks that are substantially deeper than those used previously; the resulting networks won 1st place on the ILSVRC 2015 classification task.

198.7K

•Proceedings Article

Adam: A Method for Stochastic Optimization

Diederik P. Kingma, +1 more

- 01 Jan 2015

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.

138.5K

•Posted Content

Deep Residual Learning for Image Recognition

Kaiming He, +3 more

- 10 Dec 2015

- arXiv: Computer Vision and Pattern Recog...

TL;DR: This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.

117.9K

Journal Article•10.1162/NECO.1997.9.8.1735

Long short-term memory

Sepp Hochreiter, +1 more

- 01 Nov 1997

- Neural Computation

TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.

99K

•Proceedings Article

Attention is All you Need

Ashish Vaswani, +7 more

- 12 Jun 2017

TL;DR: This paper proposed a simple network architecture based solely on an attention mechanism, dispensing with recurrence and convolutions entirely, and achieved state-of-the-art performance on English-to-French translation.

94.2K

Related Papers (5)

Attention is All you Need

Neural Machine Translation of Rare Words with Subword Units

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Neural Machine Translation by Jointly Learning to Align and Translate

fairseq: A Fast, Extensible Toolkit for Sequence Modeling