Multi-Task Minimum Error Rate Training for SMT
TL;DR: The authors' experiments show statistically significant gains over task-specific training from techniques that model commonalities through shared parameters. However, more fine-grained combinations of shared and task-specific parameters could not be brought to bear on models with a small number of dense features.
Abstract: We present experiments on multi-task learning for discriminative training in statistical machine translation (SMT), extending standard minimum-error-rate training (MERT) with techniques that take advantage of the similarity of related tasks. We apply our techniques to German-to-English translation of patents from 8 tasks defined by the International Patent Classification (IPC) system. Our experiments show statistically significant gains over task-specific training from techniques that model commonalities through shared parameters. However, more fine-grained combinations of shared parameters with task-specific ones could not be brought to bear on models with a small number of dense features. The software used in the experiments is released as an open-source tool.
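The abstract does not spell out how shared and task-specific parameters are combined. One well-known way to do so is feature augmentation, where every dense feature is copied into a shared block and into a block reserved for the current task. The sketch below illustrates that general idea only; it is not the authors' implementation, and the names `augment` and `score` as well as the example weights are hypothetical.

```python
from typing import List


def augment(features: List[float], task: int, num_tasks: int) -> List[float]:
    """Copy a dense feature vector into a shared block plus one block per task.

    Layout: [shared | task 0 | task 1 | ... | task num_tasks-1].
    Only the shared block and the block of `task` are non-zero.
    (Assumed layout for illustration, not taken from the paper.)
    """
    if not 0 <= task < num_tasks:
        raise ValueError("task index out of range")
    dim = len(features)
    augmented = [0.0] * (dim * (num_tasks + 1))
    augmented[:dim] = features                 # shared copy
    start = dim * (task + 1)
    augmented[start:start + dim] = features    # task-specific copy
    return augmented


def score(weights: List[float], features: List[float],
          task: int, num_tasks: int) -> float:
    """Linear model score of one translation hypothesis under the augmented weights."""
    augmented = augment(features, task, num_tasks)
    if len(weights) != len(augmented):
        raise ValueError("weight vector does not match augmented feature size")
    return sum(w * f for w, f in zip(weights, augmented))


# Hypothetical example: 2 dense features, 3 tasks.
weights = [1.0, 0.5,    # shared block
           0.0, 0.0,    # task 0 block
           0.5, 0.25,   # task 1 block
           0.0, 0.0]    # task 2 block
hypothesis = [2.0, 4.0]
score_task0 = score(weights, hypothesis, task=0, num_tasks=3)
score_task1 = score(weights, hypothesis, task=1, num_tasks=3)
```

With 8 tasks, this layout multiplies the number of weights by nine, which is a large search space for a tuning method designed around a handful of dense features.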