Scispace (Formerly Typeset)
Showing papers presented at "Workshop on Statistical Machine Translation in 2015"
Proceedings Article•10.18653/V1/W15-3049•
chrF: character n-gram F-score for automatic MT evaluation

[...]

Maja Popović1•
Humboldt University of Berlin1
1 Sep 2015
TL;DR: The proposed use of character n-gram F-score for automatic evaluation of machine translation output shows very promising results, especially for the CHRF3 score – for translation from English, this variant showed the highest segment-level correlations outperforming even the best metrics on the WMT14 shared evaluation task.
Abstract: We propose the use of character n-gram F-score for automatic evaluation of machine translation output. Character n-grams have already been used as a part of more complex metrics, but their individual potential has not been investigated yet. We report system-level correlations with human rankings for 6-gram F1-score (CHRF) on the WMT12, WMT13 and WMT14 data as well as segment-level correlation for 6-gram F1 (CHRF) and F3-scores (CHRF3) on WMT14 data for all available target languages. The results are very promising, especially for the CHRF3 score – for translation from English, this variant showed the highest segment-level correlations, outperforming even the best metrics on the WMT14 shared evaluation task.
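A character n-gram F-score of the kind described in this abstract can be sketched as follows. This is a minimal illustration, not the author's implementation: the function names `char_ngrams` and `chrf` are hypothetical, and stripping spaces and averaging precision and recall uniformly over n = 1..6 are assumptions made for the sketch. Setting `beta=3.0` weights recall more heavily than precision, which corresponds to an F3-style score.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count character n-grams of length n (spaces removed: an assumption)."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def chrf(hypothesis, reference, max_n=6, beta=1.0):
    """F-beta score over character n-gram precision and recall, averaged over n."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # one of the strings is shorter than n characters
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * p * r / (b2 * p + r)
```

With a hypothesis that is shorter than the reference, precision exceeds recall, so the recall-weighted score (`beta=3.0`) comes out lower than the balanced one (`beta=1.0`).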

1,392 citations

Proceedings Article•10.18653/V1/W15-3001•
Findings of the 2015 Workshop on Statistical Machine Translation

[...]

Ondřej Bojar1, Rajen Chatterjee2, Christian Federmann2, Barry Haddow, Matthias Huck, Chris Hokamp3, Philipp Koehn, Varvara Logacheva3, Christof Monz4, Matteo Negri5, Matt Post6, Carolina Scarton3, Lucia Specia3, Marco Turchi5 •
Charles University in Prague1, University of Edinburgh2, University of Sheffield3, University of Amsterdam4, Fondazione Bruno Kessler5, Johns Hopkins University6
1 Sep 2015
TL;DR: The WMT15 shared tasks included a standard news translation task, a metrics task, a tuning task, a task for run-time estimation of machine translation quality, and an automatic post-editing task.
Abstract: This paper presents the results of the WMT15 shared tasks, which included a standard news translation task, a metrics task, a tuning task, a task for run-time estimation of machine translation quality, and an automatic post-editing task. This year, 68 machine translation systems from 24 institutions were submitted to the ten translation directions in the standard translation task. An additional 7 anonymized systems were included, and were then evaluated both automatically and manually. The quality estimation task had three subtasks, with a total of 10 teams, submitting 34 entries. The pilot automatic post-editing task had a total of 4 teams, submitting 7 entries.

379 citations

Proceedings Article•10.18653/V1/W15-3031•
Results of the WMT15 Metrics Shared Task

[...]

Miloš Stanojević, Amir Kamran1, Philipp Koehn, Ondřej Bojar2•
University of Amsterdam1, Charles University in Prague2
1 Sep 2015
TL;DR: This paper presents the results of the WMT15 Metrics Shared Task, which asked participants to score the outputs of the MT systems involved in the WMT15 Shared Translation Task; the collected scores were evaluated in terms of system-level and segment-level correlation.
Abstract: This paper presents the results of the WMT15 Metrics Shared Task. We asked participants of this task to score the outputs of the MT systems involved in the WMT15 Shared Translation Task. We collected scores of 46 metrics from 11 research groups. In addition to that, we computed scores of 7 standard metrics (BLEU, SentBLEU, NIST, WER, PER, TER and CDER) as baselines. The collected scores were evaluated in terms of system level correlation (how well each metric’s scores correlate with WMT15 official manual ranking of systems) and in terms of segment level correlation (how often a metric agrees with humans in comparing two translations of a particular sentence).
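The segment-level criterion described above (how often a metric agrees with humans when comparing two translations of the same sentence) can be illustrated with a Kendall's tau-like agreement rate. This is only a sketch of one common formulation: the function name `tau_like` is hypothetical, the input layout is an assumption, and counting metric ties as disagreements is a choice made here, not something stated in the abstract.

```python
def tau_like(judged_pairs):
    """Agreement between a metric and human pairwise preferences.

    Each item is (metric_score_of_human_preferred, metric_score_of_other).
    Returns (concordant - discordant) / total, a value in [-1, 1].
    Metric ties are counted as discordant (an assumption of this sketch).
    """
    if not judged_pairs:
        return 0.0
    concordant = sum(1 for better, worse in judged_pairs if better > worse)
    discordant = len(judged_pairs) - concordant
    return (concordant - discordant) / len(judged_pairs)
```

A metric that always scores the human-preferred translation higher gets 1.0, and one that always prefers the other translation gets -1.0.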

162 citations

Proceedings Article•
Proceedings of the Tenth Workshop on Statistical Machine Translation

[...]

Ondřej Bojar1, Rajen Chatterjee, Christian Federmann2, Barry Haddow2, Chris Hokamp3, Matthias Huck2, Varvara Logacheva4, Pavel Pecina1 •
Charles University in Prague1, University of Edinburgh2, Dublin City University3, University of Sheffield4
1 Jan 2015

43 citations

Proceedings Article•10.18653/V1/W15-3025•
The FBK Participation in the WMT15 Automatic Post-editing Shared Task

[...]

Rajen Chatterjee1, Marco Turchi2, Matteo Negri2•
University of Edinburgh1, Fondazione Bruno Kessler2
1 Sep 2015
TL;DR: This paper describes the “FBK English-Spanish Automatic Post-editing (APE)” systems submitted to the APE shared task at WMT 2015 and introduces some novel task-specific dense features through which improvements over the default setup of these approaches are observed.
Abstract: In this paper, we describe the “FBK English-Spanish Automatic Post-editing (APE)” systems submitted to the APE shared task at WMT 2015. We explore the most widely used statistical APE technique (monolingual) and its most significant variant (context-aware). In this exploration, we introduce some novel task-specific dense features through which we observe improvements over the default setup of these approaches. We show these features are useful to prune the phrase table in order to remove unreliable rules and help the decoder to select useful translation options during decoding. Our primary APE system submitted at this shared task performs significantly better than the standard APE baseline.

35 citations

Proceedings Article•10.18653/V1/W15-3050•
BEER 1.1: ILLC UvA submission to metrics and tuning task

[...]

Miloš Stanojević1, Khalil Sima'an1•
University of Amsterdam1
1 Sep 2015
TL;DR: The main changes introduced this year are: extending the learning-to-rank trained sentence level metric to the corpus level, incorporating syntactic ingredients based on dependency trees, and a technique for finding parameters of BEER that avoid “gaming of the metric” during tuning.
Abstract: We describe the submissions of ILLC UvA to the metrics and tuning tasks on WMT15. Both submissions are based on the BEER evaluation metric originally presented on WMT14 (Stanojevic and Sima’an, 2014a). The main changes introduced this year are: (i) extending the learning-to-rank trained sentence level metric to the corpus level (but still decomposable to sentence level), (ii) incorporating syntactic ingredients based on dependency trees, and (iii) a technique for finding parameters of BEER that avoid “gaming of the metric” during tuning.

28 citations

Proceedings Article•10.18653/V1/W15-3041•
SHEF-NN: Translation Quality Estimation with Neural Networks

[...]

Kashif Shah1, Varvara Logacheva1, Gustavo Paetzold1, Frédéric Blain1, Daniel Beck1, Fethi Bougares2, Lucia Specia3 •
University of Sheffield1, University of Maine2, Dublin City University3
1 Sep 2015
TL;DR: The authors' systems outperform the baseline as well as many other submissions for Tasks 1 and 2 of the WMT15 Shared Task on Quality Estimation and the best performing system (SHEF-W2V) only uses features learned in an unsupervised fashion.
Abstract: We describe our systems for Tasks 1 and 2 of the WMT15 Shared Task on Quality Estimation. Our submissions use (i) a continuous space language model to extract additional features for Task 1 (SHEF-GP, SHEF-SVM), (ii) a continuous bag-of-words model to produce word embeddings as features for Task 2 (SHEF-W2V) and (iii) a combination of features produced by QuEst++ and a feature produced with word embedding models (SHEF-QuEst++). Our systems outperform the baseline as well as many other submissions. The results are especially encouraging for Task 2, where our best performing system (SHEF-W2V) only uses features learned in an unsupervised fashion.

26 citations

Proceedings Article•10.18653/V1/W15-3013•
The Edinburgh/JHU Phrase-based Machine Translation Systems for WMT 2015

[...]

Barry Haddow, Matthias Huck, Alexandra Birch, Nikolay Bogoychev1, Philipp Koehn •
University of Edinburgh1
1 Sep 2015
TL;DR: This paper sets up phrase-based statistical machine translation systems for all ten language pairs of this year’s evaluation campaign, which are English paired with Czech, Finnish, French, German, and Russian in both translation directions.
Abstract: This paper describes the submission of the University of Edinburgh and the Johns Hopkins University for the shared translation task of the EMNLP 2015 Tenth Workshop on Statistical Machine Translation (WMT 2015). We set up phrase-based statistical machine translation systems for all ten language pairs of this year’s evaluation campaign, which are English paired with Czech, Finnish, French, German, and Russian in both translation directions. Novel research directions we investigated include: neural network language models and bilingual neural network language models, a comprehensive use of word classes, and sparse lexicalized reordering features.

26 citations

Proceedings Article•10.18653/V1/W15-3059•
How do Humans Evaluate Machine Translation

[...]

Francisco Guzmán1, Ahmed Abdelali1, Irina Temnikova1, Hassan Sajjad1, Stephan Vogel1 •
Qatar Foundation1
1 Sep 2015
TL;DR: This paper takes a closer look at the MT evaluation process from a glass-box perspective using eye-tracking and suggests that to have consistent and cost effective MT evaluations, it is better to use monolinguals with only target language information.
Abstract: In this paper, we take a closer look at the MT evaluation process from a glass-box perspective using eye-tracking. We analyze two aspects of the evaluation task ‐ the background of evaluators (monolingual or bilingual) and the sources of information available, and we evaluate them using time and consistency as criteria. Our findings show that monolinguals are slower but more consistent than bilinguals, especially when only target language information is available. When exposed to various sources of information, evaluators in general take more time and in the case of monolinguals, there is a drop in consistency. Our findings suggest that to have consistent and cost effective MT evaluations, it is better to use monolinguals with only target language information.

20 citations

Proceedings Article•10.18653/V1/W15-3047•
Machine Translation Evaluation using Recurrent Neural Networks

[...]

Rohit Gupta1, Constantin Orasan1, Josef van Genabith2•
University of Wolverhampton1, German Research Centre for Artificial Intelligence2
1 Sep 2015
TL;DR: A metric based on dense vector spaces and Long Short Term Memory networks, which are types of Recurrent Neural Networks (RNNs), is submitted to the WMT-15 metrics task and is the best performing metric overall according to Spearman and Pearson (Pre-TrueSkill) and second best according to Pearson (TrueSkill) system level correlation.
Abstract: This paper presents our metric (UoWLSTM) submitted in the WMT-15 metrics task. Many state-of-the-art Machine Translation (MT) evaluation metrics are complex, involve extensive external resources (e.g. for paraphrasing) and require tuning to achieve the best results. We use a metric based on dense vector spaces and Long Short Term Memory (LSTM) networks, which are types of Recurrent Neural Networks (RNNs). For WMT15 our new metric is the best performing metric overall according to Spearman and Pearson (Pre-TrueSkill) and second best according to Pearson (TrueSkill) system level correlation.

20 citations

Proceedings Article•10.18653/V1/W15-3026•
USAAR-SAPE: An English-Spanish Statistical Automatic Post-Editing System

[...]

Santanu Pal1, Mihaela Vela1, Sudip Kumar Naskar2, Josef van Genabith3•
Saarland University1, Jadavpur University2, German Research Centre for Artificial Intelligence3
1 Sep 2015
TL;DR: The USAAR-SAPE English-Spanish Automatic Post-Editing (APE) system submitted to the APE Task organized in the Workshop on Statistical Machine Translation (WMT) in 2015 was able to improve upon the baseline MT system output by incorporating Phrase-Based Statistical MT (PBSMT) technique into the monolingual Statistical APE task (SAPE).
Abstract: We describe the USAAR-SAPE English-Spanish Automatic Post-Editing (APE) system submitted to the APE Task organized in the Workshop on Statistical Machine Translation (WMT) in 2015. Our system was able to improve upon the baseline MT system output by incorporating Phrase-Based Statistical MT (PBSMT) technique into the monolingual Statistical APE task (SAPE). The reported final submission crucially involves hybrid word alignment. The SAPE system takes raw Spanish Machine Translation (MT) output provided by the shared task organizers and produces post-edited Spanish text. The parallel data consist of English Text, raw machine translated Spanish output, and their corresponding manually post-edited versions. The major goal of the task is to reduce the post-editing effort by improving the quality of the MT output in terms of fluency and adequacy.

Proceedings Article•10.18653/V1/W15-3052•
LeBLEU: N-gram-based Translation Evaluation Score for Morphologically Complex Languages

[...]

Sami Virpioja1, Stig-Arne Grönroos1•
Helsinki University of Technology1
1 Sep 2015
TL;DR: The results on WMT data sets show that fuzzy n-gram matching improves correlations to human evaluation especially for highly compounding languages.
Abstract: This paper describes the LeBLEU evaluation score for machine translation, submitted to WMT15 Metrics Shared Task. LeBLEU extends the popular BLEU score to consider fuzzy matches between word n-grams. While there are several variants of BLEU that allow non-exact matches between words either by character-based distance measures or morphological preprocessing, none of them use fuzzy comparison between longer chunks of text. The results on WMT data sets show that fuzzy n-gram matching improves correlations to human evaluation especially for highly compounding languages.

Proceedings Article•10.18653/V1/W15-3051•
Predicting Machine Translation Adequacy with Document Embeddings

[...]

Mihaela Vela1, Liling Tan1•
Saarland University1
1 Sep 2015
TL;DR: The approach presented here is learning a Bayesian Ridge Regressor using document skip-gram embeddings in order to automatically evaluate Machine Translation (MT) output by predicting semantic adequacy scores.
Abstract: This paper describes USAAR’s submission to the metrics shared task of the Workshop on Statistical Machine Translation (WMT) in 2015. The goal of our submission is to take advantage of the semantic overlap between hypothesis and reference translation for predicting MT output adequacy using language independent document embeddings. The approach presented here is learning a Bayesian Ridge Regressor using document skip-gram embeddings in order to automatically evaluate Machine Translation (MT) output by predicting semantic adequacy scores. The evaluation of our submission ‐ measured by the correlation with human judgements ‐ shows promising results on system-level scores.

Proceedings Article•10.18653/V1/W15-3036•
UAlacant word-level machine translation quality estimation system at WMT 2015

[...]

Miquel Esplà-Gomis1, Felipe Sánchez-Martínez1, Mikel L. Forcada1•
University of Alicante1
1 Sep 2015
TL;DR: The Universitat d’Alacant submissions for the machine translation quality estimation (MTQE) shared task in WMT 2015 are described; the group participated in the word-level MTQE sub-task.
Abstract: This paper describes the Universitat d’Alacant submissions (labelled as UAlacant) for the machine translation quality estimation (MTQE) shared task in WMT 2015, where we participated in the word-level MTQE sub-task. The method we used to produce our submissions uses external sources of bilingual information as a black box to spot sub-segment correspondences between a source segment S and the translation hypothesis T produced by a machine translation system. This is done by segmenting both S and T into overlapping sub-segments of variable length and translating them in both translation directions, using the available sources of bilingual information on the fly. For our submissions, two sources of bilingual information were used: machine translation (Apertium and Google Translate) and the bilingual concordancer Reverso Context. After obtaining the sub-segment correspondences, a collection of features is extracted from them, which are then used by a binary classifier to obtain the final “GOOD” or “BAD” word-level quality labels. We prepared two submissions for this year’s edition of WMT 2015: one using the features produced by our system, and one combining them with the baseline features published by the organisers of the task, which were ranked third and first for the sub-task, respectively.

Proceedings Article•10.18653/V1/W15-3005•
ParFDA for Fast Deployment of Accurate Statistical Machine Translation Systems, Benchmarks, and Statistics

[...]

Ergun Bicici1, Qun Liu1, Andy Way1•
Dublin City University1
17 Sep 2015
TL;DR: ParFDA, a parallel implementation of feature decay algorithms (FDA) developed for fast deployment of SMT systems, obtains results close to the top, with an average difference of 3.176 BLEU points, using significantly fewer resources for building SMT systems.
Abstract: We build parallel FDA5 (ParFDA) Moses statistical machine translation (SMT) systems for all language pairs in the workshop on statistical machine translation (Bojar et al., 2015) (WMT15) translation task and obtain results close to the top with an average of 3.176 BLEU points difference using significantly less resources for building SMT systems. ParFDA is a parallel implementation of feature decay algorithms (FDA) developed for fast deploy

Proceedings Article•10.18653/V1/W15-3017•
UdS-Sant: English--German Hybrid Machine Translation System

[...]

Santanu Pal1, Sudip Kumar Naskar2, Josef van Genabith3•
Saarland University1, Jadavpur University2, German Research Centre for Artificial Intelligence3
1 Sep 2015
TL;DR: This paper describes the UdS-Sant English‐German Hybrid Machine Translation system submitted to the Translation Task organized in the Workshop on Statistical Machine Translation (WMT) 2015 and brings improvements over the baseline system by incorporating additional knowledge such as extracted bilingual named entities and bilingual phrase pairs induced from example-based methods.
Abstract: This paper describes the UdS-Sant English‐German Hybrid Machine Translation (MT) system submitted to the Translation Task organized in the Workshop on Statistical Machine Translation (WMT) 2015. Our proposed hybrid system brings improvements over the baseline system by incorporating additional knowledge such as extracted bilingual named entities and bilingual phrase pairs induced from example-based methods. The reported final submission is the result of a hybrid system obtained from confusion network based system combination that combines the best performance of each individual system in a multi-engine pipeline.

Proceedings Article•10.18653/V1/W15-3043•
UGENT-LT3 SCATE System for Machine Translation Quality Estimation

[...]

Arda Tezcan1, Veronique Hoste1, Bart Desmet1, Lieve Macken1•
Ghent University1
1 Sep 2015
TL;DR: This paper describes the submission of the UGENT-LT3 SCATE system to the WMT15 Shared Task on Quality Estimation (QE), viz. English-Spanish word and sentence-level QE.
Abstract: This paper describes the submission of the UGENT-LT3 SCATE system to the WMT15 Shared Task on Quality Estimation (QE), viz. English-Spanish word and sentence-level QE. We conceived QE as a supervised Machine Learning (ML) problem and designed additional features and combined these with the baseline feature set to estimate quality. The sentence-level QE system re-uses the word level predictions of the word-level QE system. We experimented with different learning methods and observe improvements over the baseline system for word-level QE with the use of the new features and by combining learning methods into ensembles. For sentence-level QE we show that using a single feature based on word-level predictions can perform better than the baseline system and using this in combination with additional features led to further improvements in performance.

Proceedings Article•10.18653/V1/W15-3022•
Abu-MaTran at WMT 2015 Translation Task: Morphological Segmentation and Web Crawling

[...]

Raphael Rubino1, Tommi A. Pirinen1, Miquel Esplà-Gomis2, Nikola Ljubešić3, Sergio Ortiz Rojas, Vassilis Papavassiliou1, Prokopis Prokopidis, Antonio Toral1 •
Dublin City University1, University of Alicante2, University of Zagreb3
1 Sep 2015
TL;DR: This paper presents the machine translation systems submitted by the Abu-MaTran project for the Finnish‐English language pair at the WMT 2015 translation task, which are the top performing English-to-Finnish unconstrained (all automatic metrics) and constrained (BLEU), and Finnish-to-English constrained (TER) systems.
Abstract: This paper presents the machine translation systems submitted by the Abu-MaTran project for the Finnish‐English language pair at the WMT 2015 translation task. We tackle the lack of resources and complex morphology of the Finnish language by (i) crawling parallel and monolingual data from the Web and (ii) applying rule-based and unsupervised methods for morphological segmentation. Several statistical machine translation approaches are evaluated and then combined to obtain our final submissions, which are the top performing English-to-Finnish unconstrained (all automatic metrics) and constrained (BLEU), and Finnish-to-English constrained (TER) systems.

Proceedings Article•10.18653/V1/W15-3003•
Data Selection With Fewer Words

[...]

Amittai Axelrod1, Philip Resnik1, Xiaodong He2, Mari Ostendorf3•
University of Maryland, College Park1, Microsoft2, University of Washington3
1 Sep 2015
TL;DR: This work presents a method that improves data selection by combining a hybrid word/part-of-speech representation for corpora, with the idea of distinguishing between rare and frequent events.
Abstract: We present a method that improves data selection by combining a hybrid word/part-of-speech representation for corpora, with the idea of distinguishing between rare and frequent events. We validate our approach using data selection for machine translation, and show that it maintains or improves BLEU and TER translation scores while substantially improving vocabulary coverage and reducing data selection model size. Paradoxically, the coverage improvement is achieved by abstracting away over 97% of the total training corpus vocabulary using simple part-of-speech tags during the data selection process.

Proceedings Article•10.18653/V1/W15-3035•
Referential Translation Machines for Predicting Translation Quality and Related Statistics

[...]

Ergun Bicici1, Qun Liu1, Andy Way1•
Dublin City University1
17 Sep 2015
TL;DR: It is shown that referential translation machines pioneer a language independent approach to all similarity tasks and remove the need to access any task or domain specific information or resource.
Abstract: We use referential translation machines (RTMs) for predicting translation performance. RTMs pioneer a language independent approach to all similarity tasks and remove the need to access any task or domain specific information or resource. We improve our RTM models with the

Proceedings Article•10.18653/V1/W15-3027•
Why Predicting Post-Edition is so Hard? Failure Analysis of LIMSI Submission to the APE Shared Task

[...]

Guillaume Wisniewski1, Nicolas Pécheux2, François Yvon3•
University of Paris-Sud1, Université Paris-Saclay2, Centre national de la recherche scientifique3
1 Sep 2015
TL;DR: It is shown, by carefully analyzing the failure of the two systems submitted by LIMSI to the WMT’15 Shared Task on Automatic Post-Editing, that this counterperformance mainly results from the inconsistency in the annotations.
Abstract: This paper describes the two systems submitted by LIMSI to the WMT’15 Shared Task on Automatic Post-Editing. The first one relies on a reformulation of the APE task as a Machine Translation task; the second implements a simple rule-based approach. Neither of these two systems manages to improve the automatic translation. We show, by carefully analyzing the failure of our systems, that this counterperformance mainly results from the inconsistency in the annotations.

Proceedings Article•10.18653/V1/W15-3018•
The RWTH Aachen German-English Machine Translation System for WMT 2015

[...]

Jan-Thorsten Peter1, Farzad Toutounchi, Joern Wuebker1, Hermann Ney1•
RWTH Aachen University1
1 Sep 2015
TL;DR: This paper describes the statistical machine translation system developed at RWTH Aachen University for the German→English translation task of the EMNLP 2015 Tenth Workshop on Statistical Machine Translation (WMT 2015).
Abstract: This paper describes the statistical machine translation system developed at RWTH Aachen University for the German→English translation task of the EMNLP 2015 Tenth Workshop on Statistical Machine Translation (WMT 2015). A phrase-based machine translation system was applied and augmented with hierarchical phrase reordering and word class language models. Further, we ran discriminative maximum expected BLEU training for our system. In addition, we utilized multiple feed-forward neural network language and translation models and a recurrent neural network language model for reranking.

Proceedings Article•10.18653/V1/W15-3016•
LIMSI@WMT'15: Translation Task

[...]

Benjamin Marie, Alexandre Allauzen1, Franck Burlot, Quoc-Khanh Do, Julia Ive1, Elena Knyazeva, Matthieu Labeau, Thomas Lavergne2, Kevin Löser, Nicolas Pécheux, François Yvon •
Université Paris-Saclay1, Franche Comté Électronique Mécanique Thermique et Optique Sciences et Technologies2
1 Sep 2015
TL;DR: LIMSI’s submissions to the shared WMT’15 translation task are described, including a tailored normalization of Russian to translate into English, and a two-step process to translate first into simplified Russian, followed by a conversion into inflected Russian.
Abstract: This paper describes LIMSI’s submissions to the shared WMT’15 translation task. We report results for French-English, Russian-English in both directions, as well as for Finnish-into-English. Our submissions use NCODE and MOSES along with continuous space translation models in a post-processing step. The main novelties of this year’s participation are the following: for Russian-English, we investigate a tailored normalization of Russian to translate into English, and a two-step process to translate first into simplified Russian, followed by a conversion into inflected Russian. For French-English, the challenge is domain adaptation, for which only monolingual corpora are available. Finally, for the Finnish-to-English task, we explore unsupervised morphological segmentation to reduce the sparsity of data induced by the rich morphology on the Finnish side.

Proceedings Article•10.18653/V1/W15-3038•
LORIA System for the WMT15 Quality Estimation Shared Task

[...]

David Langlois
17 Sep 2015
TL;DR: This paper proposes to increase the size of the training corpus by using the post-edited and reference corpora during the training step and performs a linear regression of the feature space against scores in the range [0..1].
Abstract: We describe our system for the WMT2015 Shared Task on Quality Estimation, task 1, sentence-level prediction of post-edition effort. We use baseline features, Latent Semantic Indexing based features and features based on pseudo-references. An SVM algorithm is used to estimate the linear regression between the feature vectors and the HTER score. We use a selection algorithm in order to put aside needless features. Our best system leads to a performance in terms of Mean Absolute Error equal to 13.34 on the official test set, while the official baseline system leads to a performance equal to 14.82.

Proceedings Article•10.18653/V1/W15-3048•
Alignment-based sense selection in METEOR and the RATATOUILLE recipe

[...]

Benjamin Marie, Marianna Apidianaki
1 Sep 2015
TL;DR: It is shown that context-sensitive synonym selection increases the correlation of the Meteor metric with human judgments of translation quality on the WMT14 data.
Abstract: This paper describes Meteor-WSD and RATATOUILLE, the LIMSI submissions to the WMT15 metrics shared task. Meteor-WSD extends synonym mapping to languages other than English based on alignments and gives credit to semantically adequate translations in context. We show that context-sensitive synonym selection increases the correlation of the Meteor metric with human judgments of translation quality on the WMT14 data. RATATOUILLE combines Meteor-WSD with nine other metrics for evaluation and outperforms the best metric (BEER) involved in its computation.

Proceedings Article•10.18653/V1/W15-3030•
ListNet-based MT Rescoring

[...]

Jan Niehues1, Quoc-Khanh Do, Alexandre Allauzen2, Alex Waibel1•
Karlsruhe Institute of Technology1, Université Paris-Saclay2
1 Sep 2015
TL;DR: This work presents a new technique to train the log-linear model based on the ListNet algorithm that scales to many features, considers the whole list and not single entries during learning and can also be applied to more complex models than a log-linear combination.
Abstract: The log-linear combination of different features is an important component of SMT systems. It allows for the easy integration of models into the system and is used during decoding as well as for n-best list rescoring. With the recent success of more complex models like neural network-based translation models, n-best list rescoring is again attracting more attention. In this work, we present a new technique to train the log-linear model based on the ListNet algorithm. This technique scales to many features, considers the whole list and not single entries during learning and can also be applied to more complex models than a log-linear combination. Using the new learning approach, we improve the translation quality of a large-scale system by 0.8 BLEU points during rescoring and generate translations which are up to 0.3 BLEU points better than other learning techniques such as MERT or MIRA.
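The idea of a list-wise objective that "considers the whole list and not single entries" can be sketched with the standard ListNet top-one cross-entropy loss applied to a log-linear model over an n-best list. This is a generic illustration and not the authors' training setup: the function names `softmax` and `listnet_loss` are hypothetical, and using a per-hypothesis quality score (for example a sentence-level metric) as the target is an assumption.

```python
import math


def softmax(scores):
    """Turn a list of scores into a probability distribution (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def listnet_loss(weights, nbest_features, target_scores):
    """Top-one ListNet loss for one n-best list.

    weights: feature weights of the log-linear model.
    nbest_features: one feature vector per hypothesis in the list.
    target_scores: one quality score per hypothesis (assumed given).
    """
    # Log-linear model score: dot product of weights and features.
    model_scores = [
        sum(w * f for w, f in zip(weights, feats)) for feats in nbest_features
    ]
    p_target = softmax(target_scores)
    p_model = softmax(model_scores)
    # Cross-entropy between the target and model distributions over the list.
    return -sum(t * math.log(q) for t, q in zip(p_target, p_model))
```

Weights that rank the hypotheses in the same order as the target scores give a lower loss than weights that reverse the order, so minimizing this loss over many lists pushes the model toward the target ranking.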
Proceedings Article•10.18653/V1/W15-3021•
Morphological Segmentation and OPUS for Finnish-English Machine Translation

[...]

Jörg Tiedemann1, Filip Ginter2, Jenna Kanerva2•
Uppsala University1, University of Turku2
1 Sep 2015
TL;DR: Baseline systems for Finnish-English and English-Finnish machine translation using standard phrase-based and factored models including morphological features are described, and the effectiveness of morphological pre-processing of Finnish is demonstrated.
Abstract: This paper describes baseline systems for Finnish-English and English-Finnish machine translation using standard phrase-based and factored models including morphological features. We experiment with compound splitting and morphological segmentation and study the effect of adding noisy out-of-domain data to the parallel and the monolingual training data. Our results stress the importance of training data and demonstrate the effectiveness of morphological pre-processing of Finnish.

Proceedings Article•10.18653/V1/W15-3032•
Results of the WMT15 Tuning Shared Task

[...]

Miloš Stanojević, Amir Kamran1, Ondřej Bojar2•
University of Amsterdam1, Charles University in Prague2
1 Sep 2015
TL;DR: This paper presents the results of the WMT15 Tuning Shared Task, which provided the participants of this task with a complete machine translation system and asked them to tune its internal parameters (feature weights).
Abstract: This paper presents the results of the WMT15 Tuning Shared Task. We provided the participants of this task with a complete machine translation system and asked them to tune its internal parameters (feature weights). The tuned systems were used to translate the test set and the outputs were manually ranked for translation quality. We received 4 submissions in the English-Czech and 6 in the Czech-English translation direction. In addition, we ran 3 baseline setups, tuning the parameters with standard optimizers for BLEU score.

Proceedings Article•10.18653/V1/W15-3004•
DFKI's experimental hybrid MT system for WMT 2015

[...]

Eleftherios Avramidis1, Maja Popović2, Aljoscha Burchardt1•
German Research Centre for Artificial Intelligence1, Humboldt University of Berlin2
1 Sep 2015
TL;DR: DFKI participated in the shared translation task of WMT 2015 with the German-English language pair in each translation direction using an experimental hybrid system based on three systems: a statistical Moses system, a commercial rule-based system, and a serial coupling of the two.
Abstract: DFKI participated in the shared translation task of WMT 2015 with the German-English language pair in each translation direction. The submissions were generated using an experimental hybrid system based on three systems: a statistical Moses system, a commercial rule-based system, and a serial coupling of the two where the output of the rule-based system is further translated by Moses trained on parallel text consisting of the rule-based output and the original target language. The outputs of three systems are combined using two methods: (a) an empirical selection mechanism based on grammatical features (primary submission) and (b) IBM1 models based on POS 4-grams (contrastive submission).

Proceedings Article•10.18653/V1/W15-3060•
Local System Voting Feature for Machine Translation System Combination

[...]

Markus Freitag1, Jan-Thorsten Peter1, Stephan Peitz1, Minwei Feng2, Hermann Ney1 •
RWTH Aachen University1, IBM2
1 Sep 2015
TL;DR: In this paper, the authors enhance the traditional confusion network system combination approach with an additional model trained by a neural network, which gives system combination the option to prefer other systems at different word positions even for the same sentence.
Abstract: In this paper, we enhance the traditional confusion network system combination approach with an additional model trained by a neural network. This work is motivated by the fact that the commonly used binary system voting models only assign each input system a global weight which is responsible for the global impact of each input system on all translations. This prevents individual systems with low system weights from having influence on the system combination output, although in some situations this could be helpful. Further, words which have only been seen by one or few systems rarely have a chance of being present in the combined output. We train a local system voting model by a neural network which is based on the words themselves and the combinatorial occurrences of the different system outputs. This gives system combination the option to prefer other systems at different word positions even for the same sentence.
