Transliteration as Constrained Optimization
Dan Goldwasser, Dan Roth
- 25 Oct 2008
- pp 353-362
TL;DR: The transliteration problem is formulated as a constrained optimization problem, which lets the model take into account contextual dependencies and constraints among character bi-grams in the two strings.
Abstract: This paper introduces a new method for identifying named-entity (NE) transliterations in bilingual corpora. Recent work has shown the advantage of discriminative approaches to transliteration: given two strings (w_s, w_t) in the source and target languages, a classifier is trained to determine whether w_t is the transliteration of w_s. This paper shows that the transliteration problem can be formulated as a constrained optimization problem and can thus take into account contextual dependencies and constraints among character bi-grams in the two strings. We further explore several methods for learning the objective function of the optimization problem and show the advantage of learning it discriminatively. Our experiments show that the new framework results in over 50% improvement in translating English NEs to Hebrew.
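To make the idea of scoring a candidate pair by constrained optimization concrete, here is a minimal toy sketch. It is not the paper's formulation: it links single characters rather than character bi-grams, it uses hand-picked weights instead of a learned objective, and it swaps in a small dynamic program for a general optimization solver. The function name best_alignment_score, the weights table, and the uppercase stand-ins for target-script characters are all hypothetical. The objective is the total weight of the selected character links, under two constraints: each character takes part in at most one link, and links may not cross.

```python
from functools import lru_cache


def best_alignment_score(ws, wt, weights):
    """Return the maximum total weight of character links between ws and wt.

    Constraints (illustrative assumptions, not taken from the paper):
      * each character participates in at most one link;
      * links are monotone, i.e. they never cross.
    """

    @lru_cache(maxsize=None)
    def solve(i, j):
        # No characters left on one side: nothing more can be linked.
        if i == len(ws) or j == len(wt):
            return 0.0
        # Option 1: leave ws[i] unlinked. Option 2: leave wt[j] unlinked.
        best = max(solve(i + 1, j), solve(i, j + 1))
        # Option 3: link ws[i] with wt[j], if the pair has a weight.
        link = weights.get((ws[i], wt[j]))
        if link is not None:
            best = max(best, link + solve(i + 1, j + 1))
        return best

    return solve(0, 0)


# Hypothetical weights; uppercase letters stand in for target-script characters.
weights = {("d", "D"): 1.0, ("a", "A"): 0.5, ("n", "N"): 1.0}

in_order = best_alignment_score("dan", "DN", weights)   # links d-D and n-N
reordered = best_alignment_score("dan", "ND", weights)  # links would cross
```

With these toy weights the in-order candidate scores higher than the reordered one, because the non-crossing constraint allows only one of the two links in the second case. A classifier in the spirit of the abstract could then threshold such a score to decide whether w_t is a transliteration of w_s.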
Citations
Report of NEWS 2009 Machine Transliteration Shared Task
Haizhou Li, Alaganandam Kumaran, Vladimir Pervouchine, Min Zhang
- 07 Aug 2009
TL;DR: This report documents the details of the Machine Transliteration Shared Task conducted as part of the Named Entities Workshop (NEWS), an ACL-IJCNLP 2009 workshop, and concludes that the shared task achieved its objectives.
•Proceedings Article
Transliteration Generation and Mining with Limited Training Resources
Sittichai Jiampojamarn, Kenneth Dwyer, Shane Bergsma, Aditya Bhargava, Qing Dou, Mi-Young Kim, Grzegorz Kondrak
- 16 Jul 2010
TL;DR: DirecTL+ is presented: an online discriminative sequence prediction model based on many-to-many alignments and augmented with joint n-gram features. It improves on the results achieved by DirecTL in 2009.
Learning Phoneme Mappings for Transliteration without Parallel Data
Sujith Ravi, Kevin Knight
- 31 May 2009
TL;DR: A method for performing machine transliteration without any parallel resources is presented and it is shown that it is possible to learn cross-language phoneme mapping tables using only monolingual resources.
•Proceedings Article
Report of NEWS 2010 Transliteration Generation Shared Task
Haizhou Li, Alaganandam Kumaran, Min Zhang, Vladimir Pervouchine
- 16 Jul 2010
TL;DR: The Transliteration Generation Shared Task, conducted as part of the Named Entities Workshop (NEWS 2010), an ACL 2010 workshop, achieved its objective of providing a common benchmarking platform on which the research community can evaluate state-of-the-art technologies, to the benefit of future research and development.
•Proceedings Article
Improving the Multilingual User Experience of Wikipedia Using Cross-Language Name Search
Raghavendra Udupa, Mitesh M. Khapra
- 02 Jun 2010
TL;DR: A novel cross-language name search algorithm is proposed and employed for searching English Wikipedia articles in a diverse set of languages including Hebrew, Hindi, Russian, Kannada, Bangla and Tamil; the results show that this approach significantly improves the multilingual experience of users.
References
•Proceedings Article
On Discriminative vs. Generative Classifiers: A comparison of logistic regression and naive Bayes
Andrew Y. Ng, Michael I. Jordan
- 03 Jan 2001
TL;DR: It is shown, contrary to a widely-held belief that discriminative classifiers are almost always to be preferred, that there can often be two distinct regimes of performance as the training set size is increased, one in which each algorithm does better.
•Proceedings Article
A Linear Programming Formulation for Global Inference in Natural Language Tasks
Dan Roth, Wen-tau Yih
- 01 Jan 2004
TL;DR: This work develops a linear programming formulation for global inference and evaluates it in the context of simultaneously learning named entities and relations. The formulation efficiently incorporates domain- and task-specific constraints at decision time, resulting in significant improvements in the accuracy and the "human-like" quality of the inferences.
•Posted Content
A Winnow-Based Approach to Context-Sensitive Spelling Correction
Andrew R. Golding, Dan Roth
TL;DR: The authors present an algorithm combining variants of Winnow and weighted-majority voting, and apply it to context-sensitive spelling correction: the task of fixing spelling errors that happen to result in valid words, such as substituting "to" for "too" or "casual" for "causal".
•Proceedings Article
Learning to resolve natural language ambiguities: a unified approach
Dan Roth
- 01 Jul 1998
TL;DR: In this paper, a sparse network of linear separators is proposed for natural language disambiguation, which is based on the Winnow learning algorithm and is shown to perform well in a variety of ambiguity resolution problems.
•Proceedings Article
Name Translation in Statistical Machine Translation - Learning When to Transliterate
Ulf Hermjakob, Kevin Knight, Hal Daumé
- 01 Jun 2008
TL;DR: A method is presented for transliterating names within end-to-end statistical machine translation for Arabic-to-English MT; it achieves better name translation accuracy than 3 out of 4 professional translators.