TL;DR: The empirical results suggest that the highest levels of performance can be obtained through relatively simple means: heuristic learning of phrase translations from word-based alignments and lexical weighting of phrase translations.


Abstract: We propose a new phrase-based translation model and decoding algorithm that enables us to evaluate and compare several, previously proposed phrase-based translation models. Within our framework, we carry out a large number of experiments to understand better and explain why phrase-based models out-perform word-based models. Our empirical results, which hold for all examined language pairs, suggest that the highest levels of performance can be obtained through relatively simple means: heuristic learning of phrase translations from word-based alignments and lexical weighting of phrase translations. Surprisingly, learning phrases longer than three words and learning phrases from high-accuracy word-level alignment models does not have a strong impact on performance. Learning only syntactically motivated phrases degrades the performance of our systems.


4,102 citations

Book•

The semantics of definite and indefinite noun phrases

[...]

Irene Heim

1 Jan 1982

2,566 citations

Noun Phrase Accessibility and Universal Grammar

[...]

Edward L. Keenan, Bernard Comrie

1 Jan 2008

1,789 citations

Proceedings Article•10.3115/974235.974260•

A Stochastic Parts Program and Noun Phrase Parser for Unrestricted Text

[...]

Kenneth Church¹•Institutions (1)

Bell Labs¹

9 Feb 1988

TL;DR: The authors use a linear-time dynamic programming algorithm to find an assignment of parts of speech to words that optimizes the product of (a) lexical probabilities (probability of observing part of speech i given word i) and (b) contextual probabilities (probability of observing part of speech i given n following parts of speech).


Abstract: A program that tags each word in an input sentence with the most likely part of speech has been written. The program uses a linear-time dynamic programming algorithm to find an assignment of parts of speech to words that optimizes the product of (a) lexical probabilities (probability of observing part of speech i given word i) and (b) contextual probabilities (probability of observing part of speech i given n following parts of speech). Program performance is encouraging; a 400-word sample is presented and is judged to be 99.5% correct.
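The optimization described in this abstract can be illustrated with a minimal sketch. It is not the program from the paper: it assumes n = 1 (each tag is conditioned on the single following tag), and the function name `best_tag_sequence`, the `END` marker, and every probability in the toy tables are hypothetical values chosen for illustration only. The sentence is scanned right to left, and for each tag only the best-scoring continuation is kept, so the work grows linearly with sentence length for a fixed tag set.

```python
import math

# Toy tables; all numbers are invented for illustration, not taken from the paper.
# LEXICAL[word][tag] = P(tag | word)
LEXICAL = {
    "I": {"PPSS": 0.99, "NP": 0.01},
    "see": {"VB": 0.9, "UH": 0.1},
    "a": {"AT": 0.99, "IN": 0.01},
    "bird": {"NN": 1.0},
}
# CONTEXTUAL[(tag, next_tag)] = P(tag | following tag), i.e. the n = 1 case.
CONTEXTUAL = {
    ("NN", "END"): 0.3,
    ("AT", "NN"): 0.5, ("IN", "NN"): 0.1,
    ("VB", "AT"): 0.3, ("VB", "IN"): 0.1,
    ("UH", "AT"): 0.01, ("UH", "IN"): 0.01,
    ("PPSS", "VB"): 0.4, ("NP", "VB"): 0.05,
    ("PPSS", "UH"): 0.01, ("NP", "UH"): 0.01,
}


def best_tag_sequence(words, lexical, contextual, end="END"):
    """Return the tags maximizing the product of lexical and contextual probabilities."""
    scores = {end: 0.0}   # best log score of the suffix, keyed by its first tag
    backpointers = []     # one dict per word, stored in right-to-left order
    for word in reversed(words):
        new_scores, pointers = {}, {}
        for tag, p_lex in lexical[word].items():
            for next_tag, suffix_score in scores.items():
                p_ctx = contextual.get((tag, next_tag), 0.0)
                if p_lex <= 0.0 or p_ctx <= 0.0:
                    continue  # skip zero-probability combinations
                candidate = math.log(p_lex) + math.log(p_ctx) + suffix_score
                if tag not in new_scores or candidate > new_scores[tag]:
                    new_scores[tag] = candidate
                    pointers[tag] = next_tag
        if not new_scores:
            raise ValueError(f"no tagging with nonzero probability at {word!r}")
        scores = new_scores
        backpointers.append(pointers)
    # Recover the tags left to right by following the stored pointers.
    tag = max(scores, key=scores.get)
    tags = []
    for pointers in reversed(backpointers):
        tags.append(tag)
        tag = pointers[tag]
    return tags


print(best_tag_sequence("I see a bird".split(), LEXICAL, CONTEXTUAL))
# prints ['PPSS', 'VB', 'AT', 'NN']
```

Log probabilities are summed instead of multiplying raw probabilities, which selects the same sequence while avoiding numeric underflow on long sentences. Extending the sketch to n > 1 would mean keying the score table by tuples of following tags.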


990 citations

Book Chapter•10.1002/9780470758335.CH15•