Stochastic attribute-value grammars
TL;DR: In this article, the author defines stochastic attribute-value grammars and gives an algorithm for computing the maximum-likelihood estimate of their parameters, adapted from Della Pietra, Della Pietra, and Lafferty (1995).
Abstract: Probabilistic analogues of regular and context-free grammars are well known in computational linguistics, and currently the subject of intensive research. To date, however, no satisfactory probabilistic analogue of attribute-value grammars has been proposed: previous attempts have failed to define an adequate parameter-estimation algorithm. In the present paper, I define stochastic attribute-value grammars and give an algorithm for computing the maximum-likelihood estimate of their parameters. The estimation algorithm is adapted from Della Pietra, Della Pietra, and Lafferty (1995). To estimate model parameters, it is necessary to compute the expectations of certain functions under random fields. In the application discussed by Della Pietra, Della Pietra, and Lafferty (representing English orthographic constraints), Gibbs sampling can be used to estimate the needed expectations. The fact that attribute-value grammars generate constrained languages makes Gibbs sampling inapplicable, but I show that sampling can be done using the more general Metropolis-Hastings algorithm.
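The sampling idea in the abstract can be illustrated with a small sketch. It is not the paper's algorithm or grammar: it assumes a toy "constrained language" (the six strings with two a's and two b's), two made-up feature functions, arbitrary weights, and a uniform independence proposal over well-formed items. All names (`WELL_FORMED`, `starts_with_a`, `count_ab`, `mh_feature_expectations`) are hypothetical. The sketch estimates feature expectations under a log-linear (random field) distribution with Metropolis-Hastings, proposing only well-formed items, and compares them with exact values obtained by enumeration.

```python
import math
import random
from itertools import permutations

# Toy constrained language (an assumption for illustration): strings with
# exactly two a's and two b's. Only these six strings count as well-formed.
WELL_FORMED = sorted(set("".join(p) for p in permutations("aabb")))


def starts_with_a(x):
    """Hypothetical feature: 1 if the string begins with 'a', else 0."""
    return 1.0 if x[0] == "a" else 0.0


def count_ab(x):
    """Hypothetical feature: number of 'ab' bigrams in the string."""
    return float(sum(1 for i in range(len(x) - 1) if x[i:i + 2] == "ab"))


def unnormalized_weight(x, weights, features):
    """exp(sum_i w_i * f_i(x)), the unnormalized log-linear score of x."""
    return math.exp(sum(w * f(x) for w, f in zip(weights, features)))


def mh_feature_expectations(propose, proposal_prob, weights, features,
                            n_samples=20000, burn_in=1000, seed=0):
    """Estimate E[f_i] under the log-linear target with Metropolis-Hastings.

    `propose(rng)` draws a well-formed item independently of the current
    state; `proposal_prob(x)` is its probability under that proposal.
    """
    rng = random.Random(seed)
    x = propose(rng)
    totals = [0.0] * len(features)
    for step in range(burn_in + n_samples):
        y = propose(rng)
        # Acceptance ratio for an independence proposal:
        # target(y) * q(x) / (target(x) * q(y)).
        ratio = (unnormalized_weight(y, weights, features) * proposal_prob(x)) / (
            unnormalized_weight(x, weights, features) * proposal_prob(y))
        if rng.random() < min(1.0, ratio):
            x = y
        if step >= burn_in:
            for i, f in enumerate(features):
                totals[i] += f(x)
    return [t / n_samples for t in totals]


def exact_feature_expectations(items, weights, features):
    """Exact expectations by enumeration; feasible only for tiny toy sets."""
    scores = [unnormalized_weight(x, weights, features) for x in items]
    z = sum(scores)
    return [sum(s * f(x) for s, x in zip(scores, items)) / z for f in features]


FEATURES = [starts_with_a, count_ab]
WEIGHTS = [0.5, -0.3]  # arbitrary illustrative parameter values


def uniform_propose(rng):
    return rng.choice(WELL_FORMED)


def uniform_prob(x):
    return 1.0 / len(WELL_FORMED)


if __name__ == "__main__":
    estimated = mh_feature_expectations(uniform_propose, uniform_prob,
                                        WEIGHTS, FEATURES)
    exact = exact_feature_expectations(WELL_FORMED, WEIGHTS, FEATURES)
    print("estimated:", estimated)
    print("exact:    ", exact)
```

Because every proposal is drawn from the well-formed set, the chain never visits an ill-formed item, which is the property a single-site Gibbs update cannot guarantee in a constrained language. In a maximum-likelihood fit, expectations like these would be compared with the empirical feature counts when updating the weights.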
Citations
Incorporating Non-local Information into Information Extraction Systems by Gibbs Sampling
Jenny Rose Finkel, Trond Grenager, Christopher D. Manning
- 25 Jun 2005
TL;DR: By using simulated annealing in place of Viterbi decoding in sequence models such as HMMs, CMMs, and CRFs, it is possible to incorporate non-local structure while preserving tractable inference.
A Syntax-based Statistical Translation Model
Kenji Yamada, Kevin Knight
- 06 Jul 2001
TL;DR: This model transforms a source-language parse tree into a target-language string by applying stochastic operations at each node, and produces word alignments that are better than those produced by IBM Model 5.
A comparison of algorithms for maximum entropy parameter estimation
Robert Malouf
- 31 Aug 2002
TL;DR: A number of algorithms for estimating the parameters of ME models are considered, including iterative scaling, gradient ascent, conjugate gradient, and variable metric methods.
Discriminative Reranking for Natural Language Parsing
Michael Collins, Terry Koo
TL;DR: This article used the boosting approach to rerank the output of an existing probabilistic parser using additional features of the tree as evidence, without concerns about how these features interact or overlap and without the need to define a derivation or a generative model which takes these features into account.
Head-Driven Phrase Structure Grammar
R.D. Levine, W.D. Meurers
- 01 Jan 2006
TL;DR: This work provides a tripartite overview of HPSG, a constraint-based model-theoretic framework for grammatical representation widely used for both theoretical research and computational implementations of natural language grammars.
References
•Posted Content
Inducing Features of Random Fields
TL;DR: The random field models and techniques introduced in this paper differ from those common to much of the computer vision literature in that the underlying random fields are non-Markovian and have a large number of parameters that must be estimated.
•Book
Probabilistic constraint logic programming
Stefan Riezler
- 01 Jan 1999
TL;DR: An algorithm to estimate the parameters and to select the properties of log-linear models from incomplete data and an approach for searching for most probable analyses of the probabilistic constraint logic programming model are presented.
Stochastic HPSG
Chris Brew
- 27 Mar 1995
TL;DR: In this paper, a probabilistic interpretation for typed feature structures was proposed, which is similar to those used by Pollard and Sag, but without a treatment of reentrant feature structures.
Parameter estimation for constrained context-free language models
Kevin E. Mark, Michael I. Miller, Ulf Grenander, Steve Abney
- 23 Feb 1992
TL;DR: A new language model incorporating both N-gram and context-free ideas is proposed, specified by a stochastic context-free prior distribution with N-gram frequency constraints, which is a Markov random field.
Image Analysis, Random Fields and Dynamic Monte Carlo Methods
Gerhard Winkler
- 01 Jan 1995
TL;DR: The book is mainly concerned with the mathematical foundations of Bayesian image analysis and its algorithms, which amounts to the study of Markov random fields and dynamic Monte Carlo algorithms like sampling, simulated annealing and stochastic gradient algorithms.