Improving nonparametric Bayesian inference: experiments on unsupervised word segmentation with adaptor grammars
Mark Johnson, Sharon Goldwater
- 31 May 2009
- pp 317-325
Abstract: One reason nonparametric Bayesian inference is attracting attention in computational linguistics is that it provides a principled way of learning the units of generalization together with their probabilities. Adaptor grammars are a framework for defining a variety of hierarchical nonparametric Bayesian models. This paper investigates some of the choices that arise in formulating adaptor grammars and their associated inference procedures, and shows that these choices can have a dramatic impact on performance in an unsupervised word segmentation task. With appropriate adaptor grammars and inference procedures we achieve an 87% word token f-score on the standard Brent version of the Bernstein-Ratner corpus, an error reduction of over 35% relative to the best previously reported results for this corpus.
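The abstract reports results as a word token f-score. As a rough illustration of how such a score is commonly computed (a minimal sketch, not the authors' evaluation code), the Python below treats a predicted word token as correct only when both of its boundaries match a gold token in the same utterance. The function names `word_spans` and `token_f_score` and the toy utterance are assumptions made for this example, and the sketch assumes one character per phoneme.

```python
from typing import List, Set, Tuple


def word_spans(words: List[str]) -> Set[Tuple[int, int]]:
    """Map a segmented utterance to the set of (start, end) character spans of its words."""
    spans = set()
    start = 0
    for word in words:
        end = start + len(word)
        spans.add((start, end))
        start = end
    return spans


def token_f_score(gold: List[List[str]], predicted: List[List[str]]) -> float:
    """Harmonic mean of token precision and recall over a corpus of utterances."""
    correct = n_gold = n_pred = 0
    for gold_words, pred_words in zip(gold, predicted):
        # Both segmentations must cover the same unsegmented string.
        if "".join(gold_words) != "".join(pred_words):
            raise ValueError("gold and predicted utterances differ")
        gold_spans = word_spans(gold_words)
        pred_spans = word_spans(pred_words)
        correct += len(gold_spans & pred_spans)
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    if correct == 0:
        return 0.0
    precision = correct / n_pred
    recall = correct / n_gold
    return 2 * precision * recall / (precision + recall)


# Toy example (hypothetical data): the prediction undersegments "the book".
gold = [["you", "want", "to", "see", "the", "book"]]
pred = [["you", "want", "to", "see", "thebook"]]
score = token_f_score(gold, pred)
```

In the toy example four of the five predicted tokens match the six gold tokens, so precision is 4/5 and recall is 4/6. Because a token counts only when both boundaries are right, a single missed boundary costs two gold tokens.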
Citations
•Proceedings Article
Incorporating Lexical Priors into Topic Models
Jagadeesh Jagarlamudi, Hal Daumé, Raghavendra Udupa
- 23 Apr 2012
TL;DR: This work proposes a simple and effective way to guide topic models to learn topics of specific interest to a user by providing sets of seed words that a user believes are representative of the underlying topics in a corpus.
Bayesian Unsupervised Word Segmentation with Nested Pitman-Yor Language Modeling
Daichi Mochihashi, Takeshi Yamada, Naonori Ueda
- 02 Aug 2009
TL;DR: This work proposes a new Bayesian model for fully unsupervised word segmentation, with an efficient blocked Gibbs sampler combined with dynamic programming for inference, and confirms that it significantly outperforms previously reported results on both phonetic transcripts and standard datasets for Chinese and Japanese word segmentation.
•Proceedings Article
Painless Unsupervised Learning with Features
Taylor Berg-Kirkpatrick, Alexandre Bouchard-Côté, John DeNero, Dan Klein
- 02 Jun 2010
TL;DR: This work shows how features can easily be added to standard generative models for unsupervised learning, without requiring complex new training methods, and applies this technique to part-of-speech induction, grammar induction, word alignment, and word segmentation.
Decolonising Speech and Language Technology
Steven Bird
- 01 Dec 2020
TL;DR: This paper reviews colonising discourses in speech and language technology, suggests new ways of working with Indigenous communities, and seeks to open a discussion of a postcolonial approach to computational methods for supporting language vitality.
Symbol emergence in robotics: a survey
TL;DR: In this article, the authors introduce a field of research called symbol emergence in robotics (SER), which represents a constructive approach towards a symbol emergence system, where embodied cognition and social interaction of participants gradually alter a symbol system in a constructive manner.
References
Finding Structure in Time
TL;DR: This work develops a proposal first described by Jordan (1986), which uses recurrent links to provide networks with a dynamic memory, and suggests a method for representing lexical categories and the type/token distinction.
•Book
Information Theory, Inference and Learning Algorithms
David J. C. MacKay
- 06 Oct 2003
TL;DR: A fun and exciting textbook on the mathematics underpinning the most dynamic areas of modern science and engineering.
•Book
Monte Carlo Statistical Methods
Christian P. Robert, George Casella
- 01 Jan 1999
TL;DR: This new edition contains five completely new chapters covering new developments; the first edition (1999) sold 4300 copies worldwide.
Hierarchical Dirichlet Processes
TL;DR: This work considers problems involving groups of data where each observation within a group is a draw from a mixture model and where it is desirable to share mixture components between groups, and considers a hierarchical model, specifically one in which the base measure for the child Dirichlet processes is itself distributed according to a Dirichlet process.