Improving nonparameteric Bayesian inference: experiments on unsupervised word segmentation with adaptor grammars

doi:10.3115/1620754.1620800

Open Access•Proceedings Article•10.3115/1620754.1620800

Mark Johnson, +1 more

- 31 May 2009

- pp 317-325

155

TL;DR: This paper investigates some of the choices that arise in formulating adaptor grammars and associated inference procedures, and shows that they can have a dramatic impact on performance in an unsupervised word segmentation task.

Citations

•Proceedings Article

Incorporating Lexical Priors into Topic Models

Jagadeesh Jagarlamudi, +2 more

- 23 Apr 2012

TL;DR: This work proposes a simple and effective way to guide topic models to learn topics of specific interest to a user by providing sets of seed words that a user believes are representative of the underlying topics in a corpus.

339

•Proceedings Article•10.3115/1687878.1687894

Bayesian Unsupervised Word Segmentation with Nested Pitman-Yor Language Modeling

Daichi Mochihashi, +2 more

- 02 Aug 2009

TL;DR: This work proposes a new Bayesian model for fully unsupervised word segmentation, together with an efficient blocked Gibbs sampler combined with dynamic programming for inference, and confirms that it significantly outperforms previously reported results on both phonetic transcripts and standard datasets for Chinese and Japanese word segmentation.

307

•Proceedings Article

Painless Unsupervised Learning with Features

Taylor Berg-Kirkpatrick, +3 more

- 02 Jun 2010

TL;DR: This work shows how features can easily be added to standard generative models for unsupervised learning, without requiring complex new training methods, and applies this technique to part-of-speech induction, grammar induction, word alignment, and word segmentation.

271

•Proceedings Article•10.18653/V1/2020.COLING-MAIN.313

Decolonising Speech and Language Technology

Steven Bird

- 01 Dec 2020

TL;DR: This paper reviews colonising discourses in speech and language technology, and suggests new ways of working with Indigenous communities, and seeks to open a discussion of a postcolonial approach to computational methods for supporting language vitality.

185

•Journal Article•10.1080/01691864.2016.1164622

Symbol emergence in robotics: a survey

Tadahiro Taniguchi, +5 more

- 11 Apr 2016

- Advanced Robotics

TL;DR: In this article, the authors introduce a field of research called symbol emergence in robotics (SER), which represents a constructive approach towards a symbol emergence system, where embodied cognition and social interaction of participants gradually alter a symbol system in a constructive manner.

169

References

•Journal Article•10.1207/S15516709COG1402_1

Finding Structure in Time

Jeffrey L. Elman

- 01 Mar 1990

- Cognitive Science

TL;DR: This work develops a proposal, first described by Jordan (1986), that uses recurrent links to provide networks with a dynamic memory, and suggests a method for representing lexical categories and the type/token distinction.

11.8K

•Book

Information Theory, Inference and Learning Algorithms

David J. C. MacKay

- 06 Oct 2003

TL;DR: A fun and exciting textbook on the mathematics underpinning the most dynamic areas of modern science and engineering.

9.3K

•Book

Monte Carlo Statistical Methods

Christian P. Robert, +1 more

- 01 Jan 1999

TL;DR: This new edition contains five completely new chapters covering new developments; the first edition (1999) sold 4300 copies worldwide.

7.1K

•Journal Article•10.1108/03684920410534506

Information Theory, Inference, and Learning Algorithms

Alex M. Andrew

- 01 Aug 2004

- Kybernetes

4.4K

•Journal Article•10.1198/016214506000000302

Hierarchical Dirichlet Processes

Yee Whye Teh, +3 more

- 01 Dec 2006

- Journal of the American Statistical Asso...

TL;DR: This work considers problems involving groups of data where each observation within a group is a draw from a mixture model and where it is desirable to share mixture components between groups, and considers a hierarchical model, specifically one in which the base measure for the child Dirichlet processes is itself distributed according to a Dirichlet process.

4.2K

Related Papers (5)

A Bayesian framework for word segmentation: Exploring the effects of context

An Efficient, Probabilistically Sound Algorithm for Segmentation and Word Discovery

Bayesian Unsupervised Word Segmentation with Nested Pitman-Yor Language Modeling

Distributional regularity and phonotactic constraints are useful for segmentation

Statistical Learning by 8-Month-Old Infants