Learning Syntax by Automata Induction

doi:10.1023/A:1022860810097

Open Access • Journal Article

Robert C. Berwick, +1 more

- 07 Mar 1987

- Machine Learning

- Vol. 2, Iss: 1, pp 9-38

63

TL;DR: This paper proposes an explicit computer model for learning natural language syntax based on Angluin's (1982) efficient induction algorithms, using a complete corpus of grammatical example sentences, and shows that efficient automata-theoretic learning algorithms can be applied precisely where linguistic theories admit highly irregular subportions.

Abstract: In this paper we propose an explicit computer model for learning natural language syntax based on Angluin's (1982) efficient induction algorithms, using a complete corpus of grammatical example sentences. We use these results to show how inductive inference methods may be applied to learn substantial, coherent subparts of at least one natural language, English, that are not susceptible to the kinds of learning envisioned in linguistic theory. As two concrete case studies, we show how to learn English auxiliary verb sequences (such as "could be taking" and "will have been taking") and the sequences of articles and adjectives that appear before noun phrases (such as "the very old big deer"). Both systems can be acquired in a computationally feasible amount of time using either positive examples alone or, in an incremental mode, implicit negative examples (examples outside a finite corpus are considered to be negative examples). As far as we know, this is the first computer procedure that learns a full-scale range of noun subclasses and noun phrase structure. The generalizations and the time required for acquisition match our knowledge of child language acquisition for these two cases. More importantly, these results show that just where linguistic theories admit to highly irregular subportions, we can apply efficient automata-theoretic learning algorithms. Since the algorithm works only for fragments of language syntax, we do not believe that it suffices for all of language acquisition. Rather, we would claim that language acquisition is nonuniform and susceptible to a variety of acquisition strategies; this algorithm may be one of these.
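To make the idea of automata induction from positive examples concrete, the following is a minimal sketch of zero-reversible inference in the spirit of Angluin (1982). It is not the authors' implementation: the choice of k = 0, the function names `infer_zero_reversible` and `accepts`, and the four-sentence toy corpus are all assumptions made for illustration. The sketch builds a prefix-tree acceptor from the sample, merges all accepting states, and then keeps merging states until the automaton is deterministic in both the forward and reverse directions.

```python
def infer_zero_reversible(sample):
    """Infer an automaton from positive examples (lists of tokens).

    Illustrative sketch only: prefix-tree acceptor plus state merging.
    """
    # 1. Build the prefix-tree acceptor; state 0 is the root.
    trans = {}            # (state, token) -> state
    finals = set()
    n_states = 1
    for sentence in sample:
        state = 0
        for tok in sentence:
            if (state, tok) not in trans:
                trans[(state, tok)] = n_states
                n_states += 1
            state = trans[(state, tok)]
        finals.add(state)

    # Union-find over states; each set is one block of merged states.
    parent = list(range(n_states))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return False
        parent[rb] = ra
        return True

    # 2. A zero-reversible automaton has a single accepting block.
    ordered = sorted(finals)
    for f in ordered[1:]:
        union(ordered[0], f)

    # 3. Merge until a full pass makes no change.
    changed = True
    while changed:
        changed = False
        out_edge, in_edge = {}, {}
        for (s, tok), t in trans.items():
            # Forward determinism: same block + same token -> same target block.
            other_t = out_edge.setdefault((find(s), tok), t)
            if union(other_t, t):
                changed = True
            # Reverse determinism: same target block + same token -> same source block.
            other_s = in_edge.setdefault((find(t), tok), s)
            if union(other_s, s):
                changed = True

    # 4. Quotient automaton over block representatives.
    delta = {(find(s), tok): find(t) for (s, tok), t in trans.items()}
    return find(0), {find(f) for f in finals}, delta


def accepts(automaton, sentence):
    start, accepting, delta = automaton
    state = start
    for tok in sentence:
        if (state, tok) not in delta:
            return False
        state = delta[(state, tok)]
    return state in accepting


# Toy corpus of auxiliary sequences (made up for this example).
corpus = [
    ["could", "take"],
    ["could", "be", "taking"],
    ["will", "be", "taking"],
    ["will", "have", "taken"],
]
machine = infer_zero_reversible(corpus)
print(accepts(machine, ["will", "take"]))           # True (generalized, not in corpus)
print(accepts(machine, ["could", "have", "taken"])) # True (generalized, not in corpus)
print(accepts(machine, ["could", "taking"]))        # False
```

On this toy corpus the merges place "could" and "will" in the same class because both can be followed by "be taking", so the learner generalizes to unseen but plausible sequences while still rejecting strings such as "could taking". Any sequence using a token never seen in the corpus is rejected.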

Citations

•Journal Article•10.1613/JAIR.374

Identifying hierarchical structure in sequences: a linear-time algorithm

Craig G. Nevill-Manning, +1 more

- 01 Jul 1997

- Journal of Artificial Intelligence Resea...

TL;DR: SEQUITUR is an algorithm that infers a hierarchical structure from a sequence of discrete symbols by replacing repeated phrases with a grammatical rule that generates the phrase, and continuing this process recursively.

697

Proceedings Article•10.1145/267658.267666

A scalable comparison-shopping agent for the World-Wide Web

Robert B. Doorenbos, +2 more

- 08 Feb 1997

TL;DR: ShopBot, a fully implemented, domain-independent comparison-shopping agent that relies on a combination of heuristic search, pattern matching, and inductive learning techniques, enables users to both find superior prices and substantially reduce Web shopping time.

635

•Dissertation

Unsupervised language acquisition

Carl G. De Marcken, +1 more

- 01 Jan 1996

TL;DR: In this article, a computational theory of unsupervised language acquisition is presented, which is based heavily on concepts borrowed from machine learning and statistical estimation, and can be used for data compression, speech recognition, machine translation, information retrieval, and other tasks that rely on either structural or stochastic descriptions of language.

202

Journal Article•10.1016/J.NEUBIOREV.2016.12.023

The growth of language: Universal Grammar, experience, and principles of computation

Charles Yang, +4 more

- 01 Oct 2017

- Neuroscience & Biobehavioral Reviews

TL;DR: By considering the place of language in human biology and evolution, this work proposes an approach that integrates principles from Universal Grammar and constraints from other domains of cognition, and outlines some initial results of this approach as well as challenges for future research.

190

•Journal Article•10.1613/JAIR.25

Software agents: completing patterns and constructing user interfaces

Jeffrey C. Schlimmer, +1 more

- 01 Aug 1993

- Journal of Artificial Intelligence Resea...

TL;DR: This paper describes an interactive note-taking system for pen-based computers with two distinctive features: it actively predicts what the user is going to write, and it constructs a custom, button-box user interface on request.

86

References

•Journal Article•10.1016/S0019-9958(67)91165-5

Language identification in the limit

E. Mark Gold

- 01 May 1967

- Information & Computation

TL;DR: It was found that the class of context-sensitive languages is learnable from an informant, but that not even the class of regular languages is learnable from a text.

3.8K

Journal Article•10.2307/414499

Language Learnability and Language Development

Clifton Pye, +1 more

- 01 Dec 1985

- Language

TL;DR: Topics covered: language learnability and language development revisited; the acquisition theory: assumptions and postulates; phrase structure rules; phrase structure rules: developmental considerations; inflection; complementation and control; auxiliaries; lexical entries and lexical rules.

2.5K

•Book

Language Acquisition: The State of the Art

Eric Wanner, +1 more

- 16 Sep 2009

TL;DR: This book discusses language acquisition through the lens of grammar, semantics, and ontology, and investigates the role of universals in the acquisition of gerunds and its role in lexical and syntactic development.

1.9K

•Book

Formal Principles of Language Acquisition

Kenneth Wexler, +1 more

- 01 Jan 1980

TL;DR: The authors of this book have developed a rigorous and unified theory that opens the study of language learnability to discoveries about the mechanisms of language acquisition in human beings and has important implications for linguistic theory, child language research, and the philosophy of language.

1.4K

•Journal Article•10.1016/S0019-9958(78)90562-4

Complexity of automaton identification from given data

E. Mark Gold

- 01 Jun 1978

- Information & Computation

TL;DR: The question of whether there is an automaton with n states which agrees with a finite set D of data is shown to be NP-complete, although identification-in-the-limit of finite automata is possible in polynomial time as a function of the size of D.

919