Learning Syntax by Automata Induction
Robert C. Berwick,Sam Pilato +1 more
TL;DR: This paper proposes an explicit computer model for learning natural language syntax from a complete corpus of grammatical example sentences, based on Angluin's (1982) efficient induction algorithms, and shows that efficient automata-theoretic learning algorithms can be applied just where linguistic theories admit to highly irregular subportions.
Abstract: In this paper we propose an explicit computer model for learning natural language syntax based on Angluin's (1982) efficient induction algorithms, using a complete corpus of grammatical example sentences. We use these results to show how inductive inference methods may be applied to learn substantial, coherent subparts of at least one natural language – English – that are not susceptible to the kinds of learning envisioned in linguistic theory. As two concrete case studies, we show how to learn English auxiliary verb sequences (such as could be taking, will have been taking) and the sequences of articles and adjectives that appear before noun phrases (such as the very old big deer). Both systems can be acquired in a computationally feasible amount of time using either positive examples or, in an incremental mode, implicit negative examples (examples outside a finite corpus are considered to be negative examples). As far as we know, this is the first computer procedure that learns a full-scale range of noun subclasses and noun phrase structure. The generalizations and the time required for acquisition match our knowledge of child language acquisition for these two cases. More importantly, these results show that just where linguistic theories admit to highly irregular subportions, we can apply efficient automata-theoretic learning algorithms. Since the algorithm works only for fragments of language syntax, we do not believe that it suffices for all of language acquisition. Rather, we would claim that language acquisition is nonuniform and susceptible to a variety of acquisition strategies; this algorithm may be one of these.
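To make the idea of automata induction from positive examples concrete, here is a minimal sketch of zero-reversible inference, the simplest case of the reversible-language inference introduced by Angluin (1982). It is an illustration only and is not necessarily the exact procedure used in the paper. The function name `infer_zero_reversible`, the tuple-of-words sentence encoding, and the tiny auxiliary-verb sample are all assumptions made for this example. The sketch builds a prefix-tree acceptor from the sample, merges all accepting states, and then keeps merging states until the automaton is deterministic in both the forward and the reverse direction.

```python
from itertools import count


def infer_zero_reversible(sample):
    """Return an acceptor for the smallest zero-reversible language
    containing `sample` (an iterable of tuples of words).
    Hypothetical helper written for illustration."""
    # Step 1: build a prefix-tree acceptor; state 0 is the start state.
    trans = {}            # (state, word) -> state
    finals = set()
    ids = count(1)
    for sentence in sample:
        state = 0
        for word in sentence:
            if (state, word) not in trans:
                trans[(state, word)] = next(ids)
            state = trans[(state, word)]
        finals.add(state)
    parent = list(range(next(ids)))   # union-find over all states

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Step 2: a zero-reversible acceptor has a single accepting state.
    finals = sorted(finals)
    for f in finals[1:]:
        parent[find(f)] = find(finals[0])

    # Step 3: merge one violating pair of blocks per pass until none remain.
    def merge_once():
        forward, backward = {}, {}
        for (src, word), dst in trans.items():
            s, d = find(src), find(dst)
            seen = forward.setdefault((s, word), d)
            if seen != d:             # same source and word, two targets
                parent[seen] = d
                return True
            seen = backward.setdefault((d, word), s)
            if seen != s:             # same target and word, two sources
                parent[seen] = s
                return True
        return False

    while merge_once():
        pass

    # Step 4: read off the merged automaton and wrap it as a predicate.
    delta = {(find(s), w): find(d) for (s, w), d in trans.items()}
    start = find(0)
    final = find(finals[0]) if finals else None

    def accepts(sentence):
        state = start
        for word in sentence:
            if (state, word) not in delta:
                return False
            state = delta[(state, word)]
        return state == final

    return accepts


# Toy sample of auxiliary sequences (invented for this sketch).
accepts = infer_zero_reversible([
    ("will", "take"),
    ("could", "take"),
    ("will", "be", "taking"),
])
print(accepts(("could", "be", "taking")))   # True: generalized from the sample
print(accepts(("be", "taking")))            # False
```

Because "will take" and "could take" end in the same way, the states reached after "will" and "could" are merged, so the unseen sequence "could be taking" is accepted even though only "will be taking" appeared in the sample. Merging one pair per pass keeps the sketch short; it is not meant to reflect the efficiency of the published algorithm.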
Citations
Identifying hierarchical structure in sequences: a linear-time algorithm
TL;DR: SEQUITUR is an algorithm that infers a hierarchical structure from a sequence of discrete symbols by replacing repeated phrases with a grammatical rule that generates the phrase, and continuing this process recursively.
A scalable comparison-shopping agent for the World-Wide Web
Robert B. Doorenbos,Oren Etzioni,Daniel S. Weld +2 more
- 08 Feb 1997
TL;DR: ShopBot, a fully implemented, domain-independent comparison-shopping agent that relies on a combination of heuristic search, pattern matching, and inductive learning techniques, enables users to both find superior prices and substantially reduce Web shopping time.
•Dissertation
Unsupervised language acquisition
Carl G. De Marcken,Robert C. Berwick +1 more
- 01 Jan 1996
TL;DR: In this article, a computational theory of unsupervised language acquisition is presented, which is based heavily on concepts borrowed from machine learning and statistical estimation, and can be used for data compression, speech recognition, machine translation, information retrieval, and other tasks that rely on either structural or stochastic descriptions of language.
The growth of language: universal Grammar, experience, and principles of computation
TL;DR: By considering the place of language in human biology and evolution, this work proposes an approach that integrates principles from Universal Grammar and constraints from other domains of cognition, and outlines some initial results of this approach as well as challenges for future research.
Software agents: completing patterns and constructing user interfaces
TL;DR: In this paper, an interactive note-taking system for pen-based computers with two distinctive features is described, namely, it actively predicts what the user is going to write and constructs a custom, button-box user interface on request.
References
Language identification in the limit
TL;DR: It was found that the class of context-sensitive languages is learnable from an informant, but that not even the class of regular languages is learnable from a text.
Language Learnability and Language Development
Clifton Pye,Steven Pinker +1 more
TL;DR: Language learnability and language development revisited; the acquisition theory: assumptions and postulates; phrase structure rules; phrase structure rules: developmental considerations; inflection; complementation and control; auxiliaries; lexical entries and lexical rules.
•Book
Language Acquisition: The State of the Art
Eric Wanner,Lila R. Gleitman +1 more
- 16 Sep 2009
TL;DR: This book discusses language acquisition through the lens of grammar, semantics, and ontology, and investigates the role of universals in the acquisition of gerunds and its role in lexical and syntactic development.
•Book
Formal Principles of Language Acquisition
Kenneth Wexler,Peter W. Culicover +1 more
- 01 Jan 1980
TL;DR: The authors of this book have developed a rigorous and unified theory that opens the study of language learnability to discoveries about the mechanisms of language acquisition in human beings and has important implications for linguistic theory, child language research, and the philosophy of language.
Complexity of automaton identification from given data
TL;DR: The question of whether there is an automaton with n states which agrees with a finite set D of data is shown to be NP-complete, although identification-in-the-limit of finite automata is possible in polynomial time as a function of the size of D.