TL;DR: In this paper, a method is described for converting a weak learning algorithm into one that achieves arbitrarily high accuracy, and it is shown that these two notions of learnability are equivalent.
Abstract: This paper addresses the problem of improving the accuracy of an hypothesis output by a learning algorithm in the distribution-free (PAC) learning model. A concept class is learnable (or strongly learnable) if, given access to a source of examples of the unknown concept, the learner with high probability is able to output an hypothesis that is correct on all but an arbitrarily small fraction of the instances. The concept class is weakly learnable if the learner can produce an hypothesis that performs only slightly better than random guessing. In this paper, it is shown that these two notions of learnability are equivalent.
A method is described for converting a weak learning algorithm into one that achieves arbitrarily high accuracy. This construction may have practical applications as a tool for efficiently converting a mediocre learning algorithm into one that performs extremely well. In addition, the construction has some interesting theoretical consequences, including a set of general upper bounds on the complexity of any strong learning algorithm as a function of the allowed error ε.
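As a rough illustration of why such a conversion can drive the error down, the sketch below iterates the standard majority-of-three bound from the boosting literature: if three hypotheses each have error at most α on suitably filtered distributions, their majority vote has error at most g(α) = 3α² − 2α³. The abstract does not spell out this formula, and the function names `majority_error_bound` and `levels_needed` are hypothetical helpers introduced only for this example; it is a sketch under those assumptions, not the paper's algorithm.

```python
def majority_error_bound(alpha: float) -> float:
    """Upper bound g(alpha) = 3*alpha^2 - 2*alpha^3 on the error of a
    majority vote over three hypotheses, each with error at most alpha
    on its own (filtered) distribution. Assumed bound, not from the abstract."""
    return 3 * alpha**2 - 2 * alpha**3


def levels_needed(alpha: float, epsilon: float) -> int:
    """Count how many recursive applications of the majority-of-three step
    are needed to push the error bound from alpha down to at most epsilon."""
    if not 0.0 < alpha < 0.5:
        raise ValueError("alpha must lie strictly between 0 and 0.5")
    if epsilon <= 0.0:
        raise ValueError("epsilon must be positive")
    levels = 0
    err = alpha
    while err > epsilon:
        # Each level replaces the current bound by the majority-vote bound.
        err = majority_error_bound(err)
        levels += 1
    return levels


if __name__ == "__main__":
    # Starting from a learner with error bound 0.25, count the levels
    # required to reach a bound of at most 0.01.
    print(levels_needed(0.25, 0.01))
```

Because g(α) < α for every α strictly between 0 and 1/2, the loop always terminates, which mirrors the claim that any edge over random guessing can be amplified.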
TL;DR: Questions such as cross-parameter dependencies, determinism, subsets, and incremental versus all-at-once learning are raised and discussed in the article.
TL;DR: Ph.D. thesis, Massachusetts Institute of Technology, Dept. of Brain and Cognitive Sciences, 1990.
Abstract: Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Brain and Cognitive Sciences, 1990.
TL;DR: A new, fast algorithm is given for learning unions of half spaces in fixed dimension. A generalization of this approach is suggested which naively would avoid a credit assignment problem (CAP) and learn in time polynomial in dimension, but it is proved that no such approach to evading the CAP can work.
TL;DR: This study examined how children (7-10 years old) acquire a miniature artificial language with rules, patterns, subpatterns, and exceptions that are quite like those found in inflectional systems of natural languages.
TL;DR: In this article, the authors discuss the relationship between ease/difficulty in learning particular words and some issues in the teaching of vocabulary, and argue that word learnability can serve as a guideline to the following: the selection of words to be taught; their presentation (quantity, grouping, language of presentation, isolation/context issue); the facilitation of long-term memorization (meaningful tasks, mnemonic techniques, rote learning, reactivation); the development of strategies for self-learning; and the assessment of vocabulary knowledge.
Abstract: This paper discusses the relationship between ease/difficulty in learning particular words and some issues in the teaching of vocabulary.
Some factors that interfere with learning a word are claimed to be the following: similarity of form between the word and other words (embrace/embarrass, price/prize); morphological similarity between it and other words (industrial/industrious, respectable/respective); deceptive morphological structure (infallible); different syntactic patterning in L1; differences in the classification of experience between L1 and L2 (one-to-many correspondence, partial overlap in meaning, metaphorical extension, lexical voids, multiplicity of meaning); abstractness; specificity; negative value; connotations nonexistent in L1; differences in the pragmatic meaning of near synonyms and of L1 translation equivalents; the learning burden of synonyms; the apparent rulelessness of collocations.
It is argued that word learnability (ease/difficulty in learning a particular word) can serve as a guideline to the following: the selection of words to be taught; their presentation (quantity, grouping, language of presentation, isolation/context issue); the facilitation of long-term memorization (meaningful tasks, mnemonic techniques, rote learning, reactivation); the development of strategies for self-learning; and the assessment of vocabulary knowledge.
TL;DR: This paper derives target-dependent upper bounds and worst-case upper bounds on the sample size required by the MDL algorithm to learn stochastic rules with given accuracy and confidence.
Abstract: This paper proposes a learning criterion for stochastic rules. This criterion is developed by extending Valiant's PAC (Probably Approximately Correct) learning model, which is a learning criterion for deterministic rules. Stochastic rules here refer to those which probabilistically assign a number of classes, {Y}, to each attribute vector X. The proposed criterion is based on the idea that learning stochastic rules may be regarded as probably approximately correct identification of conditional probability distributions over classes for given input attribute vectors. An algorithm (an MDL algorithm) based on the MDL (Minimum Description Length) principle is used for learning stochastic rules. Specifically, for stochastic rules with finite partitioning (each of which is specified by a finite number of disjoint cells of the domain and a probability parameter vector associated with them), this paper derives target-dependent upper bounds and worst-case upper bounds on the sample size required by the MDL algorithm to learn stochastic rules with given accuracy and confidence. Based on these sample size bounds, this paper proves polynomial-sample-size learnability of stochastic decision lists (which are newly proposed in this paper as a stochastic analogue of Rivest's decision lists) with at most k literals (k is fixed) in each decision, and polynomial-sample-size learnability of stochastic decision trees (a stochastic analogue of decision trees) of depth at most k. Sufficient conditions for polynomial-sample-size learnability and polynomial-time learnability of any classes of stochastic rules with finite partitioning are also derived.
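To make the idea of MDL-based selection among stochastic rules with finite partitioning more concrete, here is a minimal sketch. It assumes a one-dimensional domain [0, 1) split into equal-width cells, binary classes, and a generic two-part code length (negative log-likelihood plus (m/2)·ln N for m cells); none of these choices is taken from the abstract, and the paper's actual coding scheme may differ. The names `description_length` and `mdl_select` are hypothetical.

```python
import math
from typing import List, Tuple

Sample = Tuple[float, int]  # (attribute x in [0, 1), class label y in {0, 1})


def description_length(data: List[Sample], num_cells: int) -> float:
    """Two-part code length in nats for a rule with `num_cells` equal-width
    cells: data cost (negative log-likelihood under per-cell class
    frequencies) plus an assumed model cost of (num_cells / 2) * ln(N)."""
    n = len(data)
    counts = [[0, 0] for _ in range(num_cells)]
    for x, y in data:
        cell = min(int(x * num_cells), num_cells - 1)
        counts[cell][y] += 1
    nll = 0.0
    for c0, c1 in counts:
        total = c0 + c1
        for c in (c0, c1):
            if c > 0:
                # Maximum-likelihood estimate of the class probability in this cell.
                nll -= c * math.log(c / total)
    penalty = 0.5 * num_cells * math.log(n)
    return nll + penalty


def mdl_select(data: List[Sample], max_cells: int) -> int:
    """Return the number of cells (1..max_cells) with the shortest code length."""
    return min(range(1, max_cells + 1),
               key=lambda m: description_length(data, m))


if __name__ == "__main__":
    # Toy data: the label depends only on whether x falls in the upper half.
    toy = [((i + 0.5) / 200, 1 if (i + 0.5) / 200 >= 0.5 else 0)
           for i in range(200)]
    print(mdl_select(toy, 6))
```

The penalty term is what keeps the criterion from always preferring the finest partition: a finer partition lowers the data cost only if the extra cells capture real structure.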
TL;DR: In this paper, an account of the acquisition of subject-auxiliary inversion is provided, and it is shown that principles derived from linguistic and learnability theory to solve the logical problem of acquisition are sufficient to explain the developmental sequence associated with this construction.
Abstract: An account of the acquisition of subject-auxiliary inversion is provided. We show that principles derived from linguistic and learnability theory to solve the logical problem of acquisition are sufficient to explain the developmental sequence associated with this construction. The fact that some children learn features of this construction at the same time, whereas others learn the same features at different points in development, is shown to be a problem for the Maturation Hypothesis of Borer and Wexler (1987).
TL;DR: An algorithm is given that, for any simple deterministic language L, outputs a grammar G in 2-standard form, such that L = L(G), using membership queries and extended equivalence queries.
Abstract: This paper is concerned with the problem of learning simple deterministic languages. The algorithm described in this paper is based on the theory of model inference given by Shapiro. In our setting, however, nonterminal membership queries, except for the start symbol, are not permitted. Extended equivalence queries are used instead. Nonterminals that are necessary for a correct grammar and their intended models are introduced automatically. We give an algorithm that, for any simple deterministic language L, outputs a grammar G in 2-standard form, such that L = L(G), using membership queries and extended equivalence queries. We also show that the algorithm runs in time polynomial in the length of the longest counterexample and the number of nonterminals in a minimal grammar for L.
TL;DR: In this article, a case study of the theory of transformational grammars is presented, where the authors focus on the learnability, restrictiveness, and the evaluation metric.
Abstract: 1 Introduction.- 2 Some Issues in the Theory of Transformations.- 3 A Restrictive Theory of Transformational Grammar.- 4 Filters and Control.- 5 Restricting the Theory of Transformations: a case study.- 6 Learnability, Restrictiveness, and the Evaluation Metric.- 7 On a Lexical Parameter in the Government-Binding Theory.- 8 Core Grammar, Case Theory, and Markedness.- 9 On Certain Substitutes for Negative Data.- 10 On the Nature of Proper Government.- Index of Names.- Index of Subjects.
TL;DR: This paper hypothesizes that Universal Grammar itself significantly determines certain aspects of language processing as well as language acquisition, and predicts that this differential organization of parsing is a characteristic of processing in very early stages of language acquisition.
Abstract: Up until now, studies of natural language processing and acquisition in relation to Universal Grammar (Chomsky, 1982 and 1987) have been conducted independently to a large degree. When they have been related, e.g. in studies of relations between language learnability and parsability, these studies have mainly argued that learnability and parsability put functional constraints on Universal Grammar. In contrast, in this paper, we will pursue a program of study of the relations between language processing and acquisition which hypothesizes that Universal Grammar itself significantly determines certain aspects of language processing as well as language acquisition. In particular, we will hypothesize that parameter setting in UG has, as one deductive consequence, a systematically different organization of parsing across language types. Since we consider that parameter-setting for UG occurs very early, we predict that this differential organization of parsing is a characteristic of processing in very early stages of language acquisition, as well as in the adult
TL;DR: The question of what has to be learned by the child and the implications for restrictiveness proposals are explored, and what I take to be the three major areas of concern are discussed.
Abstract: At least in the last few decades considerations of learnability have played a guiding role in much linguistic research. In particular, there is fairly general agreement that restrictiveness is important. There is substantial controversy, however, over exactly what ought to be restricted and over the nature of the appropriate restrictions. I will explore the question of what has to be learned by the child and the implications for restrictiveness proposals, and I will discuss what I take to be the three major areas of concern: (1) properties of the evaluation metric; (2) restrictions on the class of grammars, particularly as they relate to the evaluation metric; and (3) restrictions limiting the type and amount of data required by the child.
TL;DR: Satake's intended novel contribution to this older literature is stated clearly at the outset: to build an empiricist model of acquisition that is cognitively faithful, one where the structure of language is "out there" in the world and formed by inductive generalization, special properties of parental input, the order of examples, and the like rather than "in there"--the child's head.
Abstract: In a recent survey of early language acquisition, Gleitman, Gleitman, Landau, and Wanner (1988) cite Leonard Bloomfield (1933, p. 29) as remarking that "language learning is doubtless the greatest intellectual feat any one of us is ever required to perform." Given this, it is equally no small feat to attempt to build a computer model that does the same thing. Satake's book is one of but a handful of attempts in the computational linguistics tradition to take up this challenge--somewhat surprising given the vast range of linguistic and psychological literature on the subject. Perhaps it is because linguists and psychologists can try to digest just one piece of the acquisition puzzle, while a computational model must typically try to gobble a major chunk of language acquisition whole, or risk being called a mere toy. In this light, Satake should be congratulated for trying to present, in one brief volume, a computational model that attempts to handle facts about morpheme acquisition and intonation; varying word order across languages; verb subcategorization; and classic rule overgeneralization, while at the same time at least paying some attention to what psychologists know about child language. One then obviously runs the risk of stretching too thin, and in fact the volume under review runs far too short in large type. Readers looking for answers to these rich subjects mentioned just above will come away disappointed by a sketch that ultimately can only approximate what computer modeling did in this area more than ten years ago (work by Anderson 1977; Selfridge 1981; and Berwick 1979, 1985). The book exercises the model with a very limited range of sample sentences--just nine examples, with no recursion. More unfortunately, given the emphasis on free word order, no Japanese examples are included. The first quarter of the book is devoted to a rather thin outline of some of the basic psychological results on input available to the child and learnability theory, while the remainder is devoted to the three components of (sub)category generalization, a case analysis of the system working on the examples and a short study of over-regularization, and the use of teacher correction in a so-called "production mode" to repair mistakes. This last point is quite important, for Satake's intended novel contribution to this older literature is stated clearly at the outset: to build an empiricist model of acquisition that is cognitively faithful--that is, one where the structure of language is "out there" in the world and formed by inductive generalization, special properties of parental input (motherese), the order of examples (including negative examples), and the like rather than "in there"--the child's head. Satake means this of course as the polar opposite of "innate" acquisition procedures, which assume a richly structured knowledge of language to begin with (Satake labels these as "passive" acquisition models
TL;DR: This work tries to demonstrate, through the analysis of several large networks, that connectionist models can be used to explore systematically the complex interaction between learning and representation.
Abstract: Connectionist models provide a promising alternative to the traditional computational approach that has for several decades dominated cognitive science and artificial intelligence, although the nature of connectionist models and their relation to symbol processing remains controversial. Connectionist models can be characterized by three general computational features: distinct layers of interconnected units, recursive rules for updating the strengths of the connections during learning, and "simple" homogeneous computing elements. Using just these three features one can construct surprisingly elegant and powerful models of memory, perception, motor control, categorization, and reasoning. What makes the connectionist approach unique is not its variety of representational possibilities (including "distributed representations") or its departure from explicit rule-based models, or even its preoccupation with the brain metaphor. Rather, it is that connectionist models can be used to explore systematically the complex interaction between learning and representation, as we try to demonstrate through the analysis of several large networks.
TL;DR: This article examines the claim by Bohannon and Stanowicz (1988) that children receive negative evidence about the ungrammaticality of their utterances in the form of recasts, expansions, and repetitions, and argues that such partial negative evidence has no bearing on existing formal proofs of learnability.
Abstract: Bohannon and Stanowicz (1988) have claimed that contrary to popular belief, children do receive negative evidence about the ungrammaticality of their utterances in the form of recasts, expansions, and repetitions. Bohannon and Stanowicz argue that given such negative evidence, learnability theory shows that natural languages can be learned and that there is no need to postulate innate knowledge based on such arguments. The present article establishes what exactly the claims of learnability theory really entail, and demonstrates that because Bohannon and Stanowicz have shown only partial negative evidence, the results have no bearing on existing formal proofs of learnability; also, the learnability proofs proposed by Gold (1967) actually tell us very little about what may or may not be innate. Finally, it is pointed out that there are cases of language acquisition in which feedback does not appear to occur. In their recent article, "The Issue of Negative Evidence: Adult Responses to Children's Language Errors," Bohannon and Stanowicz (1988) have attempted to discredit the widely accepted belief that children do not receive negative evidence concerning the ungrammaticality of their utterances. They characterize this claim as being the prime motivation of nativist theory for the postulation of innate knowledge of language in children (e.g., Chomsky, 1972; Pinker, 1984; Wexler & Culicover, 1980). Demonstrating the availability of negative evidence for children would therefore make it unnecessary to postulate many of the innate constraints. Bohannon and Stanowicz (1988) have also shown that in a study of adult interactions with 2-year-olds, both parents and other adults reacted differentially to grammatical and ungrammatical utterances from children. In particular, 90% of the exact repetitions followed grammatical utterances and 70% of the recasts and expansions followed ungrammatical utterances.
Overall, some 34% of the children's syntactic errors were followed by some form of implicit feedback of this type. It is claimed that such feedback is equivalent, in some sense, to negative evidence. Bohannon and Stanowicz conclude that in order to justify the nativist assumptions, such theorists must "replicate the 'Pharaoh's experiment' of a child isolated from other language users" (p. 688) and demonstrate that language acquisition can still occur in the absence of feedback. In the present article I do not attempt to evaluate the empirical validity of the data presented by Bohannon and Stanowicz (1988), nor do I intend to contest the claim that such feedback might be instrumental in facilitating acquisition (cf. Nelson, Denninger, Bonvillian, Kaplan, & Baker 1984). Rather, I focus quite specifically on the learnability issues addressed in their article, inasmuch as they appear to provide the theoretical back
TL;DR: A connectionist model that learns a variety of stress patterns without the incorporation, as processing primitives, of theoretical linguistic constructs such as metrical foot and parameter is discussed.
Abstract: Metrical phonology is a relatively successful linguistic theory that attempts to explain stress systems in language. This paper discusses a connectionist model that learns a variety of stress patterns without the incorporation, as processing primitives, of theoretical linguistic constructs such as metrical foot and parameter. An analysis of the learnability of various stress patterns is developed, based on learning results and connection weights developed for different stress systems. This analysis predicts that certain aspects of stress systems will be more difficult to learn, at least within the computational framework adopted. The model demonstrates an ability to generalize, and its encoding of knowledge of stress patterns indicates systematicity, with symmetries among stress patterns being reflected in the encoded knowledge.
TL;DR: A computational model was used to improve the learnability of an Air Force document, doubling recall and greatly improving recruits' mental representation of the content, suggesting that the computational model can be used to improve the learnability of Air Force texts.
Abstract: A computational model was used to improve the learnability of an Air Force document, doubling recall and greatly improving recruits' mental representation of the content. Kintsch's computer model of reading was applied to a 1000-word Air Force text on the Air Force's role in the Vietnam War. Principles of the model were used to identify 40 text locations where recruits would have to make inferences if they were to have a coherent mental representation of the text. Each location was then repaired, and the repaired text was then tested for learnability against the original text in two experiments. In experiment 1, free recall was doubled for the repaired text. In the second experiment, 120 recruits' 66-part mental representations for 12 important text concepts were measured, and compared with the mental representations of the text's author, and of 7 independent subject matter experts. The author's and the experts' mental representations correlated about .80. For recruits who read the repaired text, their mental representations correlated with those of the author and experts about .55 (p < 0.05). But the mental representations of recruits who read the original text correlated with those of the author and experts only about .10. These results suggest that the computational model can be used to improve the learnability of Air Force texts. Individual differences tests of inferencing ability were developed.
TL;DR: In this paper, it was pointed out that there are cases of language acquisition in which feedback does not appear to occur, and also that the learnability proofs proposed by Gold (1967) actually tell us very little about what may or may not be innate.
Abstract: Bohannon and Stanowicz (1988) have claimed that contrary to popular belief, children do receive negative evidence about the ungrammaticality of their utterances in the form of recasts, expansions, and repetitions. Bohannon and Stanowicz argue that given such negative evidence, learnability theory shows that natural languages can be learned and that there is no need to postulate innate knowledge based on such arguments. The present article establishes what exactly the claims of learnability theory really entail, and demonstrates that because Bohannon and Stanowicz have shown only partial negative evidence, the results have no bearing on existing formal proofs of learnability; also, the learnability proofs proposed by Gold (1967) actually tell us very little about what may or may not be innate. Finally, it is pointed out that there are cases of language acquisition in which feedback does not appear to occur.
TL;DR: In this paper, three constraints are discussed that restrict the class of possible parameters and the way they are fixed in development, and empirical results on the acquisition of subject-verb agreement, verb placement, empty subjects, and negation in German child language are presented.
Abstract: In the first part of this article, it is argued that in order to improve the parameter model as a theory of acquisition it has to be constrained in several ways. Three constraints are discussed that restrict the class of possible parameters and the way they are fixed in development. In the second part, empirical results on the acquisition of subject-verb agreement, verb placement, empty subjects, and negation in German child language are presented. I suggest a grammatical analysis for these data (in terms of the Split-Infl Hypothesis) that allows us to maintain the learnability constraints from the beginning.
TL;DR: In this article, the authors discuss the relevant and irrelevant aspects of both the original Gold proof and more modern attempts at learnability and show that uniqueness, a concept central to all modern formal models, is also adaptable to account for the negative evidence available in the child's input language.
Abstract: Gordon (1990), in his commentary on Bohannon and Stanowicz (1988), argued that (a) the original Gold (1967) learnability proof bears little relevance for innateness of language, (b) the Bohannon and Stanowicz results do not justify abandoning innate restraints on language learning, and (c) there may be cases in which such feedback is unavailable. In this reply, the relevant and irrelevant aspects of both the original Gold proof and more modern attempts at learnability are discussed. Uniqueness, a concept central to all modern formal models, is also adaptable to account for the negative evidence available in the child's input language. Rates of feedback found in Bohannon and Stanowicz are shown to be sufficient to spur learning in many species, including concept formation tasks in humans, and anecdotal counterevidence against the universality of negative evidence is discounted. It is suggested that using innate factors as a "default" explanation is a dangerous and counterproductive scientific endeavor.
TL;DR: Acquisition data suggest that children acquire ergative and accusative morphological systems equally easily, a finding that supports a distributional learning procedure for distinguishing between the subjects of transitive and intransitive verbs.
Abstract: Ergative languages have challenged the ingenuity of linguists for more than a century. This article explores learnability problems associated with the acquisition of ergative languages. Traditionally, an ergative language is one which treats the subjects of intransitive verbs in the same way as the objects of transitive verbs. Languages may have rules which operate on a morphologically or syntactically ergative basis, but all languages are syntactically accusative to some extent. Both types of ergativity raise problems for language-acquisition theory. Children acquiring ergative morphologies must learn to distinguish between the subjects of transitive and intransitive verbs. Acquisition data suggest that children acquire ergative and accusative morphological systems equally easily. This finding supports a distributional learning procedure. Learnability considerations rule out the existence of syntactically ergative languages in the sense of Marantz's (1984) ergativity hypothesis. Unambiguous evidence of syntactic ergativity only appears in complex sentences; thus, children cannot use data within simple, active sentences to establish whether or not their language is syntactically ergative. Children acquiring languages with ergative syntactic constructions must learn when the direct object of a transitive verb functions as a syntactic pivot. Acquisition data for ergative syntactic constructions in K'iche' and Kaluli suggest that children initially fail to recognize ergative constraints on syntactic rules. This finding supports semantic bootstrapping as an acquisition mechanism for the initial construction of syntactic structure.