TL;DR: This paper shows that the essential condition for distribution-free learnability is finiteness of the Vapnik-Chervonenkis dimension, a simple combinatorial parameter of the class of concepts to be learned.
Abstract: Valiant's learnability model is extended to learning classes of concepts defined by regions in Euclidean space E^n. The methods in this paper lead to a unified treatment of some of Valiant's results, along with previous results on distribution-free convergence of certain pattern recognition algorithms. It is shown that the essential condition for distribution-free learnability is finiteness of the Vapnik-Chervonenkis dimension, a simple combinatorial parameter of the class of concepts to be learned. Using this parameter, the complexity and closure properties of learnable classes are analyzed, and the necessary and sufficient conditions are provided for feasible learnability.
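The combinatorial parameter mentioned above can be illustrated on a toy case. The following is a minimal brute-force sketch, not taken from the paper: it assumes a finite domain and a finite concept class given as explicit sets, and the names `is_shattered` and `vc_dimension` are hypothetical. A set of points is shattered when the concepts induce every possible labeling on it, and the VC dimension is the size of the largest shattered set.

```python
from itertools import combinations


def is_shattered(points, concepts):
    """Return True if every subset of `points` is cut out by some concept."""
    points = tuple(points)
    # Each concept is a set; record the labeling it induces on `points`.
    labelings = {tuple(p in c for p in points) for c in concepts}
    return len(labelings) == 2 ** len(points)


def vc_dimension(domain, concepts):
    """Size of the largest subset of `domain` shattered by `concepts`."""
    domain = list(domain)
    best = 0
    for d in range(1, len(domain) + 1):
        if any(is_shattered(s, concepts) for s in combinations(domain, d)):
            best = d
        else:
            # Subsets of a shattered set are shattered, so if no d-point
            # set is shattered, no larger set can be either.
            break
    return best


# Hypothetical example: intervals [a, b] restricted to the points 0..5,
# plus the empty concept.
domain = range(6)
intervals = [frozenset(range(a, b + 1)) for a in domain for b in domain if a <= b]
intervals.append(frozenset())
print(vc_dimension(domain, intervals))
```

In this toy class any two points can be labeled in all four ways, but no interval can contain the outer two of three points while excluding the middle one, so the search stops after two points. The exhaustive search is exponential in the domain size and is meant only to make the definition concrete.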
TL;DR: A contactless motion detector, comprising an output thyristor controlled by a switching network, has an oscillator coupled to the switching network via a pre-amplifier; a delay unit retards the energization of the pre-amplifier until the oscillator has reached its operating condition.
Abstract: A contactless motion detector, comprising an output thyristor controlled by a switching network, has an oscillator coupled to the switching network via a pre-amplifier. The oscillator and the pre-amplifier are energized from a source of pulsating direct current via a supply circuit which is part of the switching network and includes a constant-current unit in parallel with the output thyristor. To prevent untimely switching of the thyristor when power is connected to the system, a delay unit retards the energization of the pre-amplifier until the oscillator has reached its operating condition. The delay unit may include one or more semiconductive devices, such as cascaded transistors or diodes, connected across a capacitor of a resistive/capacitive series circuit which bridges a smoothing capacitor inserted between a pair of bus bars.
TL;DR: The authors treat specifically language-impaired children as normal learners dealing with an input that is distorted in principled ways; when the children are viewed from this perspective, Pinker's (1984) theory can account for many of the features of their language.
Abstract: Theories of language learnability have focused on “normal” language development, but there is a group of children, termed “specifically language-impaired,” for whom these theories are also appropriate. These children present an interesting learnability problem because they develop language slowly, the intermediate points in their development differ in certain respects from the usual developmental stages, and they do not always achieve the adult level of language functioning. In this article, specifically language-impaired children are treated as normal learners dealing with an input that is distorted in principled ways. When the children are viewed from this perspective, Pinker's (1984) theory can account for many of the features of their language.
TL;DR: This article discusses the trigger experience, that is, the experience that actually affects a child's linguistic development, and argues that much of what a child hears has no consequence for the form of the eventual grammar.
Abstract: According to a “selective” (as opposed to “instructive”) model of human language capacity, people come to know more than they experience. The discrepancy between experience and eventual capacity (the “poverty of the stimulus”) is bridged by genetically provided information. Hence any hypothesis about the linguistic genotype (or “Universal Grammar,” UG) has consequences for what experience is needed and what form people's mature capacities (or “grammars”) will take. This BBS target article discusses the “trigger experience,” that is, the experience that actually affects a child's linguistic development. It is argued that this must be a subset of a child's total linguistic experience and hence that much of what a child hears has no consequence for the form of the eventual grammar. UG filters experience and provides an upper bound on what constitutes the triggering experience. This filtering effect can often be seen in the way linguistic capacity can change between generations. Children only need access to robust structures of minimal (“degree-0”) complexity. Everything can be learned from simple, unembedded “domains” (a grammatical concept involved in defining an expression's logical form). Children do not need access to more complex structures.
TL;DR: It is shown that these two notions of learnability are equivalent and an explicit method is described for directly converting a weak learning algorithm into one that achieves arbitrarily high accuracy.
Abstract: The problem of improving the accuracy of a hypothesis output by a learning algorithm in the distribution-free learning model is considered. A concept class is learnable (or strongly learnable) if, given access to a source of examples from the unknown concept, the learner with high probability is able to output a hypothesis that is correct on all but an arbitrarily small fraction of the instances. The concept class is weakly learnable if the learner can produce a hypothesis that performs only slightly better than random guessing. It is shown that these two notions of learnability are equivalent. An explicit method is described for directly converting a weak learning algorithm into one that achieves arbitrarily high accuracy. This construction may have practical applications as a tool for efficiently converting a mediocre learning algorithm into one that performs extremely well. In addition, the construction has some interesting theoretical consequences.
TL;DR: In this article, the authors address the problem of improving the accuracy of an hypothesis output by a learning algorithm in the distribution-free (PAC) learning model and present a method for converting a weak learning algorithm into one that achieves arbitrarily high accuracy.
Abstract: This paper addresses the problem of improving the accuracy of an hypothesis output by a learning algorithm in the distribution-free (PAC) learning model. A concept class is learnable (or strongly learnable) if, given access to a source of examples of the unknown concept, the learner with high probability is able to output an hypothesis that is correct on all but an arbitrarily small fraction of the instances. The concept class is weakly learnable if the learner can produce an hypothesis that performs only slightly better than random guessing. In this paper, it is shown that these two notions of learnability are equivalent.
A method is described for converting a weak learning algorithm into one that achieves arbitrarily high accuracy. This construction may have practical applications as a tool for efficiently converting a mediocre learning algorithm into one that performs extremely well. In addition, the construction has some interesting theoretical consequences, including a set of general upper bounds on the complexity of any strong learning algorithm as a function of the allowed error ε.
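The abstract does not spell out the conversion method, so the following is only a rough illustration of why combining slightly-better-than-random hypotheses can drive error down; it is not the paper's construction. It assumes three classifiers whose errors are independent, each wrong with probability `eps`, combined by majority vote, and it assumes this step can be nested. The function names are hypothetical.

```python
def majority_error(eps):
    """Error of a majority vote over three classifiers that each err
    independently with probability eps (an assumption of this sketch)."""
    # The vote is wrong when exactly two err, or when all three err.
    return 3 * eps**2 * (1 - eps) + eps**3


def levels_needed(eps, target):
    """Count nested majority-of-three levels until the error is <= target."""
    if not 0 <= eps < 0.5:
        raise ValueError("eps must lie in [0, 0.5) for the vote to help")
    if target <= 0:
        raise ValueError("target must be positive")
    levels = 0
    while eps > target:
        eps = majority_error(eps)
        levels += 1
    return levels


print(majority_error(0.4))       # error after one level of voting
print(levels_needed(0.4, 0.01))  # levels needed to reach 1% error
```

For any `eps` strictly below one half the voted error is strictly smaller than `eps`, which is why the loop terminates. The independence assumption is the weak point of this sketch: a real weak learner's errors are generally correlated, and handling that is where the substance of a boosting construction lies.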
TL;DR: Assume, then, that grammar construction is mediated by principles in such a way that a learner is programmed to search out those data that indicate their manner of implementation, and that Universal Grammar will consist of invariant principles which dictate the form grammars can take.
Abstract: Discussions of the discrepancy between grammatical knowledge and its experiential basis in the primary data of acquisition usually focus on the question of how to account for the attainment of the correct adult grammar. Yet this disparity between knowledge and input data is also apparent in provisional solutions learners arrive at in formulating a grammar. As Lightfoot (1982) points out, of the many logically possible interim solutions that could be derived from the data, only a narrow range is actually attested. Not infrequently these have no obvious model in the target or the first language. It is noncontroversial, by now, that no known inductive learning procedure can satisfactorily account for these salient aspects of acquisition and that innate, specifically linguistic, constraints must mediate the acquisition process. These constraints define the domain for a theory of Universal Grammar. We can conceive of these constraints as grammatical principles. Universal Grammar will consist of invariant principles which dictate the form grammars can take. Besides these, there are other principles admitting of variation. The manner of their implementation gives rise to the typological variation observable in languages. Since typological variation is sharply circumscribed, and since there is no historical evidence that new linguistic types have evolved (Wode 1981), these principles, too, appear to form part of our biological endowment. Assume, then, that grammar construction is mediated by principles in such a way that a learner is programmed to search out those data that indicate their manner of implementation.
TL;DR: Predictions of task-action grammar are tested by a syntax induction experiment in which subjects learn to operate a toy “lost property” computer system by a lexical command language; the results of the experiment are consistent with those predictions.
Abstract: Task-action grammar (TAG), a formal model of the mental representation of task languages, makes predictions about the relative learnability of different command language structures. In particular TAG predicts that consistent structuring of task-action mappings across semantic domains of the task world will facilitate learning, but that consistent structuring within domains that are orthogonal to the semantic organisation of the task world cannot be accommodated within users' mental representations, and so will not help learners. Other models of human-computer interaction either fail to address this distinction, or make quite different predictions. The prediction is tested by a syntax induction experiment in which subjects learn to operate a toy “lost property” computer system by a lexical command language. The results of the experiment are consistent with the predictions of task-action grammar.
TL;DR: This chapter discusses the application of learnability theory to the Rationalism-Empiricism Controversy, and some problems in the Parametric Analysis of Learnability.
Abstract: Introduction: Learnability and Linguistic Theory.- Learning Theory and Natural Language.- The Plausibility of Rationalism.- On Applying Learnability Theory to the Rationalism-Empiricism Controversy.- On Certain Substitutes for Negative Data.- Markedness and Language Development.- Learning the Periphery.- Some Problems in the Parametric Analysis of Learnability.- From Cognition to Thematic Roles: The Projection Principle as an Acquisition Mechanism.- List of Contributors.- Index of Names.- Index of Subjects.
TL;DR: This article considers some of the claims for learnability principles that have been proposed within the first language (L1) context and the problems associated with their application to SLA, and examines four second language (L2) phenomena with respect to a promising variety of learnability theory labeled preemption.
Abstract: Although most of the theoretical work in second language acquisition (SLA) over the years has led to advancement of theories of developing grammars, some recent SLA research has begun to investigate how those grammars are actually learned, relevant to current theories of learnability. This article (a) considers some of the claims for learnability principles that have been proposed within the first language (L1) context and the problems associated with their application to SLA, (b) examines four second language (L2) phenomena with respect to a promising variety of learnability theory labeled preemption, and (c) suggests in what ways research on learnability in SLA can contribute to the further development of learnability theory in general.
TL;DR: The authors argue that if the learner does not have access to negative evidence, then Universal Grammar presumably does not make available choices that can only be resolved by such evidence, and the concern is exclusively with the situation schematized in (1).
Abstract: Much of the recent discussion of language learnability has centered around the absence for the learner of negative evidence and the implications of that absence. The basic argument has been reiterated many times: If the child does not have access to negative evidence — the information that certain structures are not part of the language — then Universal Grammar presumably does not make available choices that can only be resolved by such evidence. (See Chomsky and Lasnik (1977) for early discussion.) In principle, the concern is exclusively with the situation schematized in (1).
TL;DR: It is pointed out that the condition for learnability obtained in [4] is equivalent to the notion of finite metric entropy (which has been studied in other contexts), and some relationships, in addition to those shown in [4], between the VC dimension of a concept class and its metric entropy with respect to various distributions are discussed.
Abstract: In [23], Valiant proposed a formal framework for distribution-free concept learning which has generated a great deal of interest. A fundamental result regarding this framework was proved by Blumer et al. [6] characterizing those concept classes which are learnable in terms of their Vapnik-Chervonenkis (VC) dimension. More recently, Benedek and Itai [4] studied learnability with respect to a fixed probability distribution (a variant of the original distribution-free framework) and proved an analogous result characterizing learnability in this case. They also stated a conjecture regarding learnability for a class of distributions. In this report, we first point out that the condition for learnability obtained in [4] is equivalent to the notion of finite metric entropy (which has been studied in other contexts). Some relationships, in addition to those shown in [4], between the VC dimension of a concept class and its metric entropy with respect to various distributions are then discussed. Finally, we prove some partial results regarding learnability for a class of distributions.
TL;DR: A novel ‘solid learnability’ notion is presented that captures the difference between ‘Guess and Test’ learning algorithms and learnability notions for which consistency with the samples guarantees success.
Abstract: We present a systematic framework for classifying, comparing and defining models of computational learnability. Apart from the obvious ‘uniformity’ parameters we present a novel ‘solid learnability’ notion that captures the difference between ‘Guess and Test’ learning algorithms and learnability notions for which consistency with the samples guarantees success.
TL;DR: The theory of language learning must provide an explanation of how the differences between languages can be learned: since these differences are not universal, they cannot be innate.
Abstract: The theory of language learning finds both a problem area and a source of energy in the tension between similarity and diversity of natural language. On the one hand, different languages show a great underlying similarity of structure, as demonstrated by contemporary advances in linguistic theory. To the extent that properties of different languages are similar, they can (at least as a first hypothesis, subject to further evidence) be taken to be innate. In this respect, the language learning problem is solved. On the other hand, there are clear and systematic differences between different natural languages. Since the field proceeds from the solidly based fact that any normal child can learn any natural language, the theory of language learning must provide an explanation of how the differences in languages can be learned. Since these differences are not universal, they cannot be innate.
TL;DR: There are currently two different linguistically-based approaches to universals in second language acquisition, one stemming from typological universals (Greenberg, 1966) and the other from Chomskyan Universal Grammar.
Abstract: There are currently two different linguistically-based approaches to universals in second language acquisition, one stemming from typological universals (Greenberg, 1966) and the other from Chomskyan Universal Grammar. Associated with each approach is a concept of markedness. Typologists define markedness implicationally; current theories of language learnability define markedness in terms of the Subset Principle. Although coming from very different perspectives, these two definitions of markedness coincide in a number of predictions they make for L1 and L2 acquisition. Similarities and differences between these two approaches to markedness and acquisition are discussed in this paper.
TL;DR: The present work represents an interesting class of countably infinite concepts for which the questions of learnability have been nearly completely characterized, and demonstrates how various proof techniques developed by Pitt and Valiant, Blumer et al., and Pitt and Warmuth can be fruitfully applied in the context of formal languages.
Abstract: We characterize learnability and non-learnability of subsets of N^m called ‘semilinear sets’, with respect to the distribution-free learning model of Valiant. In formal language terms, semilinear sets are exactly the class of ‘letter-counts’ (or Parikh-images) of regular sets. We show that the class of semilinear sets of dimensions 1 and 2 is learnable, when the integers are encoded in unary. We complement this result with negative results of several different sorts, relying on hardness assumptions of varying degrees – from P ≠ NP and RP ≠ NP to the hardness of learning DNF. We show that the minimal consistent concept problem is NP-complete for this class, verifying the non-triviality of our learnability result. We also show that with respect to the binary encoding of integers, the corresponding ‘prediction’ problem is already as hard as that of DNF, for a class of subsets of N^m much simpler than semilinear sets. The present work represents an interesting class of countably infinite concepts for which the questions of learnability have been nearly completely characterized. In doing so, we demonstrate how various proof techniques developed by Pitt and Valiant [15], Blumer et al. [3], and Pitt and Warmuth [17] can be fruitfully applied in the context of formal languages.
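For readers unfamiliar with the concept class, a small membership test may help. This sketch uses the standard definition rather than anything specific to the paper: a linear set is a base vector plus all non-negative integer combinations of finitely many period vectors, and a semilinear set is a finite union of linear sets. The function names and the example set are hypothetical, and the search is a naive one that makes no claim about efficiency.

```python
from functools import lru_cache


def in_linear_set(point, base, periods):
    """Test whether point = base + sum(k_i * periods[i]) with integers k_i >= 0."""
    # Zero vectors contribute nothing and would make the search loop forever.
    periods = [tuple(p) for p in periods if any(p)]
    start = tuple(x - b for x, b in zip(point, base))

    @lru_cache(maxsize=None)
    def reachable(rest, i):
        if any(r < 0 for r in rest):
            return False  # overshot in some coordinate
        if not any(rest):
            return True   # nothing left to account for
        if i == len(periods):
            return False  # no periods left to try
        # Either skip period i, or use it once more and stay on it.
        used = tuple(r - p for r, p in zip(rest, periods[i]))
        return reachable(rest, i + 1) or reachable(used, i)

    return reachable(start, 0)


def in_semilinear_set(point, linear_sets):
    """A semilinear set is a finite union of (base, periods) linear sets."""
    return any(in_linear_set(point, b, ps) for b, ps in linear_sets)


# Hypothetical two-dimensional example with one linear component.
example = [((1, 0), [(2, 1), (0, 3)])]
print(in_semilinear_set((3, 4), example))  # (1,0) + (2,1) + (0,3)
print(in_semilinear_set((2, 0), example))
```

Because every retained period has non-negative entries and at least one positive entry, each use strictly shrinks the remainder, so the recursion terminates.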
TL;DR: Applications of the directed dialogue that have identified design choices which build learnability and usability into a product's user-interface are discussed.
Abstract: The development of an interface design tool called “directed dialogue protocols” is discussed. The tool is based upon Kato's (1986) method of verbal data collection, “question-asking protocols.” Three extensions to the question-asking method are detailed: 1) an experimental procedure of atomic tasks which facilitate the quantization of verbal data; 2) interventions by the experimenter that probe the subject's expectations and prompt verbalizations; and 3) a technique for answering subject queries called sequential disclosure. Also discussed are applications of the directed dialogue that have identified design choices which build learnability and usability into a product's user-interface.
TL;DR: In this article, it is shown that an effective procedure exists for determining whether a given categorical relation is decomposable into a tree of binary relations and, if the answer is positive, for identifying the topology of such a tree.
Abstract: This paper summarizes several investigations into the prospects of identifying meaningful structures in empirical data. Starting with an early work on identifying probabilistic trees, we extend the method to polytrees (directed trees with arbitrary edge orientation) and show that, under certain conditions, the skeleton of the polytree as well as the orientation of some of the arrows, are identifiable. We next address the problem of identifying probabilistic trees in which some of the nodes are unobservable. It is shown that such trees can be effectively identified in cases where all variables are either bi-valued or normal, and where all correlation coefficients are known precisely. Finally, it is shown that an effective procedure exists for determining whether a given categorical relation is decomposable into a tree of binary relations and, if the answer is positive, identifying the topology of such a tree. Guided by these results, we then propose a general framework whereby the notion of identifiability is given a precise formal definition, similar to that of learnability.
TL;DR: Upper and lower bounds on the VC dimension of sparse univariate polynomials over reals are proved, and these results are applied to prove uniform learnability of sparse polynomials and rational functions.
Abstract: We prove upper and lower bounds on the VC dimension of sparse univariate polynomials over reals, and apply these results to prove uniform learnability of sparse polynomials and rational functions. As another application we solve an open problem of Vapnik ([Vapnik 82]) on uniform approximation of the general regression functions, a central problem of computational statistics (cf. [Vapnik 82], p. 256).
TL;DR: Formal learning theory studies the learnability of different classes of formal objects under different formal models of learning to determine the class that can be acquired to the level of the specified success criterion by a learner implementing the specified strategy in the specified environment.
Abstract: Formal learning theory, as the name suggests, studies the learnability of different classes of formal objects (languages, grammars, theories, etc.) under different formal models of learning. The specification of such a model, which specifies (a) a learning environment, (b) a learning strategy, and (c) a criterion for successful learning, determines (d) a class of formal objects, namely, the class that can be acquired to the level of the specified success criterion by a learner implementing the specified strategy in the specified environment.
TL;DR: A distinction is explored between two theories of language acquisition: one based on universal grammar (UG), which has been widely examined in both first and second language acquisition, and one based on learnability theory (LT).
Abstract: It is now relatively uncontroversial that innate principles must be involved in language acquisition, but a crucial issue is the nature of these innate principles, in particular whether they are formulated as constraints on language or constraints on learning, or both (Wexler and Manzini 1987), and whether they are specific to language or can be related to other cognitive domains (O'Grady 1987). That is, are the acquisition mechanisms formulated in terms of linguistic principles, or are they formulated in terms of learning principles which are used to acquire a linguistic system? In either case the system acquired is linguistic, but the principles used to acquire it are not necessarily so, or may be partially so. In this paper I would like to explore a distinction between two theories of language acquisition, one which is based on universal grammar (UG), and one which is based on learnability theory (LT). The theory of universal grammar has been widely examined in both first and second language acquisition, whereas learnability theory is a relatively recent area of study which is being developed both within UG and from other theoretical perspectives (Wexler and Manzini 1987, Pinker 1984, O'Grady 1987, Rumelhart and McClellan 1987). For purposes of clarity, I would like to make a distinction between them as potentially different explanations for the