TL;DR: The Gradual Learning Algorithm (GLA) is a constraint-ranking algorithm for learning optimality-theoretic grammars; it can learn free variation, deal effectively with noisy learning data, and account for gradient well-formedness judgments.
Abstract: The Gradual Learning Algorithm (Boersma 1997) is a constraint-ranking algorithm for learning optimality-theoretic grammars. The purpose of this article is to assess the capabilities of the Gradual Learning Algorithm, particularly in comparison with the Constraint Demotion algorithm of Tesar and Smolensky (1993, 1996, 1998, 2000), which initiated the learnability research program for Optimality Theory. We argue that the Gradual Learning Algorithm has a number of special advantages: it can learn free variation, deal effectively with noisy learning data, and account for gradient well-formedness judgments. The case studies we examine involve Ilokano reduplication and metathesis, Finnish genitive plurals, and the distribution of English light and dark /l/.
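As a rough illustration of how a gradual, error-driven ranking learner of this kind can work, here is a minimal sketch. The abstract does not spell out the update rule, so the details are assumptions made for illustration: real-valued ranking values, Gaussian evaluation noise, a fixed `plasticity` step, and the function names `evaluate` and `gla_update` are all hypothetical.

```python
import random

def evaluate(ranking, candidates, noise=2.0, rng=random):
    """Pick the winning candidate under one noisy ranking (assumed Gaussian noise)."""
    # Perturb every ranking value, then order constraints from highest to lowest.
    noisy = {c: r + rng.gauss(0.0, noise) for c, r in ranking.items()}
    order = sorted(noisy, key=noisy.get, reverse=True)
    # Strict domination: compare violation profiles lexicographically.
    return min(candidates, key=lambda form: [candidates[form].get(c, 0) for c in order])

def gla_update(ranking, candidates, observed, produced, plasticity=0.1):
    """Return adjusted ranking values after the learner's output mismatches the datum."""
    new = dict(ranking)
    if observed == produced:
        return new  # no error, no change
    for c in ranking:
        obs_v = candidates[observed].get(c, 0)
        prod_v = candidates[produced].get(c, 0)
        if obs_v > prod_v:
            new[c] -= plasticity  # constraint penalizes the observed form: demote slightly
        elif prod_v > obs_v:
            new[c] += plasticity  # constraint penalizes the learner's error: promote slightly
    return new
```

Because each update moves ranking values by only a small step, occasional noisy data nudge the grammar without overturning it, and constraints whose values end up close together yield variable outputs under noisy evaluation.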
TL;DR: This work proposes a theory of phonotactic grammars and a learning algorithm that constructs such grammars from positive evidence, then applies the model in a variety of learning simulations, showing that the learned grammars capture the distributional generalizations of the languages studied and accurately predict the findings of a phonotactic experiment.
Abstract: The study of phonotactics is a central topic in phonology. We propose a theory of phonotactic grammars and a learning algorithm that constructs such grammars from positive evidence. Our grammars consist of constraints that are assigned numerical weights according to the principle of maximum entropy. The grammars assess possible words on the basis of the weighted sum of their constraint violations. The learning algorithm yields grammars that can capture both categorical and gradient phonotactic patterns. The algorithm is not provided with constraints in advance, but uses its own resources to form constraints and weight them. A baseline model, in which Universal Grammar is reduced to a feature set and an SPE-style constraint format, suffices to learn many phonotactic phenomena. In order for the model to learn nonlocal phenomena such as stress and vowel harmony, it must be augmented with autosegmental tiers and metrical grids. Our results thus offer novel, learning-theoretic support for such representations. We apply the model in a variety of learning simulations, showing that the learned grammars capture the distributional generalizations of these languages and accurately predict the findings of a phonotactic experiment.
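The scoring step described above (a weighted sum of constraint violations) can be sketched as follows. This is an illustrative sketch only: the function names, the toy constraint weights, and the normalization over a small finite candidate set are assumptions, not the authors' implementation.

```python
import math

def score(weights, violations):
    """Weighted sum of constraint violations; a higher score means a worse form."""
    return sum(w * v for w, v in zip(weights, violations))

def maxent_probabilities(weights, candidates):
    """Map each form to a probability proportional to exp(-score).

    `candidates` maps a form to its list of violation counts, one per constraint.
    Normalizing over a finite candidate set is a simplification for illustration.
    """
    unnormalized = {form: math.exp(-score(weights, viols))
                    for form, viols in candidates.items()}
    z = sum(unnormalized.values())
    return {form: u / z for form, u in unnormalized.items()}

# Hypothetical toy grammar: two constraints with weights 2.0 and 1.0.
toy_weights = [2.0, 1.0]
toy_candidates = {"a": [0, 0], "b": [1, 0], "c": [0, 1]}
toy_probs = maxent_probabilities(toy_weights, toy_candidates)
```

In this toy case the violation-free form `a` receives the highest probability, and `b`, which violates the more heavily weighted constraint, receives the lowest, which is how weights can express gradient rather than purely categorical well-formedness.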
TL;DR: The experimental studies in this thesis demonstrate that an experimental investigation of gradient phenomena can advance linguistic theory by uncovering acceptability distinctions that have gone unnoticed in the theoretical literature, and identify a set of general properties of gradient data that seem to be valid for a wide range of syntactic phenomena and across languages.
Abstract: This thesis deals with gradience in grammar, i.e., with the fact that some linguistic structures are not fully acceptable or unacceptable, but receive gradient linguistic judgments. The importance of gradient data for linguistic theory has been recognized at least since Chomsky’s Logical Structure of Linguistic Theory. However, systematic empirical studies of gradience are largely absent, and none of the major theoretical frameworks is designed to account for gradient data. The present thesis addresses both questions. In the experimental part of the thesis (Chapters 3–5), we present a set of magnitude estimation experiments investigating gradience in grammar. The experiments deal with unaccusativity/unergativity, extraction, binding, word order, and gapping. They cover all major modules of syntactic theory, and draw on data from three languages (English, German, and Greek). In the theoretical part of the thesis (Chapters 6 and 7), we use these experimental results to motivate a model of gradience in grammar. This model is a variant of Optimality Theory, and explains gradience in terms of the competition of ranked, violable linguistic constraints. The experimental studies in this thesis deliver two main results. First, they demonstrate that an experimental investigation of gradient phenomena can advance linguistic theory by uncovering acceptability distinctions that have gone unnoticed in the theoretical literature. An experimental approach can also settle data disputes that result from the informal data collection techniques typically employed in theoretical linguistics, which are not well-suited to investigate the behavior of gradient linguistic data. Second, we identify a set of general properties of gradient data that seem to be valid for a wide range of syntactic phenomena and across languages. (a) Linguistic constraints are ranked, in the sense that some constraint violations lead to a greater degree of unacceptability than others.
(b) Constraint violations are cumulative, i.e., the degree of unacceptability of a structure increases with the number of constraints it violates. (c) Two constraint types can be distinguished experimentally: soft constraints lead to mild unacceptability when violated, while hard constraint violations trigger serious unacceptability. (d) The hard/soft distinction can be diagnosed by testing for effects from the linguistic context; context effects only occur for soft constraints; hard constraints are immune to contextual variation. (e) The soft/hard distinction is crosslinguistically stable. In the theoretical part of the thesis, we develop a model of gradient grammaticality that borrows central concepts from Optimality Theory, a competition-based grammatical framework. We propose an extension, Linear Optimality Theory, motivated by our experimental results on constraint ranking and the cumulativity of violations. The core assumption of our
TL;DR: This paper proposes a constraint ranking algorithm, based closely on Tesar and Smolensky's Constraint Demotion, that mimics the early, "phonotactics only" form of learning seen in infants.
Abstract: Recent experimental work indicates that by the age of ten months, infants have already learned a great deal about the phonotactics (legal sounds and sound sequences) of their language. This learning occurs before infants can utter words or apprehend most phonological alternations. I will show that this early learning stage can be straightforwardly modeled with Optimality Theory. Specifically, the Markedness and Faithfulness constraints can be ranked so as to characterize the phonotactics, even when no information about morphology or phonological alternations is yet available. I will also show how later on, the information acquired in infancy can help the child in coming to grips with the alternation pattern. I also propose a procedure for undoing the learning errors that are likely to occur at the earliest stages. There are two specific formal proposals. One is a constraint ranking algorithm, based closely on Tesar and Smolensky’s Constraint Demotion, which mimics the early, “phonotactics only” form of learning seen in infants. I illustrate the algorithm’s effectiveness by having it learn the phonotactic pattern of a simplified language modeled on Korean. The other proposal is that there are three distinct default rankings for phonological constraints: low for ordinary Faithfulness (used in learning phonotactics); low for Faithfulness to adult forms (in the child’s own production system); and high for output-to-output correspondence constraints.
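For orientation, the basic demotion step of Constraint Demotion, on which the proposed algorithm is said to be closely based, can be sketched roughly as below. This shows only the generic demotion idea under assumed data structures (a dict mapping each constraint to a stratum number, with 0 as the top stratum); it does not implement the paper's own variant or its default rankings, and the name `demote` is hypothetical.

```python
def demote(strata, winner, loser):
    """Apply one demotion step for a winner/loser pair and return new strata.

    `strata` maps constraint name -> stratum index (0 = highest ranked).
    `winner` and `loser` map constraint name -> violation count.
    """
    prefer_winner = [c for c in strata if loser.get(c, 0) > winner.get(c, 0)]
    prefer_loser = [c for c in strata if winner.get(c, 0) > loser.get(c, 0)]
    if not prefer_winner:
        raise ValueError("no constraint prefers the winner for this pair")
    # Highest-ranked constraint that favors the winner.
    top = min(strata[c] for c in prefer_winner)
    new = dict(strata)
    for c in prefer_loser:
        # Demote loser-preferring constraints to just below that constraint.
        if new[c] <= top:
            new[c] = top + 1
    return new
```

For example, starting with all constraints in stratum 0, a winner that violates only a markedness constraint and a loser that violates only a faithfulness constraint would push the markedness constraint down to stratum 1, leaving the others in place.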