TL;DR: This book provides a critical review of recent theories of semantics-syntax correspondences and makes new proposals for constraints on semantic structure relevant to syntax.
Abstract: This book provides a critical review of recent theories of semantics-syntax correspondences and makes new proposals for constraints on semantic structure relevant to syntax. Data from several languages are presented which suggest that semantic structure in root morphemes is subject to parametric variation which has effect across a variety of verb classes, including locatives, unaccusatives, and psych verbs. The implications for first and second language acquisition are discussed. In particular, it is suggested that different parametric settings may lead to a learnability problem if adult learners do not retain access to sensitivity to underlying semantic organization and morphological differences between languages provided by Universal Grammar. An experiment with Chinese-speaking learners of English is presented which shows that learners initially transfer L1 semantic organization to the L2, but are able to retreat from overgeneralisations and achieve native-like grammars in this area. Suggestions for further research in this rapidly developing area of theory and acquisition research are also made.
TL;DR: This paper examined the hypotheses about how print represents the speech that preliterate children select when they receive input compatible with several such hypotheses, and found that most preliterate children fail to select phonologically based hypotheses, even when these are available in the input.
Abstract: This research examines the hypotheses about how print represents the speech that preliterate children select when they receive input compatible with several such hypotheses. In Experiment 1, preschoolers were taught to read hat and hats and book and books. Then, in generalization tests, they were probed for what they had learned about the letter s. All of the children were able to transfer to other plurals (e.g., to decide that bikes said “bikes” rather than “bike,” and that dog said “dog” and not “dogs”), but only those who knew the sound of the letter s prior to the experiment were able to decide, for example, that bus said “bus” and not “bug.” The failure to detect the phonemic value of s on the part of alphabetically naive children was replicated in Experiments 2, 3, and 4, which instituted a variety of controls. In Experiment 5, it was found that, although preschoolers who had been taught to read pairs of words distinguished by the comparative affix er (such as small/smaller) were able to generalize to other comparatives (e.g., mean/meaner), they could not generalize to pairs where er had no morphemic value (e.g., corn/corner). A similar failure by alphabetically naive children to detect the syllabic, as compared with the morphemic, status of the superlative affix est was found in Experiment 6. Overall, the results indicate that most preliterate children fail to select phonologically based hypotheses, even when these are available in the input. Instead, they focus on morphophonology and/or semantic aspects of words' referents. The research is couched in terms of the Learnability Theory (LT) (Gold, 1967), which provides a convenient framework for considering a series of interrelated questions about the acquisition of literacy. 
In particular, it is argued that if the data available to the child includes the pronunciation of written words, the alphabetic principle may be unlearnable, given the hypothesis selection procedures identified in these experiments.
TL;DR: This paper formalized a number of conceptions of triggers with the aim of characterizing convergence of a simple learning algorithm by their presence, and demonstrated how slight changes in linguistic analysis can significantly affect the (non)existence of triggers, looking closely at Gibson and Wexler's word order/V2 parameter space.
Abstract: A major question facing the study of language learnability lies in determining what properties a class of grammars must possess to allow learning to take place. This article continues the inquiry begun by Gibson and Wexler (1994), studying the importance of triggers, sentences that reveal the settings of parameters of grammatical variation. We formalize a number of conceptions of triggers with the aim of characterizing convergence of a simple learning algorithm by their presence. Then we demonstrate how slight changes in linguistic analysis can significantly affect the (non)existence of triggers, looking closely at Gibson and Wexler's word order/V2 parameter space.
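The role of triggers in the convergence of a simple learner can be sketched in code. The sketch below is only an illustration in the spirit of the error-driven, one-parameter-at-a-time learner associated with Gibson and Wexler (1994); the abstract does not spell the algorithm out, so the function name `tla`, the toy `LANGUAGES` table, and the sentence labels are all hypothetical and are not the word order/V2 space discussed in the article.

```python
import random

# Hypothetical toy parameter space: each setting of two binary parameters
# licenses a small set of sentence patterns. These sets are invented for
# illustration and are NOT Gibson and Wexler's word order/V2 space.
LANGUAGES = {
    (0, 0): {"a", "b"},
    (0, 1): {"a", "c"},
    (1, 0): {"b", "d"},
    (1, 1): {"c", "d"},
}


def tla(target, start, steps=500, seed=0):
    """Run an error-driven, one-parameter-at-a-time learner for `steps` inputs."""
    rng = random.Random(seed)
    current = start
    target_sentences = sorted(LANGUAGES[target])
    for _ in range(steps):
        sentence = rng.choice(target_sentences)
        if sentence in LANGUAGES[current]:
            continue  # error-driven: keep the grammar if it parses the input
        # Single-value step: flip exactly one randomly chosen parameter.
        i = rng.randrange(len(current))
        candidate = tuple(v ^ 1 if j == i else v for j, v in enumerate(current))
        # Greedy step: adopt the candidate only if it parses the input.
        if sentence in LANGUAGES[candidate]:
            current = candidate
    return current


final = tla(target=(1, 1), start=(0, 0))
```

In this toy space every non-target grammar is one parameter flip away from a grammar that parses some target sentence it currently rejects, so the learner reaches the target with probability approaching one. Removing such sentences from the table is a quick way to see how the absence of triggers can leave the learner stuck.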
TL;DR: This work investigates the learnability of a typical description logic, Classic, and shows that Classic sentences are learnable in polynomial time in the exact learning model using equivalence queries and membership queries.
Abstract: Description logics, also called terminological logics, are commonly used in knowledge-based systems to describe objects and their relationships. We investigate the learnability of a typical description logic, Classic, and show that Classic sentences are learnable in polynomial time in the exact learning model using equivalence queries and membership queries (which are in essence, “subsumption queries”—we show a prediction hardness result for the more traditional membership queries that convey information about specific individuals). We show that membership queries alone are insufficient for polynomial time learning of Classic sentences. Combined with earlier negative results (Cohen & Hirsh, 1994a) showing that, given standard complexity theoretic assumptions, equivalence queries alone are insufficient (or random examples alone in the PAC setting are insufficient), this shows that both sources of information are necessary for efficient learning in that neither type alone is sufficient. In addition, we show that a modification of the algorithm deals robustly with persistent malicious two-sided classification noise in the membership queries with the probability of a misclassification bounded below 1/2. Other extensions are considered.
TL;DR: This edited volume on phonological competence and acquisition, introduced by J. Archibald, collects chapters on topics including global determinacy and learnability in phonology, segmental elaboration, the acquisition of negative constraints, stress, and tonal systems, phonological disorders, and second language prosody.
Abstract: Contents: Preface. J. Archibald, Introduction: Phonological Competence. B.E. Dresher, H. van der Hulst, Global Determinacy and Learnability in Phonology. K. Rice, P. Avery, Variability in a Deterministic Model of Language Acquisition: A Theory of Segmental Elaboration. E.J. Fee, Segments and Syllables in Early Language Acquisition. D. Ingram, The Acquisition of Negative Constraints, the OCP and Underspecified Representations. J. Archibald, The Acquisition of Stress. K. Demuth, The Acquisition of Tonal Systems. D.A. Dinnsen, S.B. Chin, On the Natural Domain of Phonological Disorders. E. Broselow, H-B. Park, Mora Conservation in Second Language Prosody. T. Scovel, Differentiation, Recognition and Identification in the Discrimination of Foreign Accents.
TL;DR: The authors study the limitations of the multiplicity automata method and prove that this method cannot be used to resolve the learnability of some other open problems, such as the learnability of general DNF formulae or even k-term DNF for k = ω(log n) or satisfy-s DNF formulae for s = ω(1).
Abstract: The learnability of multiplicity automata has attracted a lot of attention, mainly because of its implications on the learnability of several classes of DNF formulae. The authors further study the learnability of multiplicity automata. The starting point is a known theorem from automata theory relating the number of states in a minimal multiplicity automaton for a function f to the rank of a certain matrix F. With this theorem in hand they obtain the following results: a new simple algorithm for learning multiplicity automata with a better query complexity. As a result, they improve the complexity for all classes that use the algorithms of Bergadano and Varricchio (1994) and Ohnishi et al. (1994) and also obtain the best query complexity for several classes known to be learnable by other methods such as decision trees and polynomials over GF(2). They prove the learnability of some new classes that were not known to be learnable before. Most notably, the class of polynomials over finite fields, the class of bounded-degree polynomials over infinite fields, the class of XOR of terms, and a certain class of decision trees. While multiplicity automata were shown to be useful to prove the learnability of some subclasses of DNF formulae and various other classes, they study the limitations of this method. They prove that this method cannot be used to resolve the learnability of some other open problems such as the learnability of general DNF formulae or even k-term DNF for k = ω(log n) or satisfy-s DNF formulae for s = ω(1). These results are proven by exhibiting functions in the above classes that require multiplicity automata with superpolynomial number of states.
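The link between automaton size and matrix rank can be made concrete with a small sketch. It assumes that the "certain matrix F" is the Hankel matrix of f, whose entry at row x and column y is f(xy); the abstract does not name the matrix, so this reading is an assumption, as are all function names below. Only a finite block of the (infinite) matrix is built, so the computed rank is a lower bound on the true rank.

```python
from fractions import Fraction
from itertools import product


def strings_up_to(alphabet, max_len):
    """All strings over `alphabet` of length 0..max_len."""
    return ["".join(p) for n in range(max_len + 1) for p in product(alphabet, repeat=n)]


def hankel_block(f, alphabet, max_len):
    """Finite block of the Hankel matrix: entry (x, y) is f(x + y)."""
    words = strings_up_to(alphabet, max_len)
    return [[Fraction(f(x + y)) for y in words] for x in words]


def rank(matrix):
    """Rank over the rationals, computed by exact Gaussian elimination."""
    rows = [row[:] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def count_a(w):
    """Illustrative target function: the number of occurrences of 'a' in w."""
    return w.count("a")


block = hankel_block(count_a, "ab", 2)
block_rank = rank(block)
```

For this illustrative function every entry is the sum of a term depending only on the row and a term depending only on the column, so the block has rank 2, matching the two states a multiplicity automaton needs to count occurrences of a letter.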
TL;DR: This research discusses how the ITS Authoring system approaches overall goals in terms of the four functional components of ITSs: the learning environment, the domain model, the teaching model, and the student model.
Abstract: While intelligent tutoring systems are becoming more common and proving to be increasingly effective, each one must still be built from scratch at a significant cost. ITS Authoring tools, developed to address this issue, have a variety of purposes and intended users, and their design must account for tradeoffs among four overall goals: scope, depth, learnability, and productivity. We discussed how our system approaches these overall goals in terms of the four functional components of ITSs: the learning environment, the domain model, the teaching model, and the student model. Our research tries to find a balance among the goals which will yield a high level of all four, or at least investigate the possibility of excelling in all four areas.
TL;DR: It is shown here that--contrary to much skepticism in contemporary linguistic theories--analogy, induction, distribution and association are powerful sources of information and can, in concert with general cognitive innate knowledge, but without language-specific innate knowledge, learn important properties of language that are thought to be innate by many.
Abstract: This dissertation is concerned with how ambiguity and ambiguity resolution are learned, that is, with the acquisition of the different representations of ambiguous linguistic forms and the knowledge necessary for selecting among them in context. Despite much separate research on acquisition and ambiguity, there is little work on how successful acquisition is possible for ambiguous forms.
The dissertation presents three models of ambiguity acquisition. In TAGSPACE, a model of syntactic categorization, unsupervised classification groups tokens into syntactic categories according to distributional similarity. An evaluation on the Brown corpus demonstrates successful acquisition of the major syntactic categories of English. In WORDSPACE, a model of semantic categorization, semantic categories are induced by unsupervised classification of associational representations of contexts. The model achieves high accuracy when evaluated on a set of ambiguous words in a New York Times corpus. The third acquisition problem addressed is subcategorization. A connectionist model predicts subcategorization frames from lexicosemantic representations. The model learns to form internal verb representations depending on context (i.e. disambiguate) and to generalize the subcategorization behavior of 178 English verbs in the Dative Alternation class.
Ambiguity is important for theories of linguistic representation. Disambiguation is difficult with symbolic representations since all-or-none criteria like grammaticality or soundness eliminate few readings. In contrast, disambiguation in both TAGSPACE and WORDSPACE relies crucially on proximity relations that can only be modeled by gradient representations. In subcategorization learning, gradience solves the transition problem, the question of how the child makes the gradual transition from a state with little knowledge to adult performance.
Ambiguity is also relevant for linguistic innateness. Innate categories do not explain learnability without an account of how they are grounded in perception. The grounding problem is hard for ambiguous forms with their many possible groundings. It is shown here that--contrary to much skepticism in contemporary linguistic theories--analogy, induction, distribution and association are powerful sources of information and can, in concert with general cognitive innate knowledge, but without language-specific innate knowledge, learn important properties of language that are thought to be innate by many.
TL;DR: Deaf college students' knowledge of English wh-question formation in the context of government-binding theory and an associated learnability theory is explored, revealing that, despite years of exposure to English language input, many deaf learners have not internalized the positive evidence required to set the marked values of the wh-question parameters.
Abstract: This article explores deaf college students’ knowledge of English wh-question formation in the context of government-binding theory and an associated learnability theory. The parameters of universa...
TL;DR: An information-theoretic characterization of k-RFA learnability is developed upon which a general tool for proving hardness results is built, and it is shown that, unlike the PAC model, weak learning does not imply strong learning in the k-RFA model.
Abstract: In the k-Restricted-Focus-of-Attention (k-RFA) model, only k of the n attributes of each example are revealed to the learner, although the set of visible attributes in each example is determined by the learner. While the k-RFA model is a natural extension of the PAC model, there are also significant differences. For example, it was previously known that learnability in this model is not characterized by the VC-dimension and that many PAC learning algorithms are not applicable in the k-RFA setting.
In this paper we further explore the relationship between the PAC and k-RFA models, with several interesting results. First, we develop an information-theoretic characterization of k-RFA learnability upon which we build a general tool for proving hardness results. We then apply this and other new techniques for studying RFA learning to two particularly expressive function classes, k-decision-lists (k-DL) and k-TOP, the class of thresholds of parity functions in which each parity function takes at most k inputs. Among other results, we prove a hardness result for k-RFA learnability of k-DL, k ≤ n-2. In sharp contrast, an (n-1)-RFA algorithm for learning (n-1)-DL is presented. Similarly, we prove that 1-DL is learnable if and only if at least half of the inputs are visible in each instance. In addition, we show that there is a uniform-distribution k-RFA learning algorithm for the class of k-DL. For k-TOP we show weak learnability by a k-RFA algorithm (with efficient time and sample complexity for constant k) and strong uniform-distribution k-RFA learnability of k-TOP with efficient sample complexity for constant k. Finally, by combining some of our k-DL and k-TOP results, we show that, unlike the PAC model, weak learning does not imply strong learning in the k-RFA model.
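The access restriction that defines the k-RFA model can be sketched as an example oracle. This is a minimal sketch under stated assumptions: the class name `RFAOracle`, the uniform distribution over Boolean examples, and the requirement that the learner names exactly k indices are illustrative choices, not details taken from the paper.

```python
import random


class RFAOracle:
    """Hypothetical oracle revealing only the k attributes the learner asks for."""

    def __init__(self, target, n, k, seed=0):
        self.target = target  # hidden labelling function on full examples
        self.n = n            # total number of attributes
        self.k = k            # number of attributes visible per example
        self.rng = random.Random(seed)

    def example(self, focus):
        """Draw a random example; reveal the attributes in `focus` plus the label."""
        focus = set(focus)
        if len(focus) != self.k or not focus <= set(range(self.n)):
            raise ValueError("focus must contain exactly k valid attribute indices")
        # Assumption: examples are uniform over {0, 1}^n.
        x = [self.rng.randint(0, 1) for _ in range(self.n)]
        visible = {i: x[i] for i in sorted(focus)}
        return visible, self.target(x)


# Usage: the learner may change its focus from one example to the next.
oracle = RFAOracle(target=lambda x: x[0] ^ x[3], n=5, k=2)
view, label = oracle.example({0, 3})
```

The learner never sees a full example, which is why algorithms that need joint statistics over more than k attributes do not carry over from the PAC setting.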
TL;DR: This paper investigated the applicability of the Subset Principle in the second language (L2) acquisition of the Oblique-Case Parameter by 45 learners of French and found that learners acquired the lack of Exceptional-Case marking and preposition stranding, two of the syntactic properties tested, based on the positive evidence available to them.
Abstract: This study investigates the applicability of the Subset Principle in the second language (L2) acquisition of the Oblique-Case Parameter by 45 learners of French. First, the Subset Principle is defined and discussed, along with its learnability predictions in first language (L1) acquisition. Then, a brief overview of the relevant literature in L2 acquisition shows that the applicability of the Subset Principle is very much debated. In the present study, the results of a grammaticality judgment task and a correction task provide partial support for the Subset Principle. It seems that the learners have acquired the lack of Exceptional-Case marking and preposition stranding, two of the syntactic properties tested, based on the positive evidence available to them. However, they failed to reject a number of ungrammatical instances of dative alternation and dative passive, leading them to an overgeneralized grammar. It is suggested that L2 learners may need direct or indirect negative evidence to constrain their grammar. Further research is needed to conclude whether the Oblique-Case Parameter really is a parameter of Universal Grammar, and if so, whether adult L2 learners are able to reset their parameters to the proper target language values.
TL;DR: This paper studies the learnability of branching programs and small-depth circuits with modular and threshold gates in both the exact and PAC learning models, with and without membership queries; the results extend earlier work and exhibit further applications of multiplicity automata in learning theory.
Abstract: We study the learnability of branching programs and small-depth circuits with modular and threshold gates in both the exact and PAC learning models with and without membership queries. Our results extend earlier works [11, 18, 15] and exhibit further applications of multiplicity automata [7] in learning theory.
TL;DR: This article assesses the benefits and limitations of the Layered Protocols (LP) model for the analysis and design of user interfaces in the field of consumer electronics by comparing the user interface of an existing digital audio recorder, which is only partly in line with the LP model, with an interface designed according to the model.
Abstract: An assessment is presented of the benefits and limitations of the Layered Protocols (LP) model for the analysis and design of user interfaces in the field of consumer electronics. In the assessment a user interface of an existing digital audio recorder which is only partly in line with the LP model is compared with an interface designed according to the model. The observed differences in usability between the two interfaces are mainly caused by deviations from the LP model. It turned out that especially the learnability of an interface is positively influenced by a layered organization of user-system interaction in combination with high-quality E- and I-feedback and optimum similarity between interaction protocols. © 1996 Academic Press Limited. User-system interaction models and user interface design guidelines can be useful tools for designing better conceived and more usable interfaces (Marshall, Nelson & Gardiner, 1987; HUSAT, 1988). They can be of help in structuring the interaction between the user and the system and in explaining the role of the information exchanged. Ways of increasing usability are indicated on the basis of implicitly or explicitly incorporated cognitive principles. In this way, validated interaction models and guidelines can help in reaching a basic level of usability while diminishing the need for extensive user testing. In this paper the validity of a specific user-system interaction model is assessed: the Layered Protocols (LP) model (Taylor, 1988a, 1992). The assessment is carried out in the field of consumer electronics: digital audio recorders. The application area of consumer electronics differs from the professional setting with respect to training. Users operating a digital audio recorder have usually had no formal instruction or coaching. Therefore, more weight is attached to learnability than to efficiency as a usability criterion (Nielsen, 1993).
The LP model is based upon the cognitive principle that humans use superimposed layers of abstraction in perception and performance. From this principle the LP model arrives at an architecture for structuring user-system interaction. The model explicitly addresses the user’s and the system’s contribution to the interaction and the way in which they relate to each other. In this paper guidelines indicating potential usability consequences are derived from the model. That way the model can be validated. The LP model was chosen because it seemed somewhat more applicable in consumer electronics than some alternative models. Other models like Cognitive
TL;DR: In this paper, a formal theory of the development of semantic behavior is presented, where the author takes as the starting point the acquisition of quantifier denotations and describes an algorithm that learns the first-order quantifiers.
Abstract: Learning First Order Quantifier Denotations: An Essay in Semantic Learnability. Robin Clark, Department of Linguistics, University of Pennsylvania, Philadelphia, PA. rclark@babel.ling.upenn.edu. Introduction. This paper is a first attempt at a formal theory of the development of semantic behavior. We will take as our starting point the acquisition of quantifier denotations. This might at first blush seem a curious point of departure. The past two decades have, however, seen an explosive growth in the study of quantifiers, and the theory of generalized quantifiers is now sufficiently mature that its learnability properties can be studied. As we shall see, an algorithm exists that will learn the first-order quantifiers; higher order quantifiers like most will present a greater challenge, however. It should come as no surprise that the learnability properties of quantified expressions have not been the subject of widespread investigation. What, after all, do quantified expressions refer to? The question has been a matter of debate since at least the scholastic philosophers of the middle ages, who took it as problem of the reference of general and particular terms (Loux). Consider a recent example of the discussion: If a class were taken as consisting of its members, there could be no place for a null class in logic; when nothing or no man stands as a grammatical subject, it is ridiculous to ask what it refers to. Although it might seem sensible to ask which portion of the class of men is constituted by the men referred to as all men or some men, we may be led to doubt the legitimacy of this question if we once think of comparing the adjectival uses of all, some, no, and alone: all men laugh, some men laugh, no men laugh, men alone laugh; we see that none of these has the role of marking out part of a class
TL;DR: A novel use of the TAG (Task-Action Grammar) method as an evaluation tool to identify learnability problems in an interface prior to user testing finds that the TAG assessment method revealed problems related to inconsistency and conceptual complexity in a design.
Abstract: Many approaches to evaluation of user interfaces in HCI exist. Most methodologies evaluate prototypes through user testing. Since user testing is expensive and time consuming, other methods are sometimes applied before user testing. These other approaches aim to identify problems through expert evaluation techniques. In this paper we present a novel use of the TAG (Task-Action Grammar) method as an evaluation tool to identify learnability problems in an interface prior to user testing. We find that the TAG assessment method revealed problems related to inconsistency and conceptual complexity in a design.
TL;DR: The Web StoryBase is described, a system using HTML forms technology to collect and share stories and story annotations from users of the World Wide Web, and usage data collected over a period of 26 weeks is analysed.
Abstract: We describe the Web StoryBase, a system using HTML forms technology to collect and share stories and story annotations from users of the World Wide Web. We analyse usage data collected over a period of 26 weeks, from the perspective of how the system was advertised, contributed to, and browsed. We also discuss several themes extracted from the reported Web experiences: usability, learnability, diversity, communication, just-in-time information, capture and fun.
TL;DR: This work investigates the learnability, under the uniform distribution, of neural concepts that can be represented as simple combinations of nonoverlapping perceptrons (also called μ-perceptrons) with binary weights and arbitrary thresholds.
TL;DR: Evidence is provided that the students could learn to use the software in a short period of time and that they could undertake work of at least potential educational value with the tool.
Abstract: This research is about the design, development and testing of a semiquantitative computer modelling tool called LinkIt.
The aim of the software is to provide secondary students with a computational environment where they can think at a system level about models and the modelling process by expressing and testing their own ideas about phenomena without having to pay attention to the analytical relations between the variables involved.
The research involved two exploratory studies using two different versions of the software. These studies were carried out in Rio de Janeiro - Brazil with students aged 13-18 years old. During the studies, the students worked in pairs and used the computer tool to perform the expressive and exploratory tasks presented to them. The interviews were tape-recorded, some were also video-recorded and the models used and created by the students together with the steps to create them were saved and used for later analysis.
The design of the first version of the software - LinkIt I - was based on an analysis of another computer modelling system called IQON. This first version of the software was then tested with students during a Preliminary study.
The analysis of the data collected led to a rethinking of the conceptual model of the system. A new interface and changes in the properties of the objects of the system were discussed and implemented, resulting in a new version of the software: LinkIt II.
The second (Core) study aimed to investigate students' success and failure with the new version of the system, paying attention to the ease of use and learnability of the software, as well as to how they explored and externalised their ideas when using it. The analysis of this study provided evidence that the students could learn to use the software in a short period of time and that they could undertake work of at least potential educational value with the tool.
TL;DR: Results are presented that relate topological properties of learnable classes to that of intrinsic complexity and ordinal mind change complexity and it is shown that a class that is complete according to the reductions for intrinsic complexity has infinite elasticity.
Abstract: Recently, rich subclasses of elementary formal systems (EFS) have been shown to be identifiable in the limit from only positive data. Examples of these classes are Angluin's pattern languages, unions of pattern languages by Wright and Shinohara, and classes of languages definable by length-bounded elementary formal systems studied by Shinohara. The present paper employs two distinct bodies of abstract studies in the inductive inference literature to analyze the learnability of these concrete classes. The first approach uses constructive ordinals to bound the number of mind changes. ω denotes the first limit ordinal. An ordinal mind change bound of ω means that identification can be carried out by a learner that after examining some element(s) of the language announces an upper bound on the number of mind changes it will make before converging; a bound of ω·2 means that the learner reserves the right to revise this upper bound once; a bound of ω·3 means the learner reserves the right to revise this upper bound twice, and so on. A bound of ω^2 means that identification can be carried out by a learner that announces an upper bound on the number of times it may revise its conjectured upper bound on the number of mind changes. It is shown in the present paper that the ordinal mind change complexity for identification of languages formed by unions of up to n pattern languages is ω^n. It is also shown that this bound is essential. Similar results are also shown to hold for classes definable by length-bounded elementary formal systems with up to n clauses. The second approach employs reductions to study the intrinsic complexity of learnable classes. It is shown that the class of languages formed by taking unions of up to n+1 pattern languages is a strictly more difficult learning problem than the class of languages formed by the union of up to n pattern languages.
It is also shown that a similar hierarchy holds for the bound on the number of clauses in the case of languages definable by length-bounded EFS. In addition to building bridges between three distinct areas of inductive inference, viz., learnability of EFS subclasses, ordinal mind change complexity, and intrinsic complexity, this paper also presents results that relate topological properties of learnable classes to that of intrinsic complexity and ordinal mind change complexity. For example, it is shown that a class that is complete according to the reductions for intrinsic complexity has infinite elasticity. Since EFS languages and their learnability results have counterparts in traditional logic programming, the present paper demonstrates the possibility of using abstract results of inductive inference to gain insights into inductive logic programming.
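The meaning of an ordinal mind change bound of ω can be illustrated on a much simpler class than pattern languages. The sketch below uses the illustrative class of languages L_n = {x : x ≥ n} over the natural numbers, which is an assumption chosen for brevity and not one of the EFS classes analysed in the paper; the function name is likewise hypothetical. After seeing its first datum the learner can announce a finite bound on its future mind changes, which is the behaviour the abstract describes for a bound of ω.

```python
def learn_coinit(text):
    """Learn L_n = {x : x >= n} from positive data, tracking mind changes.

    Returns (final_conjecture, announced_bound, mind_changes).
    """
    conjecture = None
    announced_bound = None
    mind_changes = 0
    for x in text:
        if conjecture is None:
            # First datum: conjecture L_x and announce that at most x further
            # mind changes can occur, since the target n satisfies 0 <= n <= x.
            conjecture = x
            announced_bound = x
        elif x < conjecture:
            # A smaller element refutes the current conjecture: revise it.
            conjecture = x
            mind_changes += 1
    return conjecture, announced_bound, mind_changes


result = learn_coinit([5, 7, 3, 9, 3, 4])
```

On the sample text the learner first conjectures L_5, announces a bound of 5, and changes its mind once when it sees 3. No bound can be fixed before the first datum arrives, which is why a finite (natural-number) bound does not suffice for this class.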
TL;DR: A surprising characterization of hypersimple sets in algorithmic learning theory is presented to obtain an elegant, tight separation result for learnability criteria and it is argued that such separation results may yield insight for eventual characterizations.
TL;DR: This work considers the learnability of classes of logic programs in the presence of noise, and shows that arbitrary nonrecursive Horn clauses with forest background knowledge remain polynomially PAC learnable in the presence of noise.
Abstract: We consider the learnability of classes of logic programs in the presence of noise, assuming that the label of each example is reversed with a fixed probability. We review the polynomial PAC learnability of nonrecursive, determinate, constant-depth Horn clauses in the presence of such noise. This result is extended to an analogous class of recursive logic programs that consist of a recursive clause, a base case clause, and ground background knowledge. Also, we show that arbitrary nonrecursive Horn clauses with forest background knowledge remain polynomially PAC learnable in the presence of noise. We point out that the sample size can be decreased by using dependencies among the literals.
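The noise model in this abstract, where each label is reversed with a fixed probability, can be sketched as follows. The sketch also includes a standard identity for this kind of noise that is not stated in the abstract: a hypothesis with true error e disagrees with a noisy label with probability e(1 - η) + (1 - e)η, where η is the noise rate. All function names are hypothetical.

```python
import random


def noisy_label(label, eta, rng):
    """Return the Boolean label reversed with probability eta."""
    return (not label) if rng.random() < eta else label


def expected_noisy_disagreement(true_error, eta):
    """Probability that a hypothesis disagrees with a noisy label."""
    return true_error * (1 - eta) + (1 - true_error) * eta


def true_error_from_noisy(disagreement, eta):
    """Invert the relation above; only meaningful for eta < 1/2."""
    if not 0 <= eta < 0.5:
        raise ValueError("eta must lie in [0, 1/2)")
    return (disagreement - eta) / (1 - 2 * eta)


# Usage: a hypothesis with 10% true error under 20% label noise.
d = expected_noisy_disagreement(0.1, 0.2)
recovered = true_error_from_noisy(d, 0.2)
```

Because the noisy disagreement rate increases with the true error whenever η is below 1/2, a hypothesis that minimises disagreement with noisy labels also tends to have low true error, at the cost of a larger sample.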
TL;DR: This work studies the learnability of Inductive Logic Programming (ILP) concept classes with respect to robust learning, shows that the class of k-Horn clauses is not learnable in that model, and shows how negative results can also be derived for some PAC-learnable classes.
Abstract: We study the learnability of Inductive Logic Programming (ILP) concept classes with respect to robust-learning. We first investigate the class of k-Horn clauses, and show that it is not learnable in that model. We prove this using a reduction on which we impose as few constraints as possible. From this proof, we then show how we can also derive negative results for some PAC-learnable classes. Finally, we end by discussing the applicational consequences of our work and its links with other learnability studies regarding new learnability models for ILP.
low two different approaches to hypothesis production. MIS and CLINT, for instance, identify the target at the limit, whereas most others use polynomial heuristics for concept induction. Consequently, these systems are generally efficient learners, but, to our knowledge, none can be formally shown to find the target concept in polynomial time. Simultaneously, theoretical work has allowed to establish learnability results for some subclasses of first order Horn clauses. Early studies were undertaken in the Identification in the limit model (Gold, 1967), which describes learning as converging towards the target concept, in finite time but given an unbounded amount of examples. Schapiro (Schapiro, 1983) identified a most general class learnable in this model by a consistent algorithm (MIS) and other studies have since been carried out in this framework (Banerji, 1987),