TL;DR: This paper considers the General Learning Setting (introduced by Vapnik), which includes most statistical learning problems as special cases, and identifies stability as the key necessary and sufficient condition for learnability.
Abstract: The problem of characterizing learnability is the most basic question of statistical learning theory. A fundamental and long-standing answer, at least for the case of supervised classification and regression, is that learnability is equivalent to uniform convergence of the empirical risk to the population risk, and that if a problem is learnable, it is learnable via empirical risk minimization. In this paper, we consider the General Learning Setting (introduced by Vapnik), which includes most statistical learning problems as special cases. We show that in this setting, there are non-trivial learning problems where uniform convergence does not hold, empirical risk minimization fails, and yet they are learnable using alternative mechanisms. Instead of uniform convergence, we identify stability as the key necessary and sufficient condition for learnability. Moreover, we show that the conditions for learnability in the general setting are significantly more complex than in supervised classification and regression.
TL;DR: It is shown that the parameters of a Gaussian mixture distribution with a fixed number of components can be learned using a sample whose size is polynomial in the dimension and all other parameters.
Abstract: The question of polynomial learnability of probability distributions, particularly Gaussian mixture distributions, has recently received significant attention in theoretical computer science and machine learning. However, despite major progress, the general question of polynomial learnability of Gaussian mixture distributions still remained open. The current work resolves the question of polynomial learnability for Gaussian mixtures in high dimension with an arbitrary fixed number of components. The result on learning Gaussian mixtures relies on an analysis of distributions belonging to what we call "polynomial families" in low dimension. These families are characterized by their moments being polynomial in parameters and include almost all common probability distributions as well as their mixtures and products. Using tools from real algebraic geometry, we show that parameters of any distribution belonging to such a family can be learned in polynomial time and using a polynomial number of sample points. The result on learning polynomial families is quite general and is of independent interest. To estimate parameters of a Gaussian mixture distribution in high dimensions, we provide a deterministic algorithm for dimensionality reduction. This allows us to reduce learning a high-dimensional mixture to a polynomial number of parameter estimations in low dimension. Combining this reduction with the results on polynomial families yields our result on learning arbitrary Gaussian mixtures in high dimensions.
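As a small illustration of what "moments being polynomial in parameters" means, consider a one-dimensional mixture of two Gaussians with mixing weight $w$, means $\mu_1, \mu_2$ and variances $\sigma_1^2, \sigma_2^2$. This notation is an assumption made for the example and is not taken from the paper. The first three raw moments are polynomials in these parameters:

```latex
\begin{align*}
X &\sim w\,\mathcal{N}(\mu_1,\sigma_1^2) + (1-w)\,\mathcal{N}(\mu_2,\sigma_2^2),\\
\mathbb{E}[X]   &= w\,\mu_1 + (1-w)\,\mu_2,\\
\mathbb{E}[X^2] &= w\,(\mu_1^2+\sigma_1^2) + (1-w)\,(\mu_2^2+\sigma_2^2),\\
\mathbb{E}[X^3] &= w\,(\mu_1^3+3\mu_1\sigma_1^2) + (1-w)\,(\mu_2^3+3\mu_2\sigma_2^2).
\end{align*}
```

Matching empirical moments to such polynomial expressions gives a system of polynomial equations in the parameters, which is the kind of object that tools from real algebraic geometry can handle.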
TL;DR: The problem of characterizing learnability is the most basic question of statistical learning theory, and a fundamental and long-standing answer exists, at least for the case of supervised classification and regression.
Abstract: The problem of characterizing learnability is the most basic question of statistical learning theory. A fundamental and long-standing answer, at least for the case of supervised classification and ...
TL;DR: A theory of online learning is developed by defining several analogues of Rademacher complexity, covering numbers and fat-shattering dimension from statistical learning theory and providing a complete characterization of online learnability in the supervised setting.
Abstract: We develop a theory of online learning by defining several complexity measures. Among them are analogues of Rademacher complexity, covering numbers and fat-shattering dimension from statistical learning theory. Relationship among these complexity measures, their connection to online learning, and tools for bounding them are provided. We apply these results to various learning problems. We provide a complete characterization of online learnability in the supervised setting.
TL;DR: In a study of learning tasks on a mobile device, a Multi-Layered (ML) interface provided greater benefit for older participants than for younger participants in terms of task completion time during initial learning, perceived complexity, and preference.
Abstract: Mobile computing devices can offer older adults (ages 65+) support in their daily lives, but older adults often find such devices difficult to learn and use. One potential design approach to improve the learnability of mobile devices is a Multi-Layered (ML) interface, where novice users start with a reduced-functionality interface layer that only allows them to perform basic tasks, before progressing to a more complex interface layer when they are comfortable. We studied the effects of an ML interface on older adults’ performance in learning tasks on a mobile device. We conducted a controlled experiment with 16 older (ages 65--81) and 16 younger participants (ages 21--36), who performed tasks on either a 2-layer or a nonlayered (control) address book application, implemented on a commercial smart phone. We found that the ML interface’s Reduced-Functionality layer, compared to the control’s Full-Functionality layer, better helped users to master a set of basic tasks and to retain that ability 30 minutes later. When users transitioned from the Reduced-Functionality to the Full-Functionality interface layer, their performance on the previously learned tasks was negatively affected, but no negative impact was found on learning new, advanced tasks. Overall, the ML interface provided greater benefit for older participants than for younger participants in terms of task completion time during initial learning, perceived complexity, and preference. We discuss how the ML interface approach is suitable for improving the learnability of mobile applications, particularly for older adults.
TL;DR: In this paper, the authors study submodular functions from a learning theoretic angle and uncover several structural results revealing ways in which submodular functions can be both surprisingly structured and surprisingly unstructured.
Abstract: Submodular functions are discrete functions that model laws of diminishing returns and enjoy numerous algorithmic applications. They have been used in many areas, including combinatorial optimization, machine learning, and economics. In this work we study submodular functions from a learning theoretic angle. We provide algorithms for learning submodular functions, as well as lower bounds on their learnability. In doing so, we uncover several novel structural results revealing ways in which submodular functions can be both surprisingly structured and surprisingly unstructured. We provide several concrete implications of our work in other domains including algorithmic game theory and combinatorial optimization. At a technical level, this research combines ideas from many areas, including learning theory (distributional learning and PAC-style analyses), combinatorics and optimization (matroids and submodular functions), and pseudorandomness (lossless expander graphs).
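The "diminishing returns" property mentioned above can be made concrete with a brute-force check on a tiny ground set. The sketch below is only an illustration: the coverage function, the toy data in `SETS`, and all function names are assumptions made for the example, and nothing here is one of the paper's learning algorithms.

```python
from itertools import combinations

def coverage(sets, chosen):
    """Number of distinct elements covered by the chosen sets."""
    covered = set()
    for i in chosen:
        covered |= sets[i]
    return len(covered)

def is_submodular(f, ground):
    """Brute-force diminishing-returns check:
    f(A + x) - f(A) >= f(B + x) - f(B) for all A subset of B and x not in B."""
    ground = list(ground)
    subsets = [frozenset(c) for r in range(len(ground) + 1)
               for c in combinations(ground, r)]
    for b in subsets:
        for a in subsets:
            if not a <= b:
                continue
            for x in ground:
                if x in b:
                    continue
                gain_a = f(a | {x}) - f(a)
                gain_b = f(b | {x}) - f(b)
                if gain_a < gain_b:
                    return False
    return True

# Hypothetical toy data: three sets over the elements 1..5.
SETS = {0: {1, 2}, 1: {2, 3}, 2: {3, 4, 5}}
f_cov = lambda chosen: coverage(SETS, chosen)   # coverage functions are submodular
f_sq = lambda chosen: len(chosen) ** 2          # marginal gains grow, so not submodular

print(is_submodular(f_cov, SETS), is_submodular(f_sq, SETS))
```

The check enumerates all pairs of nested subsets, so it is only practical for very small ground sets.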
TL;DR: In this paper, the authors consider the problem of sequential prediction, provide tools to study the minimax value of the associated game, and give necessary and sufficient conditions for online learnability in the setting of supervised learning.
Abstract: We consider the problem of sequential prediction and provide tools to study the minimax value of the associated game. Classical statistical learning theory provides several useful complexity measures to study learning with i.i.d. data. Our proposed sequential complexities can be seen as extensions of these measures to the sequential setting. The developed theory is shown to yield precise learning guarantees for the problem of sequential prediction. In particular, we show necessary and sufficient conditions for online learnability in the setting of supervised learning. Several examples show the utility of our framework: we can establish learnability without having to exhibit an explicit online learning algorithm.
TL;DR: This work considers the problem of sequential prediction and provides tools to study the minimax value of the associated game and shows necessary and sufficient conditions for online learnability in the setting of supervised learning.
Abstract: We consider the problem of sequential prediction and provide tools to study the minimax value of the associated game. Classical statistical learning theory provides several useful complexity measures to study learning with i.i.d. data. Our proposed sequential complexities can be seen as extensions of these measures to the sequential setting. The developed theory is shown to yield precise learning guarantees for the problem of sequential prediction. In particular, we show necessary and sufficient conditions for online learnability in the setting of supervised learning. Several examples show the utility of our framework: we can establish learnability without having to exhibit an explicit online learning algorithm.
TL;DR: The data are consistent with the hypothesis that center-embedded syntactic structures can be learned in artificial grammar tasks if language-like acoustic cues are provided and mere distributional information does not suffice for successful learning.
TL;DR: The MDL approach is extended to give a simple and practical methodology for estimating how much linguistic data are required to learn a particular linguistic restriction, allowing arguments about natural language learnability to be made explicit and quantified for the first time.
TL;DR: This paper focuses on the learnability issue of conditional preference networks, or CP-nets, that have recently emerged as a popular graphical language for representing ordinal preferences in a concise and intuitive manner and provides results in both passive and active learning.
Abstract: A recurrent issue in decision making is to extract a preference structure by observing the user’s behavior in different situations. In this paper, we investigate the problem of learning ordinal preference orderings over discrete multiattribute, or combinatorial, domains. Specifically, we focus on the learnability issue of conditional preference networks, or CP-nets, that have recently emerged as a popular graphical language for representing ordinal preferences in a concise and intuitive manner. This paper provides results in both passive and active learning. In the passive setting, the learner aims at finding a CP-net compatible with a supplied set of examples, while in the active setting the learner searches for the cheapest interaction policy with the user for acquiring the target CP-net.
TL;DR: In this paper, the authors investigate the problem of learning ordinal preference orderings over discrete multiattribute, or combinatorial, domains, and focus on the learnability issue of conditional preference networks.
Abstract: A recurrent issue in decision making is to extract a preference structure by observing the user’s behavior in different situations. In this paper, we investigate the problem of learning ordinal preference orderings over discrete multiattribute, or combinatorial, domains. Specifically, we focus on the learnability issue of conditional preference networks, or CP-nets, that have recently emerged as a popular graphical language for representing ordinal preferences in a concise and intuitive manner. This paper provides results in both passive and active learning. In the passive setting, the learner aims at finding a CP-net compatible with a supplied set of examples, while in the active setting the learner searches for the cheapest interaction policy with the user for acquiring the target CP-net.
TL;DR: This paper is based on a case study that attempts to adapt a user-centered design approach to course design in order to increase the effectiveness of teaching and the learnability and success of university students.
TL;DR: This paper develops a generalized apprenticeship learning protocol for reinforcement-learning agents with access to a teacher who provides policy traces (transition and reward observations) and constructs efficient apprenticeship-learning algorithms in a number of domains, including two types of relational MDPs.
Abstract: This paper develops a generalized apprenticeship learning protocol for reinforcement-learning agents with access to a teacher who provides policy traces (transition and reward observations). We characterize sufficient conditions of the underlying models for efficient apprenticeship learning and link these criteria to two established learnability classes (KWIK and Mistake Bound). We then construct efficient apprenticeship-learning algorithms in a number of domains, including two types of relational MDPs. We instantiate our approach in a software agent and a robot agent that learn effectively from a human teacher.
TL;DR: Software animated demonstrations (SADs), which were developed in the early 1990s and which more recently have come to be known as 'screencasts', have generated new interest as a promising platform for computer users.
Abstract: The ever-growing importance of computer literacy has ensured the continuity of political and scientific interest in computer learning media and environments for over 30 years. Although human-computer interaction research has strived to facilitate the acquisition of software skills, firstly through the design and exploitation of interfaces that have clear intentions, anticipated semiotics, direct manipulations, real world metaphors, and various other exciting qualities (Shneiderman & Plaisant, 2005) and, secondly, by designating 'learnability' as one of the most fundamental usability attributes of software applications (Grossman, Fitzmaurice & Attar, 2009), most idiosyncratic problems related to computer learning remain unsolved. The "paradox of the active user" (Carroll & Rosson, 1987; Fu & Gray, 2004) endures since users persist in realizing tasks in an inefficient way even when demonstrably more efficient procedures exist. The occasional, rather than causal or premeditated, character of computer learning (Phelps, Hase, & Ellis, 2005), as well as various factors such as the impossibility of complete coverage of the computer knowledge domain, the inevitability of knowledge obsolescence (Eisenberg & Fischer, 1993) and the prevalence of unsupported exploratory learning from early on (Carroll & Rosson, 1987), constitute a complex and original learning ecology (Barron, 2004). Computer learning is a complex task that puts a tremendous burden on all computer users, experienced or not (Kiesler, Zdaniuk, Lundmark, & Kraut, 2000), and results in a multiplicity of 'frustrating experiences' and extensive time losses (Lazar, Jones & Shneiderman, 2006).
Regardless of the multibillion investments in schools, universities, and corporate training activities (Corrall, 2008; Gupta & Bostrom, 2006), computer learning continues to be viewed as a personal landscape dominated by individual exploratory approaches, even for students and instructors of Computer Science Departments. Software animated demonstrations (SADs), which were developed in the early 1990s and which more recently have come to be known as 'screencasts', have generated new interest as a promising platform for computer users. Many commercial authoring tools, such as TechSmith Camtasia and Adobe Captivate, and free authoring tools, such as Articulate ScreenR, TechSmith Jing and Wink Screen Recorder, have been made available, while a new trend towards sharing screencasts in specially configured web 2.0 platforms (such as www.jingproject.com) or generic user generated video sites (such as www.youtube.com) has also started to gain momentum. SADs, in their primitive form, reproduce a screen-captured usage scenario of a software application. Their definition is usually differentiated on the basis of the presumed presenter or the scenario contents. In most cases, SADs resemble watching an instructor, an expert, a ghost user, a colleague, or even a student providing worked-out examples of software utilization. From a factual point of view, SADs constitute a unique tool for e-learning design, especially for Computer Science instructors, since they promote an easy and affordable way of producing multimedia instructional material that is authentic, situated, and motivating and can be exploited in various educational settings (in the classroom, self-paced, collaboratively, etc.) unlike other kinds of multimedia resources.
Instructors, whose role in educational technology adoption has long been underestimated, desperately seek an accessible technology that enables the quick creation/development of software tutorials and that allows them to update the learning material frequently in order to keep up with the pace of software evolution. With the latest generation of screencasting tools, producing SADs requires a time frame roughly similar to the one needed for preparing a class demonstration plus the time required for recording it or, in essence, performing it once. …
TL;DR: The present work initiates the study of the learnability of automatic indexable classes which are classes of regular languages of a certain form, where Angluin's tell-tale condition characterizes when these classes are explanatorily learnable.
Abstract: The present work initiates the study of the learnability of automatic indexable classes which are classes of regular languages of a certain form. Angluin's tell-tale condition characterizes when these classes are explanatorily learnable. Therefore, the more interesting question is when learnability holds for learners with complexity bounds, formulated in the automata-theoretic setting. The learners in question work iteratively, in some cases with an additional long-term memory, where the update function of the learner mapping old hypothesis, old memory and current datum to new hypothesis and new memory is automatic. Furthermore, the dependence of the learnability on the indexing is also investigated. This work brings together the fields of inductive inference and automatic structures.
TL;DR: The learnability of the LMS was high and providing assistance for first‐time users to get past the critical errors, rather than redesigning systems to accommodate low ICT skills, should be considered.
Abstract: Purpose – The purpose of this paper is to investigate the implications of usability and learnability in learning management systems (LMS) by considering the experiences of information and communications technology (ICT) experts and non‐experts in using the LMS of an open‐distance university. Design/methodology/approach – The paper uses task‐based usability testing augmented by eye tracking, post‐test questionnaires and interviews; and data captured by video recordings, eye tracking, post‐test questionnaires and interviews. Findings – Usability is critical in LMS where students’ ICT skills vary. The learnability of the LMS was high and providing assistance for first‐time users to get past the critical errors, rather than redesigning systems to accommodate low ICT skills, should be considered. Designing an LMS for novices may lead to a less efficient design for regular users. Research limitations/implications – Usability testing is limited to the LMS of one open‐distance university. ICT skills are identified a...
TL;DR: The authors identify clause-initial finite verbs as C⁰ in Celtic as well as in Germanic and show that the acquisition data suggest that finiteness, not tense, must be the trigger for verb movement, rather than being derived from uninterpretable phi-features.
TL;DR: It is shown that if q>3, then simple external contextual languages are not iteratively learnable using a class preserving one-one hypothesis space, while for q=1 they are iteratively learnable, even in polynomial time.
TL;DR: This paper demonstrates that a human being using an interface can be efficiently evaluated - in real time - by embedding basic measurements in the interface and using a suitably trained artificial neural network.
Abstract: This paper demonstrates that a human being using an interface can be efficiently evaluated - in real time - by embedding basic measurements in the interface and using a suitably trained artificial neural network. The approach is introduced through video games but is suitable for any machine capable of valuable measurements on user actions. Of course, the quality of the "diagnostic" depends on the learnability of the task and on the size and quality of the learning base. Typical applications include the detection of fatigue, stress, emotions, the influence of a drug or of medical treatments; screening a deficit or adequateness to a task, etc. Two successful prototypes are presented, one to predict the mental age of children through a set of simple basic games, and the other to detect if a subject is right-handed or left-handed through a racing car simulation.
TL;DR: This paper outlines and pilots the approach towards developing an inventory of verb-argument constructions based upon English form, function, and usage and develops measures of this using network measures of clustering in the verb-space defined by WordNet and Roget's Thesaurus.
Abstract: This paper outlines and pilots our approach towards developing an inventory of verb-argument constructions based upon English form, function, and usage. We search a tagged and dependency-parsed BNC (a 100-million word corpus of English) for Verb-Argument Constructions (VACs) including those previously identified in the pattern grammar resulting from the COBUILD project. This generates (1) a list of verb types that occupy each construction. We next tally the frequency profiles of these verbs to produce (2) a frequency ranked type-token distribution for these verbs, and we determine the degree to which this is Zipfian. Since some verbs are faithful to one construction while others are more promiscuous, we next produce (3) a contingency-weighted list reflecting their statistical association. To test whether each of these measures is a step towards increasing the learnability of VACs as categories, following principles of associative learning, we examine 20 verbs from each distribution. Here we explore whether there is an increase in the semantic cohesion of the verbs occupying each construction using semantic similarity measures. From inspection, this seems to be so. We are developing measures of this using network measures of clustering in the verb-space defined by WordNet and Roget's Thesaurus.
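One common way to gauge "the degree to which this is Zipfian" is to fit a straight line to log frequency against log rank: an ideal Zipfian distribution gives a slope close to -1. The sketch below illustrates that generic check only. It is not necessarily the measure used in the paper, and the function name and toy frequency lists are assumptions made for the example.

```python
import math

def zipf_slope(frequencies):
    """Least-squares slope of log(frequency) against log(rank).

    Frequencies are ranked from most to least frequent; zero counts are dropped.
    """
    freqs = sorted((f for f in frequencies if f > 0), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Hypothetical verb token counts for one construction.
ideal = [1000 / rank for rank in range(1, 51)]  # exactly Zipfian
flat = [10] * 50                                # every verb equally frequent

print(zipf_slope(ideal), zipf_slope(flat))
```

A slope near -1 indicates a Zipf-like profile, while a slope near 0 indicates that usage is spread evenly across verb types.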
TL;DR: In this paper, the authors employ notions of learnability, self-enforceability, and properness to motivate and develop a suite of equilibrium selection criteria, including whether the Pareto-preferred equilibrium is learnable by private agents.
Abstract: Discretionary policymakers cannot manage private-sector expectations and cannot coordinate the actions of future policymakers. As a consequence, expectations traps and coordination failures can occur and multiple equilibria can arise. To utilize the explanatory power of models with multiple equilibria it is first necessary to understand how an economy arrives at a particular equilibrium. In this paper, we employ notions of learnability, self-enforceability, and properness to motivate and develop a suite of equilibrium selection criteria. Central among these criteria are whether the equilibrium is learnable by private agents and jointly learnable by private agents and the policymaker. We use two New Keynesian policy models to identify the strategic interactions that give rise to multiple equilibria and to illustrate our equilibrium selection methods. Importantly, unless the Pareto-preferred equilibrium is learnable by private agents, we find little reason to expect coordination on that equilibrium.
TL;DR: Comparing how children and adults learn to use an unfamiliar computer game to determine whether learnability has different meanings across generations will help designers to distinguish between the needs of users in different age groups and improve the learnability of their products.
Abstract: The learnability principle was originally formulated with computer-based applications mainly for adults in mind. In this paper we compare how children and adults learn to use an unfamiliar computer game to determine whether learnability has different meanings across generations. We recorded eye tracking data while users taught themselves to play a computer game. Comparison of the on-screen focus points and eye gazing patterns showed that adults and children have different tactics when confronted with an unfamiliar game. It revealed aspects of software interfaces that adults and children approach differently. For example, children will focus on the game elements and use a trial-and-error approach instead of reading on-screen instructions, while adults are more willing to interrupt game play to read the instructions. The knowledge gained through this research will help designers to distinguish between the needs of users in different age groups and improve the learnability of their products.
TL;DR: It is shown that the accuracy to which a set of linear queries can be answered is closely related to its fat-shattering dimension, a property that characterizes the learnability of real-valued functions in the agnostic-learning setting.
Abstract: In this paper, we consider the task of answering linear queries under the constraint of differential privacy. This is a general and well-studied class of queries that captures other commonly studied classes, including predicate queries and histogram queries. We show that the accuracy to which a set of linear queries can be answered is closely related to its fat-shattering dimension, a property that characterizes the learnability of real-valued functions in the agnostic-learning setting.
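To make the notion of a linear query concrete, the sketch below answers one such query (the average of a [0, 1]-valued predicate over the records) with the standard Laplace mechanism. This is a textbook baseline and not the mechanism analysed in the paper. The function names, the toy database, and the choice of epsilon are assumptions made for the example.

```python
import random

def linear_query(database, predicate):
    """Exact answer: the average of predicate(x), valued in [0, 1], over all records."""
    return sum(predicate(x) for x in database) / len(database)

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def private_answer(database, predicate, epsilon, rng):
    """Laplace mechanism for a single linear query.

    Changing one record moves the average by at most 1/n, so the sensitivity
    is 1/n and the noise scale is 1 / (n * epsilon).
    """
    scale = 1 / (len(database) * epsilon)
    return linear_query(database, predicate) + laplace_noise(scale, rng)

# Hypothetical database of ages and a predicate query "age >= 40".
ages = [23, 35, 41, 52, 67, 70, 18, 29]
is_40_plus = lambda age: age >= 40

rng = random.Random(0)  # seeded only to make the illustration repeatable
exact = linear_query(ages, is_40_plus)
noisy = private_answer(ages, is_40_plus, epsilon=1.0, rng=rng)
print(exact, noisy)
```

Answering many queries this way requires splitting the privacy budget across them, which is why the achievable accuracy for a whole set of linear queries depends on the complexity of that set.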
TL;DR: In this paper, the authors consider the problem of answering linear queries under the constraint of differential privacy and show that the accuracy to which a set of linear queries can be answered is closely related to their fat-shattering dimension.
Abstract: In this paper, we consider the task of answering linear queries under the constraint of differential privacy. This is a general and well-studied class of queries that captures other commonly studied classes, including predicate queries and histogram queries. We show that the accuracy to which a set of linear queries can be answered is closely related to its fat-shattering dimension, a property that characterizes the learnability of real-valued functions in the agnostic-learning setting.
TL;DR: An experiment focusing on early design tools used in product design and their integration in a PLM context is presented; it aims to assess, evaluate, and compare the usability of these tools, and proposes a comparison based on four topics: learnability, satisfaction of users, efficiency and error correction.
TL;DR: Software design and social influences that can contribute to ease of learning are discussed, and the importance of 'learnability' for technology adoption and continued use is discussed.
Abstract: A number of different factors influence a user's decision to adopt and continue using a technology. These include perceived ease of use (usability) and perceived usefulness (usefulness), among others. In this paper we examine another factor, ease of learning to use a technology, or 'learnability.' We discuss software design and social influences that can contribute to ease of learning, and discuss the importance of 'learnability' for technology adoption and continued use.
TL;DR: The experimental results support the possibility that linguistic constructions are acquired probabilistically from cognition-general principles, and a recently proposed practical framework is described, which quantifies natural language learnability.
Abstract: There is much debate over the degree to which language learning is governed by innate language-specific biases, or acquired through cognition-general principles. Here we examine the probabilistic language acquisition hypothesis on three levels: We outline a novel theoretical result showing that it is possible to learn the exact generative model underlying a wide class of languages, purely from observing samples of the language. We then describe a recently proposed practical framework, which quantifies natural language learnability, allowing specific learnability predictions to be made for the first time. In previous work, this framework was used to make learnability predictions for a wide variety of linguistic constructions, for which learnability has been much debated. Here, we present a new experiment which tests these learnability predictions. We find that our experimental results support the possibility that these linguistic constructions are acquired probabilistically from cognition-general principles.
TL;DR: This paper shows which properties and principles the semantic representation and grammar formalism require for the semantic composition constraints of Lexicalized Well-Founded Grammar to be learned from examples, and gives a learning algorithm.
Abstract: Lexicalized Well-Founded Grammar (LWFG) is a recently developed syntactic-semantic grammar formalism for deep language understanding, which balances expressiveness with provable learnability results. The learnability result for LWFGs assumes that the semantic composition constraints are learnable. In this paper, we show which properties and principles the semantic representation and grammar formalism require in order for these constraints to be learned from examples, and give a learning algorithm. We also introduce an LWFG parser as a deductive system, used as an inference engine during LWFG induction. An example for learning a grammar for noun compounds is given.