TL;DR: This work presents a framework for gesture customization requiring minimal examples from users, all without degrading the performance of existing gesture sets, and designs a few-shot learning framework which derives a lightweight model from the authors' pre-trained model, enabling knowledge transfer without performance degradation.
Abstract: We present a framework for gesture customization requiring minimal examples from users, all without degrading the performance of existing gesture sets. To achieve this, we first deployed a large-scale study (N=500+) to collect data and train an accelerometer-gyroscope recognition model with a cross-user accuracy of 95.7% and a false-positive rate of 0.6 per hour when tested on everyday non-gesture data. Next, we design a few-shot learning framework which derives a lightweight model from our pre-trained model, enabling knowledge transfer without performance degradation. We validate our approach through a user study (N=20) examining on-device customization from 12 new gestures, resulting in an average accuracy of 55.3%, 83.1%, and 87.2% on using one, three, or five shots when adding a new gesture, while maintaining the same recognition accuracy and false-positive rate from the pre-existing gesture set. We further evaluate the usability of our real-time implementation with a user experience study (N=20). Our results highlight the effectiveness, learnability, and usability of our customization framework. Our approach paves the way for a future where users are no longer bound to pre-existing gestures, freeing them to creatively introduce new gestures tailored to their preferences and abilities.
TL;DR: This work evaluates a model’s acquisition of island constraints by demonstrating that its expectation for a filler–gap contingency is attenuated within an island environment, and provides empirical evidence against the Argument from the Poverty of the Stimulus for this particular structure.
Abstract: We study the learnability of English filler–gap dependencies and the “island” constraints on them by assessing the generalizations made by autoregressive (incremental) language models that use deep learning to predict the next word given preceding context. Using factorial tests inspired by experimental psycholinguistics, we find that models acquire not only the basic contingency between fillers and gaps, but also the unboundedness and hierarchical constraints implicated in the dependency. We evaluate a model’s acquisition of island constraints by demonstrating that its expectation for a filler–gap contingency is attenuated within an island environment. Our results provide empirical evidence against the Argument from the Poverty of the Stimulus for this particular structure.
TL;DR: In this paper, the authors identify, explore, and summarize the current state of the literature on the usability evaluation of mHealth apps for older adults and incorporate these methods into the appropriate evaluation stage.
Abstract: Usability is a key factor affecting the acceptance of mobile health applications (mHealth apps) for elderly individuals, but traditional usability evaluation methods may not be suitable for use in this population because of aging barriers. The objectives of this study were to identify, explore, and summarize the current state of the literature on the usability evaluation of mHealth apps for older adults and to incorporate these methods into the appropriate evaluation stage. Electronic searches were conducted in 10 databases. Inclusion criteria were articles focused on the usability evaluation of mHealth apps designed for older adults. The included studies were classified according to the mHealth app usability evaluation framework, and the suitability of evaluation methods for use among the elderly was analyzed. Ninety-six articles met the inclusion criteria. Research activity increased steeply after 2013 (n = 92). Satisfaction (n = 74) and learnability (n = 60) were the most frequently evaluated critical measures, while memorability (n = 13) was the least evaluated. The ratios of satisfaction, learnability, operability, and understandability measures were significantly related to the different stages of evaluation (P < 0.05). The methods used for usability evaluation were questionnaire (n = 68), interview (n = 36), concurrent thinking aloud (n = 25), performance metrics (n = 25), behavioral observation log (n = 14), screen recording (n = 3), eye tracking (n = 1), retrospective thinking aloud (n = 1), and feedback log (n = 1). Thirty-two studies developed their own evaluation tool to assess unique design features for elderly individuals. In the past five years, the number of studies in the field of usability evaluation of mHealth apps for the elderly has increased rapidly. The mHealth apps are often used as an auxiliary means of self-management to help the elderly manage their wellness and disease. According to the three stages of the mHealth app usability evaluation framework, the critical measures and evaluation methods are inconsistent. Future research should focus on selecting specific critical measures relevant to aging characteristics and adapting usability evaluation methods to elderly individuals by improving traditional tools, introducing automated evaluation tools, and optimizing evaluation processes.
TL;DR: In this paper, the authors consider monitored quantum systems with a global conserved charge, and ask how efficiently an eavesdropper can learn the global charge of such systems from local projective measurements.
Abstract: We consider monitored quantum systems with a global conserved charge, and ask how efficiently an observer ("eavesdropper") can learn the global charge of such systems from local projective measurements. We find phase transitions as a function of the measurement rate, depending on how much information about the quantum dynamics the eavesdropper has access to. For random unitary circuits with U(1) symmetry, we present an optimal classical classifier to reconstruct the global charge from local measurement outcomes only. We demonstrate the existence of phase transitions in the performance of this classifier in the thermodynamic limit. We also study numerically improved classifiers by including some knowledge about the unitary gates pattern.
TL;DR: This paper presents a learnability enhancement strategy to reform paired real data according to noise modeling, which consists of two efficient techniques: shot noise augmentation (SNA) and dark shading correction (DSC).
Abstract: Low-light raw denoising is an important and valuable task in computational photography where learning-based methods trained with paired real data are mainstream. However, the limited data volume and complicated noise distribution have constituted a learnability bottleneck for paired real data, which limits the denoising performance of learning-based methods. To address this issue, we present a learnability enhancement strategy to reform paired real data according to noise modeling. Our strategy consists of two efficient techniques: shot noise augmentation (SNA) and dark shading correction (DSC). Through noise model decoupling, SNA improves the precision of data mapping by increasing the data volume and DSC reduces the complexity of data mapping by reducing the noise complexity. Extensive results on the public datasets and real imaging scenarios collectively demonstrate the state-of-the-art performance of our method.
TL;DR: This study followed Nielsen’s usability model to identify user requirements from five aspects, namely learnability, efficiency, memorability, error, and satisfaction, and found that speech recognition technology can help seniors access information quickly.
Abstract: The advancement of mobile technologies has motivated countries around the world to aim for smarter health management to support senior citizens. However, the use of mobile health applications (mHealth apps) among senior citizens appears to be low. Thus, drawing upon user expectations, the present study examined user requirements for a senior-friendly mHealth application. A total of 74 senior citizens were interviewed to explore the difficulties they encounter when using existing mobile apps. This study followed Nielsen’s usability model to identify user requirements from five aspects, namely learnability, efficiency, memorability, error, and satisfaction. Based on the results, a guideline was proposed pertaining to usability and health management features. This guideline offers suggestions for mHealth app issues related to phrasing, menus, simplicity, error messages, icons and buttons, navigation, and layout, among others. The study also found that speech recognition technology can help seniors access information quickly. The proposed guideline and findings offer valuable input for software and app developers in building more engaging and senior-friendly mHealth apps.
TL;DR: This paper investigates the PAC learning theory of OOD detection and gives several necessary and sufficient conditions that characterize the learnability of OOD detection in some practical scenarios.
Abstract: Supervised learning aims to train a classifier under the assumption that training and test data are from the same distribution. To ease the above assumption, researchers have studied a more realistic setting: out-of-distribution (OOD) detection, where test data may come from classes that are unknown during training (i.e., OOD data). Due to the unavailability and diversity of OOD data, good generalization ability is crucial for effective OOD detection algorithms. To study the generalization of OOD detection, in this paper, we investigate the probably approximately correct (PAC) learning theory of OOD detection, which is proposed by researchers as an open problem. First, we find a necessary condition for the learnability of OOD detection. Then, using this condition, we prove several impossibility theorems for the learnability of OOD detection under some scenarios. Although the impossibility theorems are frustrating, we find that some conditions of these impossibility theorems may not hold in some practical scenarios. Based on this observation, we next give several necessary and sufficient conditions to characterize the learnability of OOD detection in some practical scenarios. Lastly, we also offer theoretical supports for several representative OOD detection works based on our OOD theory.
TL;DR: In this paper, the authors compared two expert-based evaluation methods for assessing a nursing module, the most widely used module of a hospital information system (HIS), in terms of the number and severity of recognized problems according to the usability attributes.
Abstract: Background: There are differences of opinion regarding the selection of the most practical usability evaluation method among different methods. The present study aimed to compare two expert-based evaluation methods in order to assess a nursing module as the most widely used module of a Hospital Information System (HIS). Methods: Five independent evaluators used the Heuristic Evaluation (HE) and Cognitive Walkthrough (CW) methods to evaluate the nursing module of Shafa HIS. In this regard, the number and severity of the recognized problems according to the usability attributes were compared using two evaluation methods. Results: The HE and CW evaluation methods resulted in the identification of 104 and 24 unique problems, respectively, of which 33.3% of recognized problems in the CW evaluation method overlapped with the HE method. The average severity of the recognized problems was considered to be minor (2.34) in the HE method and major (2.77) in the CW evaluation method. There was a significant difference in terms of the total number and average severity of the recognized problems by these methods (P < 0.001). Based on the usability attribute, the HE method identified a larger number of problems concerning all usability attributes, and a significant difference was observed in terms of the number of recognized problems in both methods for all attributes except ‘memorability’. Also, there was a significant difference between the two methods based on the average severity of recognized problems only in terms of ‘learnability’. Conclusion: The HE method identified more problems with lower average severity while the CW was able to recognize fewer problems with higher average severity. Regarding the evaluation goal, the HE method was able to be used to improve the effectiveness and satisfaction of the HIS. Furthermore, the CW evaluation method is recommended to identify usability problems with the highest average severity, especially in terms of ‘learnability’.
TL;DR: In the social world, dividing people into categories is an especially important way of working through complexity, given the extreme range of variability that exists across human individuals and communities, which poses a particularly difficult learnability problem.
Abstract: Throughout human history and across all human cultures, civilizations have organized themselves into social collectives, to the extent that it seems fair to say that social groups are the natural ecology of our species. In many ways, these groups play the same role as do categories in other domains; after all, the world is an incredibly complex place, and dividing it into categories is a powerful way to simplify this complexity and maximize efficiency in learning. In the social world, this way of working through complexity is especially important, given the extreme range of variability that exists across human individuals and communities. Children must navigate a world full of people with a range of properties that appear to have little in common with one another, posing a particularly difficult learnability problem. Social categorization allows children to work through this complexity by selecting features that denote meaningful differences between people (see Chapter 13). As a result, social categories become a fundamental lens through which we see the world.
TL;DR: In this paper, the authors train a neural network to choose Condorcet, Borda, and plurality winners, and show that when trained on a limited (but still reasonably large) sample, the neural network mimics most closely the Borda rule no matter on which rule it was previously trained.
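For readers unfamiliar with the three rules named in this summary, the following is a minimal sketch of how plurality, Borda, and Condorcet winners can be computed from ranked ballots. It is an illustration only and is not the authors' code; the function names, the ballot format (each ballot is a tuple listing candidates from most to least preferred), and the alphabetical tie-breaking are assumptions made here.

```python
from collections import Counter


def plurality_winner(ballots):
    """Candidate with the most first-place votes (ties broken alphabetically)."""
    counts = Counter(ballot[0] for ballot in ballots)
    return max(sorted(counts), key=lambda c: counts[c])


def borda_winner(ballots):
    """Candidate with the highest Borda score: m-1 points for first place, 0 for last."""
    m = len(ballots[0])
    scores = Counter()
    for ballot in ballots:
        for position, candidate in enumerate(ballot):
            scores[candidate] += m - 1 - position
    return max(sorted(scores), key=lambda c: scores[c])


def condorcet_winner(ballots):
    """Candidate who beats every other candidate in a strict pairwise majority, or None."""
    candidates = sorted(ballots[0])
    for c in candidates:
        beats_all = all(
            2 * sum(b.index(c) < b.index(d) for b in ballots) > len(ballots)
            for d in candidates
            if d != c
        )
        if beats_all:
            return c
    return None


# Hypothetical profile where plurality disagrees with Borda and Condorcet.
profile = (
    [("A", "B", "C")] * 3
    + [("B", "C", "A")] * 2
    + [("C", "B", "A")] * 2
)
print(plurality_winner(profile), borda_winner(profile), condorcet_winner(profile))
```

In this hypothetical profile, A has the most first-place votes, while B has the highest Borda score and also wins both pairwise contests, which shows how the rules can select different winners from the same ballots.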
TL;DR: This work proves that H can be PAC learned by an (approximate) differentially private algorithm if and only if it has a finite Littlestone dimension, implying a qualitative equivalence between online learnability and private PAC learnability.
Abstract: Let H be a binary-labeled concept class. We prove that H can be PAC learned by an (approximate) differentially private algorithm if and only if it has a finite Littlestone dimension. This implies a qualitative equivalence between online learnability and private PAC learnability.
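The abstract does not define the Littlestone dimension. As a hedged illustration (not taken from the paper), the sketch below computes it by brute force for a small finite class over a finite domain, using the standard recursive characterization: the dimension is at least d+1 exactly when some point splits the class into two non-empty parts that each have dimension at least d. The function name and the encoding of hypotheses as tuples of 0/1 labels are assumptions made here, and the recursion is exponential, so it only suits toy examples.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def _ldim(hypotheses):
    """Littlestone dimension of a frozenset of label tuples (-1 for the empty class)."""
    if not hypotheses:
        return -1
    if len(hypotheses) == 1:
        return 0
    n_points = len(next(iter(hypotheses)))
    best = 0
    for x in range(n_points):
        labelled_0 = frozenset(h for h in hypotheses if h[x] == 0)
        labelled_1 = hypotheses - labelled_0
        # A point can root a mistake tree only if both labels are realizable.
        if labelled_0 and labelled_1:
            best = max(best, 1 + min(_ldim(labelled_0), _ldim(labelled_1)))
    return best


def littlestone_dimension(hypotheses):
    """Convenience wrapper accepting any iterable of 0/1 label sequences."""
    return _ldim(frozenset(tuple(h) for h in hypotheses))


# Hypothetical example: threshold functions h_t(x) = 1 if x >= t on the domain {0, 1, 2, 3}.
thresholds = [tuple(int(x >= t) for x in range(4)) for t in range(5)]
print(littlestone_dimension(thresholds))
```

For a finite class the dimension is always finite, so the equivalence stated in the abstract only becomes informative for infinite classes; the sketch is meant solely to make the quantity concrete.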
TL;DR: This article considers iterative learning control for a class of discrete-time systems with full learnability and unknown system dynamics, and develops data-based learning schemes that obtain all the needed information on system dynamics if the control system is controllable.
Abstract: This article considers iterative learning control (ILC) for a class of discrete-time systems with full learnability and unknown system dynamics. First, we give a framework to analyze the learnability of the control system and build the relationship between the learnability of the control system and the input-output coupling matrix (IOCM). The control system has full learnability if and only if the IOCM is full-row rank and the control system has no learnability almost everywhere if and only if the rank of the IOCM is less than the dimension of system output. Second, by using the repetitiveness of the control system, some data-based learning schemes are developed. It is shown that we can obtain all the needed information on system dynamics through the developed learning schemes if the control system is controllable. Third, by the dynamic characteristics of system outputs of the ILC system along the iteration direction, we show how to use the available information of system dynamics to design the iterative learning gain matrix and the current state feedback gain matrix. And we strictly prove that the iterative learning scheme with the current state feedback mechanism can guarantee the monotone convergence of the ILC process if the IOCM is full-row rank. Finally, a numerical example is provided to validate the effectiveness of the proposed iterative learning scheme with the current state feedback mechanism.
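The rank condition stated in the abstract can be checked mechanically. The sketch below is an assumption-laden illustration, not the article's method: it assumes a system of the form x(t+1) = A x(t) + B u(t), y(t) = C x(t) and takes the IOCM to be the product CB, which the abstract does not specify. The helper names are hypothetical, and exact rational arithmetic is used so that the rank test is not affected by floating-point tolerance.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two matrices given as lists of rows, in exact arithmetic."""
    return [
        [sum(Fraction(x) * Fraction(y) for x, y in zip(row, col)) for col in zip(*b)]
        for row in a
    ]


def rank(matrix):
    """Rank via Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    r = 0
    for col in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def has_full_learnability(C, B):
    """True if the assumed IOCM (C times B) has full row rank."""
    iocm = matmul(C, B)
    return rank(iocm) == len(iocm)


# Hypothetical systems: one input, with one or two outputs.
B = [[1], [0]]
print(has_full_learnability([[1, 0]], B))          # one output
print(has_full_learnability([[1, 0], [0, 1]], B))  # two outputs, one input
```

With a single input, the assumed IOCM has one column, so it cannot have full row rank once the output dimension exceeds one; this matches the abstract's statement that learnability is lost when the rank falls below the output dimension.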
TL;DR: The authors explored the learnability consequences of one of the striking commonalities between languages, finding that the greater predictability of words in this distribution can facilitate word segmentation, a crucial aspect of early language acquisition.
TL;DR: This study addresses the usability issues students face while interacting with the UMIS platform provided by their institutions and proposes a responsive, user-centered design which, if implemented, would improve student engagement on the platform and reduce the recurring problems that may arise from using the UMIS platform.
Abstract: University Management Information Systems (UMIS) are an essential part of a school’s ecosystem. Building a functional UMIS is no longer the main challenge. Because students interact with these systems to perform tasks such as course registration and school fee payment, the ease with which they complete these activities is extremely important; any error or confusing experience can make the process dreadful for users and demotivate them. This study centers on designing the User Interface (UI) and improving the UX of University Management Information Systems for web-based interfaces. User-Centered Design processes and system design thinking methodology were employed to solve the problem. Questionnaires were used to obtain the users' pain points relating to the existing UMIS in their schools, and the responses were analyzed to understand the issues users face with their current UMIS and to determine the right features for a more usable interface. User personas and wireframes were used to make sense of the data obtained from user research. Figma, a visual design and prototyping tool, was used for the prototype and interface design. The newly created interfaces were subjected to user testing using a platform called Maze. Users were able to interact with the platform and then answer questions relating to the developed system. Test data was used to measure usability parameters such as efficiency, effectiveness, learnability, ease of use, and simplicity. In the testing phase, the developed system achieved a System Usability Scale (SUS) score of 87, which shows that users enjoyed using the system and could navigate a platform they were interacting with for the first time with little to no help. It was discovered that users prefer a simpler, responsive, and more interactive interface. Also, users were able to successfully complete tasks even though they had never interacted with the interface before. This study addresses the usability issues students face while interacting with the UMIS platform provided by their institutions and proposes a responsive, user-centered design which, if implemented, would improve student engagement on the platform and reduce the recurring problems that may arise from using the UMIS platform.
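The abstract reports a SUS result of 87 without describing the scoring. As background, the standard System Usability Scale uses ten items rated from 1 to 5: odd-numbered items contribute the rating minus 1, even-numbered items contribute 5 minus the rating, and the sum is multiplied by 2.5 to give a score from 0 to 100. The sketch below illustrates that standard procedure; the function name and the sample responses are hypothetical and are not data from the study.

```python
def sus_score(responses):
    """Standard SUS score (0 to 100) for one respondent's ten ratings on a 1-5 scale."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")
    total = 0
    for item_number, rating in enumerate(responses, start=1):
        if item_number % 2 == 1:
            total += rating - 1   # positively worded (odd) items
        else:
            total += 5 - rating   # negatively worded (even) items
    return total * 2.5


def mean_sus(all_responses):
    """Average SUS score across respondents."""
    scores = [sus_score(r) for r in all_responses]
    return sum(scores) / len(scores)


# Hypothetical respondents, for illustration only.
sample = [
    [5, 1, 5, 1, 5, 1, 5, 1, 5, 1],
    [4, 2, 4, 2, 4, 2, 4, 2, 4, 2],
]
print(mean_sus(sample))
```

A study-level figure such as the one reported above is typically the mean of the per-respondent scores.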
TL;DR: This paper proposes a novel heterogeneous interactive snapshot network for stock profiling and recommendation, and introduces a novel twins-GRU method, which tightly couples the media and price parallel sequences in a cross-interactive fashion to catch dynamic dependencies between successive snapshots.
Abstract: Stock recommendation plays a critical role in modern quantitative trading. The large volumes of social media information such as investment reviews that delegate emotion-driven factors, together with technical price indicators, form a “snapshot” of the evolving stock market profile. However, previous studies usually model the temporal trajectories of price and media modalities separately while losing their interrelated influences. Moreover, they mainly extract review semantics via sequential or attentive models, whereas the rich text-associated knowledge is largely neglected. In this paper, we propose a novel heterogeneous interactive snapshot network for stock profiling and recommendation. We model investment reviews in each snapshot as a heterogeneous document graph, and develop a flexible hierarchical attentive propagation framework to capture fine-grained proximity features. Further, to learn stock embedding for ranking, we introduce a novel twins-GRU method, which tightly couples the media and price parallel sequences in a cross-interactive fashion to catch dynamic dependencies between successive snapshots. Our approach outperforms state-of-the-art methods by over 7.6% in terms of cumulative and risk-adjusted returns in trading simulations on both English and Chinese benchmarks.
TL;DR: This work analyzed 40 contemporary video games to identify how video games approach learning experiences and found that games have advanced far beyond using simple tutorials or demonstration screens and adopt a range of repeatable and reusable design strategies using visual cues to facilitate learning.
Abstract: Learnability is a core aspect of software usability. Video games are not an exception, as game designers need to teach players how to play their creations. We analyzed 40 contemporary video games to identify how video games approach learning experiences. We found that games have advanced far beyond using simple tutorials or demonstration screens and adopt a range of repeatable and reusable design strategies using visual cues to facilitate learning. We provide a detailed descriptive framework of these design strategies, elucidating how and when they can be used, and describing how the visual cues are used to build them. Our research can be useful for both general HCI researchers and practitioners seeking to tap into the rich ideas from video game learnability design looking for practical solutions for their work.
TL;DR: In this paper, the Android-based application Houzcalls is used for a usability study with the PACMAD usability model; the results show that participants in FG1 exhibited more Effectiveness, Efficiency, Satisfaction, Learnability, and Memorability.
Abstract: An application or product is considered “usable” if it is pleasing, easy to use, and its user interface works as expected. Most companies focus mainly on the application’s functional requirements but put minimal effort into user experience (usability). Consumers’ adoption of these applications depends on the number of features and the user interface. In this work, an Android-based application, “Houzcalls”, is used for a usability study using the PACMAD usability model. This work focuses on the variations in PACMAD attributes based on the participants’ education and age. Participants are segregated into two major groups, FG1 and FG2, based on their education. All participants with more than 10 years of education are in FG1, while the others are in FG2. Each focal group is divided into four subgroups (Under 25, 25-35, 36-45, and Over 45) based on age. The results show that participants in FG1 exhibited more Effectiveness, Efficiency, Satisfaction, Learnability, and Memorability. In contrast, they committed fewer Errors and showed less Cognitive Load during usability testing compared to FG2. These variations can also be seen age-wise, as the “Under 25” and “25-35” subgroups generally showed better results than the other subgroups. It is inferred from the study that application usability and acceptability can be increased by considering the general population during development, which includes all groups of people based on education and age.
TL;DR: In the era of digitalization, KAI transformed train ticket booking transactions by launching the KAI Access application; this work uses Game Theory to find the optimal strategy for KAI Access to compete with marketplaces such as Traveloka.
Abstract: In the era of digitalization, KAI finally transformed train ticket booking transactions by launching the KAI Access application. By applying the optimal strategy of Game Theory, KAI Access will be able to compete with marketplaces such as Traveloka to be able to represent itself to the community as a credible company. Game Theory was chosen to analyze the existing competition to find the optimum strategy. Because both applications sell the same product, the strategy will be seen from the services provided by both applications using the usability of HCI. The 5 strategies are Learnability, Efficiency, Memorability, Errors, and Satisfaction. The best strategy used by KAI Access to obtain maximum profit is the low error rate (errors) strategy, while for Traveloka the strategy used to minimize loss is the ease of application (learnability).
TL;DR: IndexPen is introduced, a novel interaction technique for text input through two-finger in-air micro-gestures, enabling touch-free, effortless, tracking-based interaction, designed to mirror real-world writing.
Abstract: In this paper, we introduce IndexPen, a novel interaction technique for text input through two-finger in-air micro-gestures, enabling touch-free, effortless, tracking-based interaction, designed to mirror real-world writing. Our system is based on millimeter-wave radar sensing, and does not require instrumentation on the user. IndexPen can successfully identify 30 distinct gestures, representing the letters A-Z, as well as Space, Backspace, Enter, and a special Activation gesture to prevent unintentional input. Additionally, we include a noise class to differentiate gesture and non-gesture noise. We present our system design, including the radio frequency (RF) processing pipeline, classification model, and real-time detection algorithms. We further demonstrate our proof-of-concept system with data collected over ten days with five participants yielding 95.89% cross-validation accuracy on 31 classes (including noise). Moreover, we explore the learnability and adaptability of our system for real-world text input with 16 participants who are first-time users to IndexPen over five sessions. After each session, the pre-trained model from the previous five-user study is calibrated on the data collected so far for a new user through transfer learning. The F-1 score showed an average increase of 9.14% per session with the calibration, reaching an average of 88.3% on the last session across the 16 users. Meanwhile, we show that the users can type sentences with IndexPen at 86.2% accuracy, measured by string similarity. This work builds a foundation and vision for future interaction interfaces that could be enabled with this paradigm.
TL;DR: A usability design framework for m-government applications is proposed based on the user interface redesign of the EarthquakeTMD application and it was found that the citizens preferred the new user interfaces designed using the framework.
Abstract: The existing usability models have been used primarily for evaluation, not for usability engineering. The models were found to be general for specific mobile applications. They also lack appropriate guidelines to apply the usability models to m-government applications. Earthquake information is an example of critical information delivered to citizens via m-government applications. Usability design is considered a very important key factor to the success of such applications. This research addresses the challenges in finding the usability factors important to m-government applications and choosing appropriate factors for specific m-government applications. A questionnaire was administered to 49 citizens. The results include six usability factors which are learnability, simplicity, satisfaction, security, privacy, and memorability. Descriptions of the usability factors were later added to provide a clearer definition for each factor. This paper proposes the usability design framework for m-government applications. The use of the framework was illustrated based on the user interface redesign of the EarthquakeTMD application. The main aim was to demonstrate the applicability of the framework. The quality of the original UI design of the application in the case study was assessed with a questionnaire which was administered to 57 Thai citizens who lived in the areas affected by the disasters. Four designers participated in UI redesigning and produced four different UI designs. The new UI designs were evaluated via two usability tests on two sample groups of representative users. The first usability test was conducted with 24 participants. Twenty-four test cases were used. The second usability test was conducted with 351 representative users. After the tests, both sample groups were given a questionnaire based on the SUS (System Usability Scale). 
The same two UI designs by experienced and inexperienced designers who used the framework received the highest scores: 89.58 and 87.60 on the first usability test. They also received the highest score on the second usability test: 89.10 and 90.88. The results reveal that the citizens preferred the new user interfaces designed using the framework. It was found that the scores of the UI designed by inexperienced designers who used the framework were as high as the scores of the UI designed by experienced designers, whereas the UI designs from the designers who did not use the framework received the lowest scores: 63.23 and 54.27 on the first usability testing and 59.34 and 46.53 on the second usability testing.
TL;DR: In this article, the authors introduce an artificial language learning methodology to investigate the existence of universal constraints on person systems and report the results of three experiments that inform these theoretical approaches by generating behavioral evidence for the impact of constraints on the learnability of different person partitions.
Abstract: Person systems convey the roles entities play in the context of speech (e.g., speaker, addressee). As with other linguistic category systems, not all ways of partitioning the person space are equally likely crosslinguistically. Different theories have been proposed to constrain the set of possible person partitions that humans can represent, explaining their typological distribution. This article introduces an artificial language learning methodology to investigate the existence of universal constraints on person systems. We report the results of three experiments that inform these theoretical approaches by generating behavioral evidence for the impact of constraints on the learnability of different person partitions. Our findings constitute the first experimental evidence for learnability differences in this domain.
TL;DR: In this article, the authors examined the behavior of students towards the use of imported educational games and found that culture, language, animation, and interaction contribute heavily to benefiting from educational games; these factors should therefore be carefully considered in educational game design to facilitate and ensure children's learning.
Abstract: Educational games have been employed among Omani schools but those used by local Omani schools were imported and were mostly designed based on western contexts. For Omani children, these games may be culturally inappropriate and difficult to comprehend and follow, impeding children’s learning. Three questionnaires and one observational checklist were used to gather data from 40 respondents (observers). SPSS was used in data analysis. Through experiments, the behavior of Omani students towards the use of imported educational games was examined. Five main factors, namely, efficiency, learnability, memorability, errors, and satisfaction, of educational games for a target user were measured using Hybrid User Evaluation Methodology for Remote Evaluation (HUEMRE), Training Framework for Untrained Observer (TFUO), and Framework on Educational Games Behavior Intention (EGsBI), which are specifically designed frameworks for this purpose. The results of this study explained that the Omani children are facing difficulties in using the imported educational games; furthermore, this study proves that culture, language, animation, and interaction are contributing heavily to benefiting from educational games, and, therefore, these factors shall be highly considered in the process of educational games design to facilitate and ensure children learning; furthermore, the findings of this study enrich the comprehension of how the specified factors positively affect behavioral intention of Omani students in the use of educational games and in improving the behavior intention level of these students.
TL;DR: In this paper, the authors measured the performance and perceived workload of participants driving a robot through a pick-and-place task in Virtual Reality (VR) via controller buttons or physical actions.
Abstract: A valid Human-Robot Interaction (HRI) should be effective for the majority of the population. However, gender, gaming experience, or other individual factors are often likely to affect users' performance when interacting with a robot. In the present study, we measured the performance and perceived workload of participants driving a robot through a pick-and-place task in Virtual Reality (VR) via controller buttons or physical actions. The following individual factors were considered in the analysis: gaming experience, gender, learnability skills, problem solving, and trust in technology. Results showed that all the individual factors considered affected either performance or perceived demand, but only when guiding the robot via controller buttons. Our findings foster the adoption of more natural ways of teleoperating robots, such as by physical actions, as they proved to be exempt from the influence of individual factors and are likely to be effective for a broader section of the population.
TL;DR: This paper shows that the ERM principle fails to explain the learnability of partial concept classes, and demonstrates that the sample compression conjecture of Littlestone and Warmuth fails in this setting.
Abstract: We extend the classical theory of PAC learning in a way which allows us to model a rich variety of practical learning tasks where the data satisfy special properties that ease the learning process. For example, tasks where the distance of the data from the decision boundary is bounded away from zero, or tasks where the data lie on a lower dimensional surface. The basic and simple idea is to consider partial concepts: these are functions that can be undefined on certain parts of the space. When learning a partial concept, we assume that the source distribution is supported only on points where the partial concept is defined. This way, one can naturally express assumptions on the data such as lying on a lower dimensional surface, or that it satisfies margin conditions. In contrast, it is not at all clear that such assumptions can be expressed by the traditional PAC theory using learnable total concept classes, and in fact we exhibit easy-to-learn partial concept classes which provably cannot be captured by the traditional PAC theory. This also resolves, in a strong negative sense, a question posed by Attias, Kontorovich, and Mansour (2019). We characterize PAC learnability of partial concept classes and reveal an algorithmic landscape which is fundamentally different than the classical one. For example, in the classical PAC model, learning boils down to Empirical Risk Minimization (ERM). This basic principle follows from Uniform Convergence and the Fundamental Theorem of PAC Learning (Vapnik and Chervonenkis, 1971, 1974b; Blumer, Ehrenfeucht, Haussler, and Warmuth, 1989; Hodges, 1993). In stark contrast, we show that the ERM principle fails spectacularly in explaining learnability of partial concept classes. In fact, we demonstrate classes that are incredibly easy to learn, but such that any algorithm that learns them must use a hypothesis space with unbounded VC dimension.
We also find that the sample compression conjecture of Littlestone and Warmuth fails in this setting. Our impossibility results hinge on the recent breakthroughs in communication complexity and graph theory by Göös (2015); Ben-David, Hatami, and Tal (2017); Balodis, Ben-David, Göös, Jain, and Kothari (2021). Thus, this theory features problems that cannot be represented in the traditional way and cannot be solved in the traditional way. We view this as evidence that it might provide insights on the nature of learnability in realistic scenarios which the classical theory fails to explain. We include in the paper suggestions for future research and open problems in several contexts, including combinatorics, geometry, and learning theory.
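The margin example from this abstract can be made concrete with a small sketch. It is only an illustration under stated assumptions: the function name `margin_halfspace`, its parameters, and the use of `None` to stand for "undefined" are hypothetical choices made here, not notation taken from the paper.

```python
from typing import Callable, Optional, Sequence


def margin_halfspace(
    w: Sequence[float], b: float, gamma: float
) -> Callable[[Sequence[float]], Optional[int]]:
    """Build a partial concept from a halfspace with margin gamma (hypothetical helper).

    The returned function gives 1 or 0 for points at distance >= gamma from the
    hyperplane w.x + b = 0 and None (undefined) for points inside the margin.
    """
    norm = sum(wi * wi for wi in w) ** 0.5

    def concept(x: Sequence[float]) -> Optional[int]:
        # Signed Euclidean distance from x to the hyperplane.
        dist = (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        if abs(dist) < gamma:
            # Undefined region: the source distribution is assumed to put no mass here.
            return None
        return 1 if dist > 0 else 0

    return concept


h = margin_halfspace(w=[1.0, 0.0], b=0.0, gamma=0.5)
print(h([2.0, 3.0]))   # 1
print(h([-1.0, 0.0]))  # 0
print(h([0.1, 5.0]))   # None
```

Encoding the margin as an undefined region, instead of restricting the class of distributions, is what lets the data assumption live inside the concept class itself.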
TL;DR: In this article, the authors test the hypothesis that the extent to which a model is affected by an unseen textual perturbation (robustness) can be explained by the learnability of the perturbation.
Abstract: Modern Natural Language Processing (NLP) models are known to be sensitive to input perturbations and their performance can decrease when applied to real-world, noisy data. However, it is still unclear why models are less robust to some perturbations than others. In this work, we test the hypothesis that the extent to which a model is affected by an unseen textual perturbation (robustness) can be explained by the learnability of the perturbation (defined as how well the model learns to identify the perturbation with a small amount of evidence). We further give a causal justification for the learnability metric. We conduct extensive experiments with four prominent NLP models — TextRNN, BERT, RoBERTa and XLNet — over eight types of textual perturbations on three datasets. We show that a model which is better at identifying a perturbation (higher learnability) becomes worse at ignoring such a perturbation at test time (lower robustness), providing empirical support for our hypothesis.
TL;DR: This paper introduces IndexPen, a novel interaction technique for text input through two-finger in-air micro-gestures that enables touch-free, effortless, tracking-based interaction designed to mirror real-world writing.
Abstract: In this paper, we introduce IndexPen, a novel interaction technique for text input through two-finger in-air micro-gestures, enabling touch-free, effortless, tracking-based interaction, designed to mirror real-world writing. Our system is based on millimeter-wave radar sensing, and does not require instrumentation on the user. IndexPen can successfully identify 30 distinct gestures, representing the letters A-Z, as well as Space, Backspace, Enter, and a special Activation gesture to prevent unintentional input. Additionally, we include a noise class to differentiate gesture and non-gesture noise. We present our system design, including the radio frequency (RF) processing pipeline, classification model, and real-time detection algorithms. We further demonstrate our proof-of-concept system with data collected over ten days with five participants yielding 95.89% cross-validation accuracy on 31 classes (including noise). Moreover, we explore the learnability and adaptability of our system for real-world text input with 16 participants who are first-time users of IndexPen over five sessions. After each session, the pre-trained model from the previous five-user study is calibrated on the data collected so far for a new user through transfer learning. The F-1 score showed an average increase of 9.14% per session with the calibration, reaching an average of 88.3% on the last session across the 16 users. Meanwhile, we show that the users can type sentences with IndexPen at 86.2% accuracy, measured by string similarity. This work builds a foundation and vision for future interaction interfaces that could be enabled with this paradigm.
TL;DR: In this paper, the authors argue that phonology offers a unique test case for distinguishing historical and cognitive influences on grammar, and propose an experimental technique for testing the cognitive factor which controls for the historical factor.
Abstract: Distinguishing cognitive influences from historical influences on human behavior has long been a disputed topic in behavioral sciences, including linguistics. The discussion is often complicated due to empirical evidence being consistent with both the cognitive and the historical approach. This article argues that phonology offers a unique test case for distinguishing historical and cognitive influences on grammar, and it proposes an experimental technique for testing the cognitive factor which controls for the historical factor. The article outlines a model called catalysis for explaining how learnability influences phonological typology and presents experiments that simulate this process. Central to this discussion are unnatural phonological processes, that is, those that operate against universal phonetic tendencies and require complex historical trajectories in order to arise. By using statistical methods for estimating historical influences, mismatches in predictions between the cognitive and historical approaches to typology can be identified. By conducting artificial grammar learning experiments on processes for which the historical approach makes predictions that differ from those of the cognitive approach, the experimental technique proposed in this article controls for historical influences while testing cognitive factors. Results of online and fieldwork experiments on two languages, English and Slovenian, show that subjects prefer postnasal devoicing over postnasal fricative occlusion and devoicing in at least a subset of places of articulation, which aligns with the observed typology. The advantage of the proposed approach over existing experimental work is that it experimentally confirms a link between synchronic preferences and typology that is most likely not influenced by historical biases. Results suggest that complexity avoidance is the primary influence cognitive bias has on phonological systems in human languages. 
Applying this technique to further alternations should yield new information about those cognitive properties of phonological grammar that are not conflated with historical influences.
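TL;DR: The authors argue that, before concluding that language learning requires more prior domain-specific knowledge than current models possess, researchers must first explore non-linguistic inputs in the form of multimodal stimuli and multi-agent interaction as ways to make model learners more efficient at learning from limited linguistic input.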
Abstract: Rapid progress in machine learning for natural language processing has the potential to transform debates about how humans learn language. However, the learning environments and biases of current artificial learners and humans diverge in ways that weaken the impact of the evidence obtained from learning simulations. For example, today's most effective neural language models are trained on roughly one thousand times the amount of linguistic data available to a typical child. To increase the relevance of learnability results from computational models, we need to train model learners without significant advantages over humans. If an appropriate model successfully acquires some target linguistic knowledge, it can provide a proof of concept that the target is learnable in a hypothesized human learning scenario. Plausible model learners will enable us to carry out experimental manipulations to make causal inferences about variables in the learning environment, and to rigorously test poverty-of-the-stimulus-style claims arguing for innate linguistic knowledge in humans on the basis of speculations about learnability. Comparable experiments will never be possible with human subjects due to practical and ethical considerations, making model learners an indispensable resource. So far, attempts to deprive current models of unfair advantages obtain sub-human results for key grammatical behaviors such as acceptability judgments. But before we can justifiably conclude that language learning requires more prior domain-specific knowledge than current models possess, we must first explore non-linguistic inputs in the form of multimodal stimuli and multi-agent interaction as ways to make our learners more efficient at learning from limited linguistic input.
TL;DR: This article shows that a hypothesis class is k-list learnable, that is, learnable by an algorithm that outputs a list of k predictions, if and only if its k-DS dimension is finite.
Abstract: A classical result in learning theory shows the equivalence of PAC learnability of binary hypothesis classes and the finiteness of VC dimension. Extending this to the multiclass setting was an open problem, which was settled in a recent breakthrough result characterizing multiclass PAC learnability via the DS dimension introduced earlier by Daniely and Shalev-Shwartz. In this work we consider list PAC learning where the goal is to output a list of k predictions. List learning algorithms have been developed in several settings before and indeed, list learning played an important role in the recent characterization of multiclass learnability. In this work we ask: when is it possible to k-list learn a hypothesis class? We completely characterize k-list learnability in terms of a generalization of DS dimension that we call the k-DS dimension. Generalizing the recent characterization of multiclass learnability, we show that a hypothesis class is k-list learnable if and only if the k-DS dimension is finite.
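For orientation, the list-learning objective described in this abstract can be written out as below. This is the standard formulation of list prediction error, not text quoted from the paper; the symbols $\mu$, $\mathcal{X}$, $\mathcal{Y}$, and $\mathcal{D}$ are notation assumed here for illustration.

```latex
% A k-list predictor maps each input to a set of at most k candidate labels.
\mu : \mathcal{X} \to \{\, S \subseteq \mathcal{Y} : |S| \le k \,\}

% Its error under a distribution D over labeled examples is the probability
% that the true label is missing from the predicted list.
L_{\mathcal{D}}(\mu) = \Pr_{(x,y) \sim \mathcal{D}} \bigl[\, y \notin \mu(x) \,\bigr]
```

Setting $k = 1$ recovers the usual multiclass error, so k-list learnability is a relaxation of multiclass PAC learnability.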