TL;DR: In this paper, the authors propose a measure to estimate domain similarity via Earth Mover's Distance and demonstrate that transfer learning benefits from pre-training on a source domain that is similar to the target domain by this measure.
Abstract: Transferring the knowledge learned from large scale datasets (e.g., ImageNet) via fine-tuning offers an effective solution for domain-specific fine-grained visual categorization (FGVC) tasks (e.g., recognizing bird species or car make & model). In such scenarios, data annotation often calls for specialized domain knowledge and thus is difficult to scale. In this work, we first tackle a problem in large scale FGVC. Our method won first place in iNaturalist 2017 large scale species classification challenge. Central to the success of our approach is a training scheme that uses higher image resolution and deals with the long-tailed distribution of training data. Next, we study transfer learning via fine-tuning from large scale datasets to small scale, domain-specific FGVC datasets. We propose a measure to estimate domain similarity via Earth Mover's Distance and demonstrate that transfer learning benefits from pre-training on a source domain that is similar to the target domain by this measure. Our proposed transfer learning outperforms ImageNet pre-training and obtains state-of-the-art results on multiple commonly used FGVC datasets.
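To make the measure concrete, here is a minimal sketch of Earth Mover's Distance in its simplest setting: two histograms over the same equally spaced bins, where the distance reduces to the accumulated gap between the two cumulative distributions. The function names, the unit bin spacing, and the exponential mapping from distance to similarity (with a hypothetical `gamma`) are illustrative assumptions. The abstract does not say how the domains are represented or how the distance is converted into a similarity score.

```python
import math


def emd_1d(p, q):
    """Earth Mover's Distance between two histograms on the same unit-spaced bins.

    Both histograms are normalized to total mass 1 before comparison.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("histograms must have positive total mass")

    carried = 0.0  # mass that still has to be moved past the current bin
    work = 0.0     # accumulated (mass x distance)
    for p_i, q_i in zip(p, q):
        carried += p_i / total_p - q_i / total_q
        work += abs(carried)
    return work


def domain_similarity(p, q, gamma=1.0):
    """Hypothetical similarity score: decays exponentially with the EMD."""
    return math.exp(-gamma * emd_1d(p, q))


if __name__ == "__main__":
    source = [1, 0, 0]
    target = [0, 0, 1]
    print(emd_1d(source, target))             # all mass moves two bins
    print(domain_similarity(source, source))  # identical domains
```

In higher-dimensional feature spaces the distance is normally obtained by solving a transportation problem rather than by this one-dimensional shortcut.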
TL;DR: Through arming the DNN with better capability of harnessing both the feature and the class relationships, the proposed regularized DNN (rDNN) is more suitable for modeling video semantics.
Abstract: In this paper, we study the challenging problem of categorizing videos according to high-level semantics such as the existence of a particular human action or a complex event. Although extensive efforts have been devoted in recent years, most existing works combined multiple video features using simple fusion strategies and neglected the utilization of inter-class semantic relationships. This paper proposes a novel unified framework that jointly exploits the feature relationships and the class relationships for improved categorization performance. Specifically, these two types of relationships are estimated and utilized by imposing regularizations in the learning process of a deep neural network (DNN). Through arming the DNN with better capability of harnessing both the feature and the class relationships, the proposed regularized DNN (rDNN) is more suitable for modeling video semantics. We show that rDNN produces better performance over several state-of-the-art approaches. Competitive results are reported on the well-known Hollywood2 and Columbia Consumer Video benchmarks. In addition, to stimulate future research on large scale video categorization, we collect and release a new benchmark dataset, called FCVID, which contains 91,223 Internet videos and 239 manually annotated categories.
TL;DR: RotationNet as discussed by the authors takes multi-view images of an object as input and jointly estimates its pose and object category, which is useful in practical scenarios where only partial views are available.
Abstract: We propose a Convolutional Neural Network (CNN)-based model "RotationNet," which takes multi-view images of an object as input and jointly estimates its pose and object category. Unlike previous approaches that use known viewpoint labels for training, our method treats the viewpoint labels as latent variables, which are learned in an unsupervised manner during the training using an unaligned object dataset. RotationNet is designed to use only a partial set of multi-view images for inference, and this property makes it useful in practical scenarios where only partial views are available. Moreover, our pose alignment strategy enables one to obtain view-specific feature representations shared across classes, which is important to maintain high accuracy in both object categorization and pose estimation. Effectiveness of RotationNet is demonstrated by its superior performance to the state-of-the-art methods of 3D object classification on 10- and 40-class ModelNet datasets. We also show that RotationNet, even trained without known poses, achieves the state-of-the-art performance on an object pose estimation dataset.
TL;DR: The results provide an understanding of the relative difficulty of the scenarios and that simple baselines (Adagrad, L2 regularization, and naive rehearsal strategies) can surprisingly achieve similar performance to current mainstream methods.
Abstract: Continual learning has received a great deal of attention recently with several approaches being proposed. However, evaluations involve a diverse set of scenarios making meaningful comparison difficult. This work provides a systematic categorization of the scenarios and evaluates them within a consistent framework including strong baselines and state-of-the-art methods. The results provide an understanding of the relative difficulty of the scenarios and that simple baselines (Adagrad, L2 regularization, and naive rehearsal strategies) can surprisingly achieve similar performance to current mainstream methods. We conclude with several suggestions for creating harder evaluation scenarios and future research directions. The code is available online.
TL;DR: A number of outstanding questions that need to be addressed on this topic are identified and next steps for the field are suggested.
Abstract: Sound symbolism refers to an association between phonemes and stimuli containing particular perceptual and/or semantic elements (e.g., objects of a certain size or shape). Some of the best-known examples include the mil/mal effect (Sapir, Journal of Experimental Psychology, 12, 225–239, 1929) and the maluma/takete effect (Kohler, 1929). Interest in this topic has been on the rise within psychology, and studies have demonstrated that sound symbolic effects are relevant for many facets of cognition, including language, action, memory, and categorization. Sound symbolism also provides a mechanism by which words’ forms can have nonarbitrary, iconic relationships with their meanings. Although various proposals have been put forth for how phonetic features (both acoustic and articulatory) come to be associated with stimuli, there is as yet no generally agreed-upon explanation. We review five proposals: statistical co-occurrence between phonetic features and associated stimuli in the environment; a shared property among phonetic features and stimuli; neural factors; species-general, evolved associations; and patterns extracted from language. We identify a number of outstanding questions that need to be addressed on this topic and suggest next steps for the field.
TL;DR: This work provides a comprehensive survey and novel categorization of the feature selection techniques that have been created for the multi-label classification setting and concludes with concrete suggestions for future research in multi-label feature selection.
Abstract: In many important application domains such as text categorization, biomolecular analysis, scene classification and medical diagnosis, examples are naturally associated with more than one class label, giving rise to multi-label classification problems. This fact has led, in recent years, to a substantial amount of research on feature selection methods that allow the identification of relevant and informative features for multi-label classification. However, the methods proposed for this task are scattered in the literature, with no common framework to describe them and to allow an objective comparison. Here, we revisit a categorization of existing multi-label classification methods and, as our main contribution, we provide a comprehensive survey and novel categorization of the feature selection techniques that have been created for the multi-label classification setting. We conclude this work with concrete suggestions for future research in multi-label feature selection which have been derived from our categorization and analysis.
TL;DR: The results indicate that the VMPFC and portions of the hippocampus play a broad role in memory generalization and that they do so by representing abstract information integrated from multiple events, contributing novel evidence of generalized concept representations in the brain.
Abstract: Memory function involves both the ability to remember details of individual experiences and the ability to link information across events to create new knowledge. Prior research has identified the ventromedial prefrontal cortex (VMPFC) and the hippocampus as important for integrating across events in the service of generalization in episodic memory. The degree to which these memory integration mechanisms contribute to other forms of generalization, such as concept learning, is unclear. The present study used a concept-learning task in humans (both sexes) coupled with model-based fMRI to test whether VMPFC and hippocampus contribute to concept generalization, and whether they do so by maintaining specific category exemplars or abstract category representations. Two formal categorization models were fit to individual subject data: a prototype model that posits abstract category representations and an exemplar model that posits category representations based on individual category members. Latent variables from each of these models were entered into neuroimaging analyses to determine whether VMPFC and the hippocampus track prototype or exemplar information during concept generalization. Behavioral model fits indicated that almost three-quarters of the subjects relied on prototype information when making judgments about new category members. Paralleling prototype dominance in behavior, correlates of the prototype model were identified in VMPFC and the anterior hippocampus with no significant exemplar correlates. These results indicate that the VMPFC and portions of the hippocampus play a broad role in memory generalization and that they do so by representing abstract information integrated from multiple events. SIGNIFICANCE STATEMENT: Whether people represent concepts as a set of individual category members or by deriving generalized concept representations abstracted across exemplars has been debated. In episodic memory, generalized memory representations have been shown to arise through integration across events supported by the ventromedial prefrontal cortex (VMPFC) and hippocampus. The current study combined formal categorization models with fMRI data analysis to show that the VMPFC and anterior hippocampus represent abstract prototype information during concept generalization, contributing novel evidence of generalized concept representations in the brain. Results indicate that VMPFC-hippocampal memory integration mechanisms contribute to knowledge generalization across multiple cognitive domains, with the degree of abstraction of memory representations varying along the long axis of the hippocampus.
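The contrast between the two model families can be sketched in a few lines. The code below is a generic textbook-style illustration, not the models fitted in the study: the city-block distance, the exponential similarity function, the sensitivity parameter `c`, and the ratio choice rule are all assumptions made for the example.

```python
import math


def _similarity(a, b, c=1.0):
    """Similarity decays exponentially with city-block distance (assumed form)."""
    distance = sum(abs(x - y) for x, y in zip(a, b))
    return math.exp(-c * distance)


def _prototype(members):
    """Feature-wise mean of the category members."""
    return [sum(column) / len(members) for column in zip(*members)]


def prototype_prob_a(stimulus, cat_a, cat_b, c=1.0):
    """P(respond A) when each category is summarized by one abstract prototype."""
    sim_a = _similarity(stimulus, _prototype(cat_a), c)
    sim_b = _similarity(stimulus, _prototype(cat_b), c)
    return sim_a / (sim_a + sim_b)


def exemplar_prob_a(stimulus, cat_a, cat_b, c=1.0):
    """P(respond A) when similarity is summed over every stored category member."""
    sim_a = sum(_similarity(stimulus, member, c) for member in cat_a)
    sim_b = sum(_similarity(stimulus, member, c) for member in cat_b)
    return sim_a / (sim_a + sim_b)


if __name__ == "__main__":
    cat_a = [[0.0, 0.0], [0.0, 1.0]]
    cat_b = [[1.0, 1.0], [1.0, 0.0]]
    new_item = [0.0, 0.5]
    print(prototype_prob_a(new_item, cat_a, cat_b))
    print(exemplar_prob_a(new_item, cat_a, cat_b))
```

Fitting either model to behavior would mean choosing `c` (and possibly per-feature weights) to maximize the likelihood of each subject's responses. That step is omitted here.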
TL;DR: An approach to learn and combine multimodal data representations for music genre classification is proposed, and a proposed approach for dimensionality reduction of target labels yields major improvements in multi-label classification.
Abstract: Music genre labels are useful to organize songs, albums, and artists into broader groups that share similar musical characteristics. In this work, an approach to learn and combine multimodal data representations for music genre classification is proposed. Intermediate representations of deep neural networks are learned from audio tracks, text reviews, and cover art images, and further combined for classification. Experiments on single and multi-label genre classification are then carried out, evaluating the effect of the different learned representations and their combinations. Results on both experiments show how the aggregation of learned representations from different modalities improves the accuracy of the classification, suggesting that different modalities embed complementary information. In addition, the learning of a multimodal feature space increases the performance of pure audio representations, which may be especially relevant when the other modalities are available for training, but not at prediction time. Moreover, a proposed approach for dimensionality reduction of target labels yields major improvements in multi-label classification not only in terms of accuracy, but also in terms of the diversity of the predicted genres, which implies a more fine-grained categorization. Finally, a qualitative analysis of the results sheds some light on the behavior of the different modalities on the classification task.
TL;DR: Three application areas are pointed out: optimal design and operation of flexible processes using demand and price forecasts, sustainability analysis and process design using hybrid methods, and accounting for the feedback effects of breakthrough technologies.
Abstract: Energy is a key driver of the modern economy, therefore modeling and simulation of energy systems has received significant research attention. We review the major developments in this area and propose two ways to categorize the diverse contributions. The first categorization is according to the modeling approach, namely into computational, mathematical, and physical models. With this categorization, we highlight certain novel hybrid approaches that combine aspects of the different groups proposed. The second categorization is according to field, namely Process Systems Engineering (PSE) and Energy Economics (EE). We use the following criteria to illustrate the differences: the nature of variables, theoretical underpinnings, level of technological aggregation, spatial and temporal scales, and model purposes. Traditionally, the Process Systems Engineering approach models the technological characteristics of the energy system endogenously. However, the energy system is situated in a broader economic context that includes several stakeholders both within the energy sector and in other economic sectors. Complex relationships and feedback effects exist between these stakeholders, which may have a significant impact on strategic, tactical, and operational decision-making. Leveraging the expertise built in the Energy Economics field on modeling these complexities may be valuable to process systems engineers. With this categorization, we present the interactions between the two fields, and make the case for combining the two approaches. We point out three application areas: (1) optimal design and operation of flexible processes using demand and price forecasts, (2) sustainability analysis and process design using hybrid methods, and (3) accounting for the feedback effects of breakthrough technologies. These three examples highlight the value of combining Process Systems Engineering and Energy Economics models to get a holistic picture of the energy system in a wider economic and policy context.
TL;DR: This survey aims to provide a comprehensive overview of the online machine learning literature through a systematic review of basic ideas and key principles and a proper categorization of different algorithms and techniques.
Abstract: Online learning represents an important family of machine learning algorithms, in which a learner attempts to resolve an online prediction (or any type of decision-making) task by learning a model/hypothesis from a sequence of data instances one at a time. The goal of online learning is to ensure that the online learner would make a sequence of accurate predictions (or correct decisions) given the knowledge of correct answers to previous prediction or learning tasks and possibly additional information. This is in contrast to many traditional batch learning or offline machine learning algorithms that are often designed to train a model in batch from a given collection of training data instances. This survey aims to provide a comprehensive survey of the online machine learning literatures through a systematic review of basic ideas and key principles and a proper categorization of different algorithms and techniques. Generally speaking, according to the learning type and the forms of feedback information, the existing online learning works can be classified into three major categories: (i) supervised online learning where full feedback information is always available, (ii) online learning with limited feedback, and (iii) unsupervised online learning where there is no feedback available. Due to space limitation, the survey will be mainly focused on the first category, but also briefly cover some basics of the other two categories. Finally, we also discuss some open issues and attempt to shed light on potential future research directions in this field.
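As one concrete instance of the first category (supervised online learning with full feedback), the sketch below runs the classic perceptron over a stream of labelled instances, one at a time. It is only an illustration of the predict-then-update protocol described above. The function names, the learning rate, and the toy stream are assumptions, and the survey covers many other algorithms.

```python
def predict(weights, bias, x):
    """Return +1 or -1 for one instance using the current linear model."""
    score = sum(w * x_i for w, x_i in zip(weights, x)) + bias
    return 1 if score >= 0 else -1


def online_perceptron(stream, n_features, lr=1.0):
    """Process (x, y) pairs one at a time; y must be +1 or -1.

    The learner predicts first, then sees the true label (full feedback)
    and updates only when its prediction was wrong.
    """
    weights = [0.0] * n_features
    bias = 0.0
    mistakes = 0
    for x, y in stream:
        if predict(weights, bias, x) != y:
            mistakes += 1
            weights = [w + lr * y * x_i for w, x_i in zip(weights, x)]
            bias += lr * y
    return weights, bias, mistakes


if __name__ == "__main__":
    # Toy, linearly separable stream repeated a few times.
    stream = [([1.0, 0.0], 1), ([0.0, 1.0], -1)] * 5
    weights, bias, mistakes = online_perceptron(stream, n_features=2)
    print(weights, bias, mistakes)
```

Unlike a batch learner, the model never revisits earlier instances. Its quality is usually judged by the number of mistakes made along the sequence.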
TL;DR: It is demonstrated how mouse-tracking can further the theoretical understanding by highlighting research in two domains - social categorization and self-control.
TL;DR: A new deep FGVC model termed MetaFGNet is proposed, based on a novel regularized meta-learning objective, which aims to guide the learning of network parameters so that they are optimal for adapting to the target FGVC task.
Abstract: Fine-grained visual categorization (FGVC) is challenging due in part to the fact that it is often difficult to acquire a sufficient number of training samples. To employ large models for FGVC without suffering from overfitting, existing methods usually adopt a strategy of pre-training the models using a rich set of auxiliary data, followed by fine-tuning on the target FGVC task. However, the objective of pre-training does not take the target task into account, and consequently such obtained models are suboptimal for fine-tuning. To address this issue, we propose in this paper a new deep FGVC model termed MetaFGNet. Training of MetaFGNet is based on a novel regularized meta-learning objective, which aims to guide the learning of network parameters so that they are optimal for adapting to the target FGVC task. Based on MetaFGNet, we also propose a simple yet effective scheme for selecting more useful samples from the auxiliary data. Experiments on benchmark FGVC datasets show the efficacy of our proposed method.
TL;DR: A new semi-supervised method for learning via web data that has the unique design of exploiting strong supervision, i.e., in addition to standard image-level labels, the method also utilizes detailed annotations including object bounding boxes and part landmarks.
Abstract: Learning visual representations from web data has recently attracted attention for object recognition. Previous studies have mainly focused on overcoming label noise and data bias and have shown promising results by learning directly from web data. However, we argue that it might be better to transfer knowledge from existing human labeling resources to improve performance at nearly no additional cost. In this paper, we propose a new semi-supervised method for learning via web data. Our method has the unique design of exploiting strong supervision, i.e., in addition to standard image-level labels, our method also utilizes detailed annotations including object bounding boxes and part landmarks. By transferring as much knowledge as possible from existing strongly supervised datasets to weakly supervised web images, our method can benefit from sophisticated object recognition algorithms and overcome several typical problems found in webly-supervised learning. We consider the problem of fine-grained visual categorization, in which existing training resources are scarce, as our main research objective. Comprehensive experimentation and extensive analysis demonstrate encouraging performance of the proposed approach, which, at the same time, delivers a new pipeline for fine-grained visual categorization that is likely to be highly effective for real-world applications.
TL;DR: This work proposes an approach which can effectively utilize the data in the upper levels to contribute to the categorization in the lower levels by applying the Convolutional Neural Network with a fine-tuning technique.
Abstract: We focus on the multi-label categorization task for short texts and explore the use of a hierarchical structure (HS) of categories. In contrast to existing work that uses a non-hierarchical flat model, the method leverages the hierarchical relations between the pre-defined categories to tackle the data sparsity problem. The lower the HS level, the worse the categorization performance, because the number of training data per category in a lower level is much smaller than that in an upper level. We propose an approach which can effectively utilize the data in the upper levels to contribute to the categorization in the lower levels by applying a Convolutional Neural Network (CNN) with a fine-tuning technique. The results on two benchmark datasets show that the proposed method, Hierarchical Fine-Tuning based CNN (HFT-CNN), is competitive with the state-of-the-art CNN-based methods.
TL;DR: A new evidential dynamical (ED) model based on Dempster–Shafer (D-S) evidence theory and quantum dynamical modelling is proposed, together with a dynamical decision-making framework.
Abstract: Categorization is necessary for many decision making tasks. However, the categorization process may interfere with the decision making result and bring about the disjunction fallacy. To predict the interference effect of categorization, some models based on quantum cognition theory have been proposed. In quantum dynamical models, like the quantum belief-action entanglement (BAE) model, actions and beliefs are deemed to be entangled. However, the entanglement degree is an artificially defined parameter. In this paper, a new evidential dynamical (ED) model based on Dempster–Shafer (D-S) evidence theory and quantum dynamical modelling is proposed. Considering that sometimes people hesitate to make a decision, it is reasonable to extend the action states by introducing an uncertain state. In an evidential framework, categorization can influence the uncertain state in actions. The interference effect is measured by handling the uncertain state while no extra parameter is defined artificially. The proposed model is applied to the classical categorization decision-making experiments. Compared with the existing models, the ED model has fewer free parameters than the classical quantum models, and it is more rational and simpler than an evidential Markov model. The model application results and discussions show the correctness and effectiveness of the ED model. Not only is the interference effect of categorization on decision making results explained and predicted, but an inspiring dynamical decision making framework is also proposed in this paper. We believe that the proposed ED model will bring more opportunities and will result in more applications in the future.
TL;DR: An exhaustive overview of measures used in current research is given, categorizing these methods by measurement level (physiological, behavioral, and cognitive) and emotional processing level (unconscious sensory, perceptual/early cognitive, and conscious/decision making) to help researchers compile a set of complementary measures (“toolbox”) for their studies.
TL;DR: It is found that human observers take attention-dependent uncertainty into account when categorizing visual stimuli and reporting their confidence in a task in which uncertainty is relevant for performance, and these decisions take into account uncertainty related to attention in an approximately Bayesian fashion.
Abstract: Perceptual decisions are better when they take uncertainty into account. Uncertainty arises not only from the properties of sensory input but also from cognitive sources, such as different levels of attention. However, it is unknown whether humans appropriately adjust for such cognitive sources of uncertainty during perceptual decision-making. Here we show that, in a task in which uncertainty is relevant for performance, human categorization and confidence decisions take into account uncertainty related to attention. We manipulated uncertainty in an orientation categorization task from trial to trial using only an attentional cue. The categorization task was designed to disambiguate decision rules that did or did not depend on attention. Using formal model comparison to evaluate decision behavior, we found that category and confidence decision boundaries shifted as a function of attention in an approximately Bayesian fashion. This means that the observer’s attentional state on each trial contributed probabilistically to the decision computation. This responsiveness of an observer’s decisions to attention-dependent uncertainty should improve perceptual decisions in natural vision, in which attention is unevenly distributed across a scene.
TL;DR: Mapping elucidated the literature behavior through three phases and showed an increase in publications with applications in recent years, indicating that most studies address evaluations in the agriculture and farming, banking and energy sectors and consider the facilities as transition elements between analysis periods.
TL;DR: It is suggested that speech development is a protracted process in which children’s increasing sensitivity to within-category detail in the signal enables increasingly sharp phonetic categories.
Abstract: The development of the ability to categorize speech sounds is often viewed as occurring primarily during infancy via perceptual learning mechanisms. However, a number of studies suggest that even after infancy, children's categories become more categorical and well defined through about age 12. We investigated the cognitive changes that may be responsible for such development using a visual world paradigm experiment based on McMurray, Tanenhaus, & Aslin (2002). Children from 3 age groups (7-8, 12-13, and 17-18 years) heard a token from either a b/p or s/ʃ continuum spanning 2 words (beach/peach, ship/sip) and selected its referent from a screen containing 4 pictures of potential lexical candidates. Eye movements to each object were monitored as a measure of how strongly children were committing to each candidate as perception unfolds in real-time. Results showed an ongoing sharpening of speech categories through 18, which was particularly apparent during the early stages of real-time perception. When the analysis specifically targeted within-category sensitivity to continuous detail, children exhibited increasingly gradient categories over development, suggesting that increasing sensitivity to fine-grained detail in the signal enables these more discrete categorizations. Together these results suggest that speech development is a protracted process in which children's increasing sensitivity to within-category detail in the signal enables increasingly sharp phonetic categories.
TL;DR: This study established predictive encoding models based on a deep residual network, and trained them to predict cortical responses to natural movies with high throughput and accuracy to characterize the cortical organization and representations of visual features for rapid categorization.
Abstract: The brain represents visual objects with topographic cortical patterns. To address how distributed visual representations enable object categorization, we established predictive encoding models based on a deep residual network, and trained them to predict cortical responses to natural movies. Using this predictive model, we mapped human cortical representations to 64,000 visual objects from 80 categories with high throughput and accuracy. Such representations covered both the ventral and dorsal pathways, reflected multiple levels of object features, and preserved semantic relationships between categories. In the entire visual cortex, object representations were organized into three clusters of categories: biological objects, non-biological objects, and background scenes. In a finer scale specific to each cluster, object representations revealed sub-clusters for further categorization. Such hierarchical clustering of category representations was mostly contributed by cortical representations of object features from middle to high levels. In summary, this study demonstrates a useful computational strategy to characterize the cortical organization and representations of visual features for rapid categorization.
TL;DR: This paper proposes a novel deep learning framework for domain generalization that exploits recent advances in deep domain adaptation and designs a convolutional neural network architecture with novel layers performing a weighted version of batch normalization.
Abstract: Traditional place categorization approaches in robot vision assume that training and test images have similar visual appearance. Therefore, any seasonal, illumination, and environmental changes typically lead to severe degradation in performance. To cope with this problem, recent works have been proposed to adopt domain adaptation techniques. While effective, these methods assume that some prior information about the scenario where the robot will operate is available at training time. Unfortunately, in many cases, this assumption does not hold, as we often do not know where a robot will be deployed. To overcome this issue, in this paper, we present an approach that aims at learning classification models able to generalize to unseen scenarios. Specifically, we propose a novel deep learning framework for domain generalization. Our method develops from the intuition that, given a set of different classification models associated to known domains (e.g., corresponding to multiple environments, robots), the best model for a new sample in the novel domain can be computed directly at test time by optimally combining the known models. To implement our idea, we exploit recent advances in deep domain adaptation and design a convolutional neural network architecture with novel layers performing a weighted version of batch normalization. Our experiments, conducted on three common datasets for robot place categorization, confirm the validity of our contribution.
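The abstract does not give the layer's formulation, so the following is only a guess at what "a weighted version of batch normalization" could look like for a single feature: the batch mean and variance are computed with per-sample weights, and every sample is then normalized with those weighted statistics. The function name, the interpretation of the weights (for instance, how strongly each sample is assigned to a given known domain), and the `eps` constant are all assumptions.

```python
import math


def weighted_batch_norm(values, weights, eps=1e-5):
    """Normalize one feature across a batch using weighted statistics.

    values  -- the feature's activation for each sample in the batch
    weights -- non-negative per-sample weights (hypothetical domain weights)
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    mean = sum(w * v for w, v in zip(weights, values)) / total
    var = sum(w * (v - mean) ** 2 for w, v in zip(weights, values)) / total
    scale = math.sqrt(var + eps)
    return [(v - mean) / scale for v in values]


if __name__ == "__main__":
    # Uniform weights reduce to ordinary batch normalization statistics.
    print(weighted_batch_norm([1.0, 2.0, 3.0], [1.0, 1.0, 1.0]))
    # Non-uniform weights pull the mean toward the heavily weighted sample.
    print(weighted_batch_norm([0.0, 4.0], [3.0, 1.0]))
```

A trainable layer would normally also apply a learned scale and shift after normalization. Both are left out to keep the sketch short.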
TL;DR: This article reviewed research on multiracial identity and perceptions of multi-acial individuals as two domains where researchers have documented evidence of the flexible nature of social identities and social categorization, and they provided evidence that studying multiiracial perceivers and targets helps reveal that race changes across situations, time, and depending on a number of top-down factors (e.g., expectations, stereotypes, and cultural norms).
Abstract: (Soc Personal Psychol Compass. 2018;1–15.) The majority of social perception research to date has focused on perceptually obvious and prototypical representations of social categories. However, not all people belong to social categories that are easily discernable. Within the past decade, there has been an upsurge of research demonstrating that multifaceted identities (both one's own and perceptions of others' identities) influence people to think about social categories in a more flexible manner. Here, we specifically review research on multiracial identity and perceptions of multiracial individuals as 2 domains where researchers have documented evidence of the flexible nature of social identities and social categorization. Integrating frameworks that argue race is a dynamic and interactive process, we provide evidence that studying multiracial perceivers and targets helps reveal that race changes across situations, time, and depending on a number of top‐down factors (e.g., expectations, stereotypes, and cultural norms). From the perspective of multiracial individuals as perceivers, we review research showing that flexible identity in multiracial individuals influences the process of social perception driven by a reduced belief in the essential nature of racial categories. From the perspective of multiracial individuals as targets, we review research showing that top‐down cues influence the racial categorization process. We further discuss emerging work that reveals that exposure to multiracial individuals influences beliefs surrounding the categorical (or noncategorical) nature of race itself. Needed directions for future work are discussed.
TL;DR: A framework is proposed for question classification using a grammar-based approach (GQCC) which exploits the structure of the questions; results show that GQCC with the J48 classifier outperforms other classification methods, reaching 90.1% accuracy.
Abstract: Question-answering has become one of the most popular information retrieval applications. Although most question-answering systems try to improve the user experience and the technology used in finding relevant results, many difficulties are still faced because of the continuous increase in the amount of web content. Question Classification (QC) plays an important role in question-answering systems, with one of the major tasks in the enhancement of the classification process being the identification of question types. A broad range of QC approaches has been proposed with the aim of helping to find a solution for the classification problems; most of these approaches are based on bag-of-words or dictionaries. In this research, we present an analysis of the different types of questions based on their grammatical structure. We identify different patterns and use machine learning algorithms to classify them. A framework is proposed for question classification using a grammar-based approach (GQCC) which exploits the structure of the questions. Our findings indicate that using syntactic categories related to different domain-specific types of Common Nouns, Numeral Numbers and Proper Nouns enables the machine learning algorithms to better differentiate between different question types. The paper presents a wide range of experiments; the results show that GQCC with the J48 classifier outperforms other classification methods, reaching 90.1% accuracy.
TL;DR: This paper investigates both detecting and categorizing anomalies rather than just detecting, which is a common trend in the contemporary research works, and argues that such categorization can be applied to multi-cloud environments using the same machine learning techniques.
Abstract: Recently, advances in machine learning techniques have attracted the attention of the research community to build intrusion detection systems (IDS) that can detect anomalies in the network traffic. Most of the research works, however, do not differentiate among different types of attacks. This is, in fact, necessary for appropriate countermeasures and defense against attacks. In this paper, we investigate both detecting and categorizing anomalies rather than just detecting, which is a common trend in the contemporary research works. We have used a popular publicly available dataset to build and test learning models for both detection and categorization of different attacks. To be precise, we have used two supervised machine learning techniques, namely linear regression (LR) and random forest (RF). We show that even if detection is perfect, categorization can be less accurate due to similarities between attacks. Our results demonstrate more than 99% detection accuracy and categorization accuracy of 93.6%, with the inability to categorize some attacks. Further, we argue that such categorization can be applied to multi-cloud environments using the same machine learning techniques.
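The distinction between detecting an anomaly and categorizing it can be made concrete with a small sketch. The function name, the label strings, and the choice to compute both scores over all samples are assumptions for illustration; the paper's own evaluation protocol may differ.

```python
def detection_and_categorization_accuracy(true_labels, predicted_labels, normal_label="normal"):
    """Return (detection accuracy, categorization accuracy) for labeled traffic samples.

    Detection only asks whether a sample is correctly flagged as attack vs. normal.
    Categorization asks whether the exact label (e.g. the attack type) is correct.
    """
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("label lists must be non-empty and of equal length")

    n = len(true_labels)
    pairs = list(zip(true_labels, predicted_labels))
    # A detection is correct when truth and prediction agree on attack-vs-normal.
    detected = sum((t != normal_label) == (p != normal_label) for t, p in pairs)
    # A categorization is correct only when the labels match exactly.
    categorized = sum(t == p for t, p in pairs)
    return detected / n, categorized / n


# Hypothetical labels: every attack is flagged, but one "probe" is mislabeled as "dos".
truth = ["normal", "dos", "probe", "dos"]
preds = ["normal", "dos", "dos", "dos"]
print(detection_and_categorization_accuracy(truth, preds))
```

In this toy example every attack is flagged as an attack, yet one attack receives the wrong type, so detection is perfect while categorization is not.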
TL;DR: It is suggested that acoustic categorization may precede attribution of emotion, highlighting the need to distinguish between the overt form of nonverbal signals and their interpretation by the perceiver.
Abstract: Recent research on human nonverbal vocalizations has led to considerable progress in our understanding of vocal communication of emotion. However, in contrast to studies of animal vocalizations, this research has focused mainly on the emotional interpretation of such signals. The repertoire of human nonverbal vocalizations as acoustic types, and the mapping between acoustic and emotional categories, thus remain underexplored. In a cross-linguistic naming task (Experiment 1), verbal categorization of 132 authentic (non-acted) human vocalizations by English-, Swedish- and Russian-speaking participants revealed the same major acoustic types: laugh, cry, scream, moan, and possibly roar and sigh. The association between call type and perceived emotion was systematic but non-redundant: listeners associated every call type with a limited, but in some cases relatively wide, range of emotions. The speed and consistency of naming the call type predicted the speed and consistency of inferring the caller’s emotion, suggesting that acoustic and emotional categorizations are closely related. However, participants preferred to name the call type before naming the emotion. Furthermore, nonverbal categorization of the same stimuli in a triad classification task (Experiment 2) was more compatible with classification by call type than by emotion, indicating the former’s greater perceptual salience. These results suggest that acoustic categorization may precede attribution of emotion, highlighting the need to distinguish between the overt form of nonverbal signals and their interpretation by the perceiver. Both within- and between-call acoustic variation can then be modeled explicitly, bringing research on human nonverbal vocalizations more in line with the work on animal communication.
TL;DR: This paper uses neural language models to produce word embeddings from large quantities of publicly available product data marked up with Microdata, which boost the performance of the feature extraction model, thus leading to better product matching and categorization performances.
Abstract: Consumers today have the option to purchase products from thousands of e-shops. However, the completeness of the product specifications and the taxonomies used for organizing the products differ across different e-shops. To improve the consumer experience, approaches for product integration on the Web are needed. In this paper, we present an approach that leverages deep learning techniques in combination with standard classification approaches for product matching and categorization. In our approach we use structured product data as supervision for training feature extraction models able to extract attribute-value pairs from textual product descriptions. To minimize the need for lots of data for supervision, we use neural language models to produce word embeddings from large quantities of publicly available product data marked up with Microdata, which boost the performance of the feature extraction model, thus leading to better product matching and categorization performances. Furthermore, we use a deep Convolutional Neural Network to produce image embeddings from product images, which further improve the results on both tasks.
TL;DR: The study provides a fundamental data set that should be of value for a wide variety of research purposes, including probing the statistical and psychological structure of a complex natural category domain, and developing a feature-space representation that can be used in combination with formal models of category learning to predict classification performance.
Abstract: This article reports data sets aimed at the development of a detailed feature-space representation for a complex natural category domain, namely 30 common subtypes of the categories of igneous, metamorphic, and sedimentary rocks. We conducted web searches to develop a library of 12 tokens each of the 30 subtypes, for a total of 360 rock pictures. In one study, subjects provided ratings along a set of 18 hypothesized primary dimensions involving visual characteristics of the rocks. In other studies, subjects provided similarity judgments among pairs of the rock tokens. Analyses are reported to validate the regularity and information value of the dimension ratings. In addition, analyses are reported that derive psychological scaling solutions from the similarity-ratings data and that interrelate the derived dimensions of the scaling solutions with the directly rated dimensions of the rocks. The stimulus set and various forms of ratings data, as well as the psychological scaling solutions, are made available on an online website (https://osf.io/w64fv/) associated with the article. The study provides a fundamental data set that should be of value for a wide variety of research purposes, including: (1) probing the statistical and psychological structure of a complex natural category domain, (2) testing models of similarity judgment, and (3) developing a feature-space representation that can be used in combination with formal models of category learning to predict classification performance in this complex natural category domain.
TL;DR: As early as 6 years of age, children demonstrated greater performance on the incidental categorization task following exposure to multisensory audiovisual cues compared to unisensory information.
Abstract: Multisensory information has been shown to modulate attention in infants and facilitate learning in adults, by enhancing the amodal properties of a stimulus. However, it remains unclear whether this translates to learning in a multisensory environment across middle childhood, and particularly in the case of incidental learning. One hundred and eighty-one children aged between 6 and 10 years participated in this study using a novel Multisensory Attention Learning Task (MALT). Participants were asked to respond to the presence of a target stimulus whilst ignoring distractors. Correct target selection resulted in the movement of the target exemplar to either the upper left or right screen quadrant, according to category membership. Category membership was defined either by visual-only, auditory-only or multisensory information. As early as 6 years of age, children demonstrated greater performance on the incidental categorization task following exposure to multisensory audiovisual cues compared to unisensory information. These findings provide important insight into the use of multisensory information in learning, and particularly on incidental category learning. Implications for the deployment of multisensory learning tasks within education across development will be discussed.
TL;DR: A neuronal population in pre-supplementary motor area is reported whose peak activity predicts the categorical decision boundary between long and short time intervals on a trial-by-trial basis, suggesting that the pre-SMA adaptively encodes subjective duration boundaries between short and long durations.
Abstract: Perceptual categorization depends on the assignment of different stimuli to specific groups based, in principle, on the notion of flexible categorical boundaries. To determine the neural basis of categorical boundaries, we record the activity of pre-SMA neurons of monkeys executing an interval categorization task in which the limit between short and long categories changes between blocks of trials within a session. A large population of cells encodes this boundary by reaching a constant peak of activity close to the corresponding subjective limit. Notably, the time at which this peak is reached changes according to the categorical boundary of the current block, predicting the monkeys’ categorical decision on a trial-by-trial basis. In addition, pre-SMA cells also represent the category selected by the monkeys and the outcome of the decision. These results suggest that the pre-SMA adaptively encodes subjective duration boundaries between short and long durations and contains crucial neural information to categorize intervals and evaluate the outcome of such perceptual decisions. Grouping stimuli into categories often depends on a subjective determination of category boundaries. Here the authors report a neuronal population in pre-supplementary motor area whose peak activity predicts the categorical decision boundary between long and short time intervals on a trial-by-trial basis.