Journal Article • DOI: 10.1016/J.NEUCOM.2017.01.078
A survey on data preprocessing for data stream mining
474
TL;DR: This survey summarizes, categorizes, and analyzes contributions on data preprocessing that cope with streaming data, and takes into account the existing relationships between the different families of methods (feature and instance selection, and discretization).
About: This article is published in Neurocomputing. The article was published on 24 May 2017. The article focuses on the topics: Data stream mining & Data pre-processing.
Citations
SMOTE for learning from imbalanced data: progress and challenges, marking the 15-year anniversary
TL;DR: The Synthetic Minority Oversampling Technique (SMOTE) preprocessing algorithm is considered the "de facto" standard in the framework of learning from imbalanced data because of the simplicity of its design, as well as its robustness when applied to different types of problems.
Learning under Concept Drift: A Review
TL;DR: A high quality, instructive review of current research developments and trends in the concept drift field is conducted, and a framework of learning under concept drift is established including three main components: concept drift detection, concept drift understanding, and concept drift adaptation.
995
Learning under Concept Drift: A Review
TL;DR: In this paper, the authors present a review of recent research in the field of concept drift and propose a framework of learning under concept drift. The survey focuses on the detection, understanding, and adaptation of concept drift in streaming data.
752
Machine learning for streaming data: state of the art, challenges, and opportunities
TL;DR: Incremental learning, online learning, and data stream learning are terms commonly associated with learning algorithms that update their models given a continuous influx of data without performing any act of reinforcement learning.
261
Spiking Neural Networks and online learning: An overview and perspectives
TL;DR: In this article, the authors present a comprehensive overview of the use of Spiking Neural Networks for online learning in non-stationary data streams and propose a new algorithm to adapt to these changes as fast as possible, while maintaining good performance scores.
250
References
•Book
Principal Component Analysis
Ian T. Jolliffe
- 01 May 1986
TL;DR: In this article, the authors present a graphical representation of data using Principal Component Analysis (PCA) for time series and other non-independent data, as well as a generalization and adaptation of principal component analysis.
17.7K
Nearest neighbor pattern classification
Thomas M. Cover, Peter E. Hart
TL;DR: The nearest neighbor decision rule assigns to an unclassified sample point the classification of the nearest of a set of previously classified points, so it may be said that half the classification information in an infinite sample set is contained in the nearest neighbor.
Principal Component Analysis
I. Jolliffe
- 01 Oct 2002
TL;DR: This chapter discusses the properties of Population Principal Components, and the role of Principal Components in Regression Analysis, and discusses generalizations and Adaptations of Principal Component Analysis.
8.6K
Domain-adversarial training of neural networks
Yaroslav Ganin, Evgeniya Ustinova, Hana Ajakan, Pascal Germain, Hugo Larochelle, François Laviolette, Mario Marchand, Victor Lempitsky
TL;DR: In this article, a new representation learning approach for domain adaptation is proposed, in which data at training and test time come from similar but different distributions. Training promotes the emergence of features that are discriminative for the main learning task on the source domain but cannot discriminate between the training (source) and test (target) domains.
Domain-Adversarial Training of Neural Networks.
Yaroslav Ganin, Evgeniya Ustinova, Hana Ajakan, Pascal Germain, Hugo Larochelle, François Laviolette, Mario Marchand, Victor Lempitsky
- 01 Jan 2017
TL;DR: A new representation learning approach for domain adaptation, in which data at training and test time come from similar but different distributions. The approach can be implemented in almost any feed-forward model by augmenting it with a few standard layers and a new gradient reversal layer.
5.6K
Related Papers (5)
Geoff Hulten, Laurie Spencer, Pedro Domingos
- 26 Aug 2001
Albert Bifet, Ricard Gavaldà
- 01 Jan 2007
Pedro Domingos, Geoff Hulten
- 01 Aug 2000