Open Access
Statistical mining in data streams
Edward Y. Chang,Ankur Jain +1 more
- 01 Jan 2006
- pp 293-293
4
TL;DR: This dissertation focuses on research issues related to adaptive stream resource conservation and online mining in a DSMS, and proposes adaptive clustering solutions that use the kernel trick to capture non-linear relations in the streaming data.
Abstract: Recent years have seen a steady rise of a new class of data management systems called Data Stream Management Systems (DSMS). These systems manage rapid, high-volume data streams with transient relations instead of static data with persistent relations. Data streams are common to applications such as network traffic and transaction monitoring systems, click-stream processors, industrial process control, and sensor networks. A DSMS operates on these continuous and time-varying data streams to facilitate on-the-fly query answering, and to support data acquisition, monitoring and analysis.
In this dissertation, we present statistical stream mining solutions for effective online processing of streaming data. We focus on research issues related to adaptive stream resource conservation and online mining in a DSMS. We have developed statistical linear and non-linear filtering techniques based on the Kalman Filter to capture temporal correlations in the streaming data. Such correlations help in stream resource conservation. We also propose techniques that capture spatial correlations between the streaming sources, which further improve resource conservation and make it possible to answer group queries efficiently.
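To make the idea of filter-based resource conservation concrete, here is a minimal sketch and not the dissertation's actual algorithm. It assumes a scalar random-walk state model and a scheme in which the source and the server run identical Kalman filters, so a reading is transmitted only when the shared prediction misses it by more than a tolerance. The names `ScalarKalman`, `suppress_stream`, and `tolerance`, and the noise parameters `q` and `r`, are hypothetical.

```python
class ScalarKalman:
    """1-D Kalman filter with a random-walk state model (an assumed model)."""

    def __init__(self, x0=0.0, p0=1.0, q=0.01, r=0.1):
        self.x = x0  # state estimate
        self.p = p0  # variance of the estimate
        self.q = q   # process-noise variance (assumed value)
        self.r = r   # measurement-noise variance (assumed value)

    def predict(self):
        # Random walk: the estimate carries forward while uncertainty grows.
        self.p += self.q
        return self.x

    def update(self, z):
        # Standard scalar Kalman correction with measurement z.
        k = self.p / (self.p + self.r)  # Kalman gain
        self.x += k * (z - self.x)
        self.p *= (1.0 - k)


def suppress_stream(readings, tolerance=0.5):
    """Return (server_view, sent): what the server sees and how many
    readings were actually transmitted."""
    source_kf = ScalarKalman()
    server_kf = ScalarKalman()  # mirror of the source filter
    server_view = []
    sent = 0
    for z in readings:
        predicted = source_kf.predict()
        server_prediction = server_kf.predict()
        if abs(z - predicted) > tolerance:
            # Prediction is too far off: transmit and correct both filters.
            source_kf.update(z)
            server_kf.update(z)
            server_view.append(z)
            sent += 1
        else:
            # Prediction is good enough: the server uses it, nothing is sent.
            server_view.append(server_prediction)
    return server_view, sent


if __name__ == "__main__":
    data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.0, 5.2]
    view, sent = suppress_stream(data, tolerance=0.5)
    print(f"transmitted {sent} of {len(data)} readings")
```

Because both filters see exactly the same updates, they stay in lockstep, and every value the server holds is within `tolerance` of the true reading. The tolerance trades accuracy against bandwidth.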
In addition to resource management and query processing, a DSMS needs to address issues related to online stream mining. Once the data stream arrives at a central server, effective mining techniques are necessary for stream analysis, before the data can be discarded. Since a stream continuously evolves with time, stream mining techniques need to be adaptive and should operate under a given memory constraint. We propose adaptive clustering solutions that use the kernel trick to capture non-linear relations in the streaming data. We also present OCODDS, a change-detection approach that can track evolutionary changes in the stream in both linear and non-linear settings. Finally, we present our techniques for effective acquisition and processing of data streams common to video sensor networks.
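As a rough illustration of how the kernel trick lets a clustering step capture non-linear relations, the sketch below computes the squared distance between a point and a cluster mean in feature space using only kernel evaluations. It is not the dissertation's clustering method. The RBF kernel, the `gamma` value, and the function names are assumptions chosen for the example.

```python
import math


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length tuples."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)


def kernel_distance_sq(x, cluster, kernel=rbf_kernel):
    """Squared feature-space distance from x to the mean of `cluster`.

    Expands ||phi(x) - mean(phi(c))||^2 into kernel evaluations only,
    so the feature map phi is never computed explicitly.
    """
    n = len(cluster)
    self_term = kernel(x, x)
    cross_term = sum(kernel(x, c) for c in cluster) / n
    within_term = sum(kernel(ci, cj) for ci in cluster for cj in cluster) / (n * n)
    return self_term - 2.0 * cross_term + within_term


def assign(x, clusters, kernel=rbf_kernel):
    """Index of the cluster whose feature-space mean is closest to x."""
    distances = [kernel_distance_sq(x, c, kernel) for c in clusters]
    return distances.index(min(distances))


if __name__ == "__main__":
    clusters = [[(0.0, 0.0), (0.1, 0.0)], [(3.0, 3.0), (3.1, 3.0)]]
    print(assign((0.05, 0.1), clusters))
    print(assign((2.9, 3.0), clusters))
```

With a plain dot-product kernel the same formula reduces to the ordinary squared Euclidean distance to the centroid. A streaming variant would also need to bound the number of stored points per cluster to respect a memory constraint, which this sketch does not attempt.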
Citations
An improved data stream summary: The Count-Min Sketch and its applications
Graham Cormode,S. Muthukrishnan +1 more
- 01 Dec 2004
TL;DR: In this paper, the authors introduce a sublinear-space data structure called the Count-Min sketch for summarizing data streams. It allows fundamental queries in data stream summarization, such as point, range, and inner product queries, to be answered approximately and very quickly; in addition, it can be applied to several important problems in data streams, such as finding quantiles and frequent items.
65
Patent
Dynamic prediction modeling
Chen Jie,Liu Weicheng +1 more
- 30 Jun 2020
TL;DR: In this paper, a method for dynamic predictive modeling is described, which includes creating a joined pair including a snapshot time and the forecast time, stored in the storage device, and a subset of data associated with the joined pair is selected from data stored in storage device.
Streaming Decision Tree for Continuity Data with Changed Pattern
TL;DR: Wang et al. as discussed by the authors introduced Streaming Decision Tree (SDT) for analyzing data with continuity, large scale, and changed patterns, which is mainly used for pattern extracting and information discovery from collected data.
Streaming Decision Tree for Continuity Data with Changed
Tae-Bok Yoon,Hak-Joon Sim,Jee-Hyong Lee,YoungMee Choi +3 more
- 01 Jan 2010
TL;DR: This paper introduces the Streaming Decision Tree (SDT) for analyzing data with continuity, large scale, and changed patterns, applies it to time series data, and reports reasonable results.
References
•Book
Elements of information theory
Thomas M. Cover,Joy A. Thomas +1 more
- 01 Jan 1991
TL;DR: The author examines the role of entropy, inequality, and randomness in the design of codes and the construction of codes in the rapidly changing environment.
Statistical learning theory
Vladimir Vapnik
- 01 Jan 1998
TL;DR: Presenting a method for determining the necessary and sufficient conditions for consistency of learning process, the author covers function estimates from small data pools, applying these estimations to real-life problems, and much more.
30.4K
A New Approach to Linear Filtering and Prediction Problems
Tamer Basar
- 01 Jan 2001
TL;DR: In this paper, the classical filtering and prediction problem is re-examined using the Bode-Shannon representation of random processes and the "state-transition" method of analysis of dynamic systems.
22.7K
•Book
Principal Component Analysis
Ian T. Jolliffe
- 01 May 1986
TL;DR: In this article, the authors present a graphical representation of data using Principal Component Analysis (PCA) for time series and other non-independent data, as well as a generalization and adaptation of principal component analysis.
17.7K