Open Access Proceedings Article
Detecting Significant Multidimensional Spatial Clusters
Daniel B. Neill, Andrew W. Moore, Francisco Pereira, Tom M. Mitchell
- 01 Dec 2004
- Vol. 17, pp 969-976
TL;DR: A novel fast spatial scan algorithm is introduced, generalizing the 2D scan algorithm of (Neill and Moore, 2004) to arbitrary dimensions, which allows it to find spatial clusters up to 1400x faster than the naive spatial scan, without any loss of accuracy.
Abstract: Assume a uniform, multidimensional grid of bivariate data, where each cell of the grid has a count c_i and a baseline b_i. Our goal is to find spatial regions (d-dimensional rectangles) where the c_i are significantly higher than expected given b_i. We focus on two applications: detection of clusters of disease cases from epidemiological data (emergency department visits, over-the-counter drug sales), and discovery of regions of increased brain activity corresponding to given cognitive tasks (from fMRI data). Each of these problems can be solved using a spatial scan statistic (Kulldorff, 1997), where we compute the maximum of a likelihood ratio statistic over all spatial regions, and find the significance of this region by randomization. However, computing the scan statistic for all spatial regions is generally computationally infeasible, so we introduce a novel fast spatial scan algorithm, generalizing the 2D scan algorithm of (Neill and Moore, 2004) to arbitrary dimensions. Our new multidimensional multiresolution algorithm allows us to find spatial clusters up to 1400x faster than the naive spatial scan, without any loss of accuracy.
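The naive baseline that the abstract compares against can be sketched as follows. This is only an illustrative 2D version under stated assumptions, not the paper's fast multiresolution algorithm: the score is one common Poisson form of Kulldorff's likelihood ratio, the null model draws each cell from a Poisson with mean proportional to its baseline, and the function names (`log_lr`, `naive_scan`, `scan_p_value`) and the half-open rectangle convention are hypothetical choices made here.

```python
import numpy as np

def log_lr(c, b, c_tot, b_tot):
    """Log-likelihood ratio of a region with count c and baseline b
    (assumed Kulldorff-style Poisson form; 0 if the rate inside is not higher)."""
    c_out, b_out = c_tot - c, b_tot - b
    if b <= 0 or b_out <= 0 or c / b <= c_out / b_out:
        return 0.0
    def term(x, y):
        # x * log(x / y), with the convention 0 * log(0) = 0
        return x * np.log(x / y) if x > 0 else 0.0
    return term(c, b) + term(c_out, b_out) - term(c_tot, b_tot)

def prefix_sums(a):
    """Zero-padded 2D prefix sums, so any rectangle sum costs O(1)."""
    p = np.zeros((a.shape[0] + 1, a.shape[1] + 1))
    p[1:, 1:] = a.cumsum(axis=0).cumsum(axis=1)
    return p

def naive_scan(counts, baselines):
    """Exhaustively score every axis-aligned rectangle of a 2D grid.
    Returns (best score, (x0, x1, y0, y1)) with half-open bounds."""
    pc, pb = prefix_sums(counts), prefix_sums(baselines)
    n, m = counts.shape
    c_tot, b_tot = pc[n, m], pb[n, m]
    best, best_rect = 0.0, None
    for x0 in range(n):
        for x1 in range(x0 + 1, n + 1):
            for y0 in range(m):
                for y1 in range(y0 + 1, m + 1):
                    c = pc[x1, y1] - pc[x0, y1] - pc[x1, y0] + pc[x0, y0]
                    b = pb[x1, y1] - pb[x0, y1] - pb[x1, y0] + pb[x0, y0]
                    score = log_lr(c, b, c_tot, b_tot)
                    if score > best:
                        best, best_rect = score, (x0, x1, y0, y1)
    return best, best_rect

def scan_p_value(counts, baselines, replicas=19, seed=0):
    """Randomization test: rescan replica grids drawn under the null
    (counts proportional to baselines) and rank the observed maximum."""
    rng = np.random.default_rng(seed)
    observed, rect = naive_scan(counts, baselines)
    expected = baselines * counts.sum() / baselines.sum()
    beats = sum(
        naive_scan(rng.poisson(expected), baselines)[0] >= observed
        for _ in range(replicas)
    )
    return observed, rect, (beats + 1) / (replicas + 1)
```

Even with constant-time rectangle sums, the scan visits on the order of N^4 rectangles on an N x N grid and repeats that work for every replica, which is the cost the abstract describes as generally infeasible in higher dimensions.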
Citations
Identifying patterns in spatial information: a survey of methods
TL;DR: This paper explores the emerging field of spatial data mining, focusing on different methods to extract patterns from spatial information, and concludes with a look at future research needs.
203
Expectation-based scan statistics for monitoring spatial time series data
TL;DR: This work evaluates several variants of the expectation-based scan statistic on the disease surveillance task (using synthetic outbreaks injected into real-world hospital Emergency Department data), and draws conclusions about which models and methods are most appropriate for which surveillance tasks.
99
An empirical Bayes approach to detect anomalies in dynamic multidimensional arrays
D. Agarwal
- 27 Nov 2005
TL;DR: An empirical Bayes method is used which works by fitting a two component Gaussian mixture to deviations at current time to suppress deviations that are merely the consequence of sharp changes in the marginal distributions.
61
•Proceedings Article
Hierarchical Mixture of Classification Experts Uncovers Interactions between Brain Regions
Bangpeng Yao, Dirk B. Walther, Diane M. Beck, Li Fei-Fei
- 07 Dec 2009
TL;DR: This paper proposes a Hidden Conditional Random Field (HCRF) framework, where the classifier of one region of interest (ROI) makes predictions based not only on its own voxels but also on the predictions from the ROIs it connects to, and proposes a structural learning method in the HCRF framework to automatically uncover the connections between ROIs.
DISCOV: A Framework for Discovering Objects in Video
David Liu, Tsuhan Chen
TL;DR: This paper presents a probabilistic framework for discovering objects in video that provides robustness to object variations in scale, lighting and viewpoint, and presents applications of video object discovery to video content analysis problems such as video segmentation and threading.
33
References
Multidimensional binary search trees used for associative searching
TL;DR: The multidimensional binary search tree (or k-d tree) as a data structure for storage of information to be retrieved by associative searches is developed and it is shown to be quite efficient in its storage requirements.
8.2K
A spatial scan statistic
TL;DR: In this article, a spatial scan statistic for the detection of clusters in a multi-dimensional point process is proposed, where the area of the scanning window is allowed to vary, and the baseline process may be any inhomogeneous Poisson process or Bernoulli process with intensity proportional to some known function.
3.8K
Automatic subspace clustering of high dimensional data for data mining applications
Rakesh Agrawal, Johannes Gehrke, Dimitrios Gunopulos, Prabhakar Raghavan
- 01 Jun 1998
TL;DR: CLIQUE is presented, a clustering algorithm that satisfies the requirements of data mining applications, including the ability to find clusters embedded in subspaces of high dimensional data, scalability, end-user comprehensibility of the results, non-presumption of any canonical data distribution, and insensitivity to the order of input records.
Quad trees: a data structure for retrieval on composite keys
TL;DR: An optimized tree is defined and an algorithm to accomplish optimization in n log n time is presented, so that searching is guaranteed to be fast in optimized trees.
2.2K
•Proceedings Article
STING: A Statistical Information Grid Approach to Spatial Data Mining
Wei Wang, Jiong Yang, Richard R. Muntz
- 25 Aug 1997
TL;DR: The idea is to capture statistical information associated with spatial cells in such a manner that whole classes of queries and clustering problems can be answered without recourse to the individual objects.