Proceedings Article, DOI: 10.1109/ICCV.2013.330
Dynamic Structured Model Selection
David J. Weiss,Benjamin Sapp,Ben Taskar +2 more
- 01 Dec 2013
- pp 2656-2663
TL;DR: This work proposes a novel two-tier architecture that provides dynamic speed/accuracy trade-offs through a simple type of introspection, and establishes a new state-of-the-art in human pose estimation in video with an implementation that is roughly 23× faster than the previous standard implementation.
Abstract: In many cases, the predictive power of structured models for complex vision tasks is limited by a trade-off between the expressiveness and the computational tractability of the model. However, choosing this trade-off statically a priori is suboptimal, as images and videos in different settings vary tremendously in complexity. On the other hand, choosing the trade-off dynamically requires knowledge about the accuracy of different structured models on any given example. In this work, we propose a novel two-tier architecture that provides dynamic speed/accuracy trade-offs through a simple type of introspection. Our approach, which we call dynamic structured model selection (DMS), leverages typically intractable features in structured learning problems to automatically determine which of several models should be used at test time to maximize accuracy under a fixed budgetary constraint. We demonstrate DMS on two sequential modeling vision tasks, and we establish a new state-of-the-art in human pose estimation in video with an implementation that is roughly 23× faster than the previous standard implementation.
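The idea of choosing a model per example under a fixed budget can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes only two models (a cheap one and a richer one), a hypothetical selector that has already produced an estimated accuracy gain `est_gain[i]` for running the richer model on example `i`, and a simple greedy rule. The names `select_models`, `est_gain`, `cheap_cost`, and `rich_cost` are illustrative assumptions, not taken from the paper.

```python
from typing import List, Sequence, Tuple


def select_models(
    est_gain: Sequence[float],
    cheap_cost: float,
    rich_cost: float,
    budget: float,
) -> Tuple[List[int], float]:
    """Greedy two-model selection under a total cost budget (illustrative).

    est_gain[i] is a hypothetical selector's estimate of how much accuracy
    the richer model adds over the cheap model on example i.
    Returns (choice, spent), where choice[i] is 0 (cheap) or 1 (rich).
    """
    n = len(est_gain)
    choice = [0] * n
    spent = cheap_cost * n  # every example gets at least the cheap model
    if spent > budget:
        raise ValueError("budget cannot cover the cheap model on all examples")

    extra = rich_cost - cheap_cost  # added cost of upgrading one example
    # Upgrade the examples with the largest estimated gain first.
    for i in sorted(range(n), key=lambda k: est_gain[k], reverse=True):
        if est_gain[i] <= 0 or spent + extra > budget:
            break  # no further upgrade is useful or affordable
        choice[i] = 1
        spent += extra
    return choice, spent


if __name__ == "__main__":
    choice, spent = select_models([0.30, -0.05, 0.10, 0.02], 1.0, 4.0, 11.0)
    print(choice, spent)
```

In this toy run the budget covers the cheap model everywhere plus two upgrades, so the two examples with the largest estimated gains receive the richer model. A real selector would have to learn those gain estimates from data, which is the part the abstract attributes to introspection.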
Citations
Multiclass semantic video segmentation with object-level active inference
TL;DR: An efficient mean field inference algorithm is developed to jointly infer the supervoxel labels, object activations and their occlusion relations for a moderate number of object hypotheses in an object-augmented dense CRF in spatio-temporal domain.
Budget-Aware Deep Semantic Video Segmentation
Behrooz Mahasseni,Sinisa Todorovic,Alan Fern +2 more
- 01 Jul 2017
TL;DR: This work formalizes the frame selection as a Markov Decision Process, and specifies a Long Short-Term Memory network to model a policy for selecting the frames, and develops a policy-gradient reinforcement-learning approach for approximating the gradient of the authors' non-decomposable and non-differentiable objective.
Anytime Recognition of Objects and Scenes
Sergey Karayev,Mario Fritz,Trevor Darrell +2 more
- 23 Jun 2014
TL;DR: A method for learning dynamic policies to optimize Anytime performance in visual architectures. It can incorporate a semantic back-off strategy that gives maximally specific predictions for a desired level of accuracy, which provides a new view on the time course of human visual perception.
•Proceedings Article
Adaptive Classification for Prediction Under a Budget
Feng Nan,Venkatesh Saligrama +1 more
- 01 May 2017
TL;DR: In this article, the authors propose an adaptive approximation approach for test-time resource-constrained prediction motivated by Mobile, IoT, health, security and other applications, where constraints in the form of computation, communication, latency and feature acquisition costs arise.
•Proceedings Article
Learning Adaptive Value of Information for Structured Prediction
David J. Weiss,Ben Taskar +1 more
- 05 Dec 2013
TL;DR: This work addresses the key challenge of learning to control fine-grained feature extraction adaptively by proposing an architecture that uses a rich feedback loop between extraction and prediction, and demonstrates significant speedups over state-of-the-art methods on two challenging datasets.
References
Robust Real-Time Face Detection
Paul A. Viola,Michael Jones +1 more
TL;DR: This paper describes a face detection framework that processes images extremely rapidly while achieving high detection rates; the detector runs at about 15 frames per second.
Robust real-time face detection
Paul A. Viola,Michael Jones +1 more
- 07 Jul 2001
TL;DR: A new image representation called the “Integral Image” is introduced which allows the features used by the detector to be computed very quickly and a method for combining classifiers in a “cascade” which allows background regions of the image to be quickly discarded while spending more computation on promising face-like regions.
Efficient Graph-Based Image Segmentation
TL;DR: An efficient segmentation algorithm is developed based on a predicate for measuring the evidence for a boundary between two regions using a graph-based representation of the image and it is shown that although this algorithm makes greedy decisions it produces segmentations that satisfy global properties.
Articulated pose estimation with flexible mixtures-of-parts
Yi Yang,Deva Ramanan +1 more
- 20 Jun 2011
TL;DR: A general, flexible mixture model for capturing contextual co-occurrence relations between parts, augmenting standard spring models that encode spatial relations, and it is shown that such relations can capture notions of local rigidity.
Beyond pixels: exploring new representations and applications for motion analysis
William T. Freeman,Edward H. Adelson,Ce Liu +2 more
- 01 Jan 2009
TL;DR: This thesis builds a human-assisted motion annotation system to obtain ground-truth motion, missing in the literature, for natural video sequences, and proposes SIFT flow, a new framework for image parsing by transferring the metadata information from the images in a large database to an unknown query image.