Open Access • Posted Content
From Coarse to Fine: Robust Hierarchical Localization at Large Scale
TL;DR: This paper proposes HF-Net, a hierarchical localization approach based on a monolithic CNN that simultaneously predicts local features and global descriptors for accurate 6-DoF localization. It achieves remarkable localization robustness across large variations of appearance and sets a new state-of-the-art on two challenging benchmarks for large-scale localization.
Abstract: Robust and accurate visual localization is a fundamental capability for numerous applications, such as autonomous driving, mobile robotics, or augmented reality. It remains, however, a challenging task, particularly for large-scale environments and in the presence of significant appearance changes. State-of-the-art methods not only struggle with such scenarios, but are often too resource intensive for certain real-time applications. In this paper we propose HF-Net, a hierarchical localization approach based on a monolithic CNN that simultaneously predicts local features and global descriptors for accurate 6-DoF localization. We exploit the coarse-to-fine localization paradigm: we first perform a global retrieval to obtain location hypotheses and only later match local features within those candidate places. This hierarchical approach incurs significant runtime savings and makes our system suitable for real-time operation. By leveraging learned descriptors, our method achieves remarkable localization robustness across large variations of appearance and sets a new state-of-the-art on two challenging benchmarks for large-scale localization.
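The coarse-to-fine paradigm described in the abstract can be illustrated with a minimal sketch. This is not HF-Net's implementation: the function names (`retrieve_candidates`, `match_local`, `localize_coarse_to_fine`), the descriptor sizes, the ratio-test threshold, and the synthetic data are all assumptions made for illustration. The sketch assumes L2-normalized global descriptors and stops at local feature matching; a full system would go on to estimate the 6-DoF pose from the matches, which is omitted here.

```python
import numpy as np


def retrieve_candidates(query_global, db_globals, k=2):
    """Coarse step: rank database images by global-descriptor similarity."""
    # Descriptors are assumed L2-normalized, so the dot product is the cosine similarity.
    scores = db_globals @ query_global
    return np.argsort(-scores)[:k]


def match_local(query_desc, db_desc, ratio=0.9):
    """Fine step: nearest-neighbor matching of local descriptors with a ratio test."""
    # Pairwise Euclidean distances, shape (num_query, num_db).
    dists = np.linalg.norm(query_desc[:, None, :] - db_desc[None, :, :], axis=2)
    order = np.argsort(dists, axis=1)
    best, second = order[:, 0], order[:, 1]
    rows = np.arange(len(query_desc))
    # Keep a match only if it is clearly better than the runner-up.
    keep = dists[rows, best] < ratio * dists[rows, second]
    return [(int(i), int(best[i])) for i in rows[keep]]


def localize_coarse_to_fine(query_global, query_local, db_globals, db_locals, k=2):
    """Retrieve k candidate places, then match local features only within them."""
    candidates = retrieve_candidates(query_global, db_globals, k)
    matches = {int(c): match_local(query_local, db_locals[int(c)]) for c in candidates}
    return candidates, matches


# Synthetic example (hypothetical data): the query is a noisy copy of database image 3.
rng = np.random.default_rng(0)
db_globals = rng.normal(size=(5, 16))
db_globals /= np.linalg.norm(db_globals, axis=1, keepdims=True)
db_locals = [rng.normal(size=(20, 8)) for _ in range(5)]

query_global = db_globals[3] + 0.01 * rng.normal(size=16)
query_global /= np.linalg.norm(query_global)
query_local = db_locals[3] + 0.01 * rng.normal(size=(20, 8))

candidates, matches = localize_coarse_to_fine(
    query_global, query_local, db_globals, db_locals, k=2
)
print("candidate images:", candidates)
print("matches per candidate:", {c: len(m) for c, m in matches.items()})
```

The runtime saving comes from the second step touching only `k` database images instead of all of them; the cost of local matching then scales with `k` rather than with the size of the map.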
Citations
Improving Long Term Accuracy of Visual Localization in Urban Environment
N. Niparnan, S. Sattaratnamai, A. Sudsang +2 more
Automatic orientation of historical terrestrial images in mountainous terrain using the visible horizon
TL;DR: In this paper, the horizon is used to estimate both the interior and exterior orientation of historical terrestrial images, with an accuracy comparable to manually oriented images, using salient points along the horizon.
Autonomous Robot Relocalization with Motion Constraints
Robert Konievic, Victor Domșa, Levente Tamás +2 more
- 16 May 2024
TL;DR: Relocalization of an autonomous robot with motion constraints, based on a motion-constrained camera and Bayesian estimation.
An information-theoretic approach to unsupervised keypoint representation learning
TL;DR: It is argued that local entropy of pixel neighborhoods and its evolution in a video stream is a valuable intrinsic supervisory signal for learning to attend to salient features and therefore, abstract visual features into a concise representation of keypoints that serve as dynamic information transporters.
Query Neural Surface Description for Camera Pose Refinement
Hugo Germain, Daniel DeTone, Geoffrey Pascoe, Timo Schmidt, David Novotny, Richard Newcombe, Chris Sweeney, Richard Szeliski, Vasileios Balntas
TL;DR: The Feature Query Network is introduced, a ray-based descriptor regressor that can be used to query descriptors at known 3D locations under novel viewpoints and is able to model viewpoint-dependency of high-dimensional keypoint descriptors and bring significant relative improvements to structure-based visual localization baselines.
References
ImageNet: A large-scale hierarchical image database
Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, Li Fei-Fei
- 20 Jun 2009
TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Distinctive Image Features from Scale-Invariant Keypoints
TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography
TL;DR: New results are derived on the minimum number of landmarks needed to obtain a solution, and algorithms are presented for computing these minimum-landmark solutions in closed form that provide the basis for an automatic system that can solve the Location Determination Problem under difficult viewing.
Distilling the Knowledge in a Neural Network
TL;DR: This work shows that it can significantly improve the acoustic model of a heavily used commercial system by distilling the knowledge in an ensemble of models into a single model and introduces a new type of ensemble composed of one or more full models and many specialist models which learn to distinguish fine-grained classes that the full models confuse.
MobileNetV2: Inverted Residuals and Linear Bottlenecks
Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, Liang-Chieh Chen
- 18 Jun 2018
TL;DR: MobileNetV2 is based on an inverted residual structure in which the shortcut connections are between the thin bottleneck layers, and the intermediate expansion layer uses lightweight depthwise convolutions to filter features as a source of non-linearity.