Open Access • Posted Content
From Coarse to Fine: Robust Hierarchical Localization at Large Scale
TL;DR: HF-Net is a hierarchical localization approach based on a monolithic CNN that simultaneously predicts local features and global descriptors for accurate 6-DoF localization. It achieves remarkable localization robustness across large variations of appearance and sets a new state of the art on two challenging benchmarks for large-scale localization.
Abstract: Robust and accurate visual localization is a fundamental capability for numerous applications, such as autonomous driving, mobile robotics, or augmented reality. It remains, however, a challenging task, particularly for large-scale environments and in the presence of significant appearance changes. State-of-the-art methods not only struggle with such scenarios, but are often too resource-intensive for certain real-time applications. In this paper, we propose HF-Net, a hierarchical localization approach based on a monolithic CNN that simultaneously predicts local features and global descriptors for accurate 6-DoF localization. We exploit the coarse-to-fine localization paradigm: we first perform a global retrieval to obtain location hypotheses and only later match local features within those candidate places. This hierarchical approach incurs significant runtime savings and makes our system suitable for real-time operation. By leveraging learned descriptors, our method achieves remarkable localization robustness across large variations of appearance and sets a new state of the art on two challenging benchmarks for large-scale localization.
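The coarse-to-fine paradigm described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`retrieve_candidates`, `match_local`, `coarse_to_fine`), the cosine-similarity retrieval, the ratio-test matching, and the default values of `k` and `ratio` are all assumptions chosen for illustration. The sketch assumes that global and local descriptors have already been computed for the query and for every database image.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def retrieve_candidates(query_global, db_globals, k=3):
    """Coarse step: indices of the k database images most similar to the query.

    query_global: (D,) global descriptor of the query image.
    db_globals:   (N, D) global descriptors of the database images.
    """
    sims = l2_normalize(db_globals) @ l2_normalize(query_global)
    k = min(k, len(sims))
    return np.argsort(-sims)[:k]


def match_local(query_desc, db_desc, ratio=0.9):
    """Fine step: nearest-neighbour matching with a ratio test (assumed here).

    query_desc: (Q, D) local descriptors of the query image.
    db_desc:    (M, D) local descriptors of one candidate image.
    Returns a list of (query_index, db_index) pairs.
    """
    dists = np.linalg.norm(query_desc[:, None, :] - db_desc[None, :, :], axis=2)
    matches = []
    for qi, row in enumerate(dists):
        order = np.argsort(row)
        # Keep a match only if the best neighbour is clearly better than the second.
        if len(order) >= 2 and row[order[0]] < ratio * row[order[1]]:
            matches.append((qi, int(order[0])))
    return matches


def coarse_to_fine(query_global, query_desc, db_globals, db_descs, k=3):
    """Match local features only inside the k retrieved candidate images."""
    candidates = retrieve_candidates(query_global, db_globals, k)
    return {int(i): match_local(query_desc, db_descs[int(i)]) for i in candidates}
```

Because local matching runs only against the `k` retrieved candidates rather than the whole database, its cost grows with `k` instead of with the map size, which is the source of the runtime savings the abstract mentions. Estimating the 6-DoF pose from the resulting matches is not shown here.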
Citations
•Proceedings Article
Object-Augmented RGB-D SLAM for Wide-Disparity Relocalisation.
Yuhang Ming,Xingrui Yang,Andrew Calway +2 more
- 01 Jan 2021
TL;DR: This work proposes a novel object-augmented RGB-D SLAM system that is capable of constructing a consistent object map and performing relocalisation based on centroids of objects in the map, significantly outperforming two appearance-based methods.
DeepMEL: Compiling Visual Multi-Experience Localization into a Deep Neural Network
Mona Gridseth,Timothy D. Barfoot +1 more
- 01 May 2020
TL;DR: This paper uses multi-experience VT&R together with two datasets of outdoor driving on two separate paths spanning different times of day, weather, and seasons to teach a deep neural network to predict relative pose for visual odometry (VO) and for localization with respect to a path.
9
Spatio-Temporal Graph Localization Networks for Image-based Navigation
Takahiro Niwa,Shun Taguchi,Noriaki Hirose +2 more
- 28 Apr 2022
TL;DR: This work proposes a learning-based localization method that simultaneously utilizes the spatial consistency from topological maps and the temporal consistency from time-series images captured by the robot to perform accurate localization.
8
Retrieval and Localization with Observation Constraints
Yuhao Zhou,Huanhuan Fan,Shuang Gao,Yuchen Yang,Xudong Zhang,Jijunnan Li,Yandong Guo +6 more
- 30 May 2021
TL;DR: This work proposes an integrated visual re-localization method called RLOCS that combines image retrieval, semantic consistency, and geometry verification to achieve accurate estimations.
8
Semi-Dense Feature Matching With Transformers and its Applications in Multiple-View Geometry
TL;DR: LoFTR uses self- and cross-attention layers in a Transformer to obtain feature descriptors that are conditioned on both images, which enables the method to produce dense matches in low-texture areas, where feature detectors usually struggle to produce repeatable interest points.
8
References
ImageNet: A large-scale hierarchical image database
Jia Deng,Wei Dong,Richard Socher,Li-Jia Li,Kai Li,Li Fei-Fei +5 more
- 20 Jun 2009
TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Distinctive Image Features from Scale-Invariant Keypoints
TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography
TL;DR: New results are derived on the minimum number of landmarks needed to obtain a solution, and algorithms are presented for computing these minimum-landmark solutions in closed form that provide the basis for an automatic system that can solve the Location Determination Problem under difficult viewing.
•Posted Content
Distilling the Knowledge in a Neural Network
TL;DR: This work shows that it can significantly improve the acoustic model of a heavily used commercial system by distilling the knowledge in an ensemble of models into a single model and introduces a new type of ensemble composed of one or more full models and many specialist models which learn to distinguish fine-grained classes that the full models confuse.
21.2K
MobileNetV2: Inverted Residuals and Linear Bottlenecks
Mark Sandler,Andrew Howard,Menglong Zhu,Andrey Zhmoginov,Liang-Chieh Chen +4 more
- 18 Jun 2018
TL;DR: MobileNetV2 is based on an inverted residual structure in which the shortcut connections are between the thin bottleneck layers, and the intermediate expansion layer uses lightweight depthwise convolutions to filter features as a source of non-linearity.
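The inverted residual structure summarized above can be sketched in plain NumPy. This is a simplified illustration, not the MobileNetV2 reference code: it assumes stride 1 and equal input and output channel counts (so the shortcut applies), omits batch normalization, and uses hypothetical function and parameter names (`inverted_residual`, `w_expand`, `k_depthwise`, `w_project`).

```python
import numpy as np


def relu6(x):
    """ReLU capped at 6."""
    return np.clip(x, 0.0, 6.0)


def pointwise_conv(x, w):
    """1x1 convolution. x: (H, W, C_in), w: (C_in, C_out)."""
    return x @ w


def depthwise_conv3x3(x, k):
    """3x3 depthwise convolution, zero padding, stride 1.

    x: (H, W, C), k: (3, 3, C). Each channel is filtered independently.
    """
    h, w, _ = x.shape
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros_like(x)
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + h, dx:dx + w, :] * k[dy, dx, :]
    return out


def inverted_residual(x, w_expand, k_depthwise, w_project):
    """Thin bottleneck -> wide expansion -> depthwise filter -> thin bottleneck."""
    hidden = relu6(pointwise_conv(x, w_expand))             # expand channels
    hidden = relu6(depthwise_conv3x3(hidden, k_depthwise))  # lightweight filtering
    projected = pointwise_conv(hidden, w_project)           # linear projection, no activation
    return x + projected                                    # shortcut between bottlenecks
```

The shortcut connects the two thin bottleneck tensors, while the non-linearities act only inside the wider expanded representation.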