Journal Article, DOI: 10.1109/ACCESS.2023.3284463
Enhancing Conventional Geometry-Based Visual Odometry Pipeline Through Integration of Deep Descriptors
Muhammad Bilal, Muhammad Shehzad Hanif, Khalid Munawar, Ubaid M. Al-Saggaf +3 more
- Vol. 11, pp 58294-58307
TL;DR: The authors integrate deep descriptors into a traditional geometry-based visual odometry (VO) pipeline to improve the correspondence between image points for tracking, and propose a simple stereo VO pipeline inspired by popular techniques found in the literature.
Abstract: Geometry-based Visual Odometry (VO) techniques are well established in computer vision and robotics. They use methods from multi-view geometry to estimate camera motion from visual data obtained from one or more cameras. Tracking the camera motion precisely between different views depends on correctly estimating correspondences between salient points of the views. In practice, geometry-based methods are quite effective but do not perform well in challenging cases, where tracking fails because of abrupt motion, occlusions, or textureless and low-light scenes. In contrast, end-to-end learning from visual data using deep neural networks is an emerging area of research and handles such challenging cases successfully. However, these methods are computationally expensive and do not outperform their geometry-based counterparts in conditions favorable to the latter. Considering these facts, our goal in this work is to integrate deep descriptors to improve the correspondence between image points for tracking in a traditional geometry-based VO pipeline. We propose a simple stereo VO pipeline inspired by popular techniques found in the literature. Two conventional and four deep descriptors have been used in our experiments, conducted on various image sequences of the challenging KITTI benchmark dataset. We have determined empirically that deep descriptors can effectively reduce drift in the VO estimates and produce better camera trajectories. The experimental results on the KITTI dataset demonstrate that our VO method performs on par with the state-of-the-art works reported in the literature.
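As a rough illustration of what estimating correspondences between image points from descriptors can involve, the minimal sketch below matches two sets of descriptors by nearest-neighbor search, filters ambiguous matches with a ratio test, and keeps only mutually consistent pairs. It is not the authors' pipeline: the function names, the 0.8 ratio threshold, and the toy 3-dimensional descriptors are assumptions made purely for illustration, and a real system would typically use a vision library and much higher-dimensional descriptors.

```python
import math
from typing import List, Sequence, Tuple

Descriptor = Sequence[float]


def l2(a: Descriptor, b: Descriptor) -> float:
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def ratio_matches(desc_a: List[Descriptor], desc_b: List[Descriptor],
                  ratio: float = 0.8) -> List[Tuple[int, int]]:
    """Match each descriptor in desc_a to its nearest neighbor in desc_b.

    A match is kept only if the best distance is clearly smaller than the
    second-best distance (ratio test); the 0.8 default is an assumption.
    """
    matches = []
    for i, da in enumerate(desc_a):
        dists = sorted((l2(da, db), j) for j, db in enumerate(desc_b))
        if len(dists) < 2:
            continue  # the ratio test needs at least two candidates
        (best, j_best), (second, _) = dists[0], dists[1]
        if best < ratio * second:
            matches.append((i, j_best))
    return matches


def mutual_matches(desc_a: List[Descriptor], desc_b: List[Descriptor],
                   ratio: float = 0.8) -> List[Tuple[int, int]]:
    """Keep only pairs that pass the ratio test in both directions."""
    forward = set(ratio_matches(desc_a, desc_b, ratio))
    backward = {(i, j) for j, i in ratio_matches(desc_b, desc_a, ratio)}
    return sorted(forward & backward)


# Toy descriptors for a previous and a current frame (hypothetical values).
prev_desc = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.5, 0.5, 0.0]]
curr_desc = [[0.0, 0.9, 0.1], [0.9, 0.1, 0.0],
             [0.5, 0.5, 0.05], [0.5, 0.5, -0.05]]

if __name__ == "__main__":
    print(mutual_matches(prev_desc, curr_desc))
```

In this toy data the third descriptor of the previous frame has two equally close candidates in the current frame, so the ratio test discards it rather than risk a wrong correspondence. More distinctive descriptors leave fewer such ambiguous cases, which is the general motivation for swapping in stronger descriptors.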