TL;DR: In this article, the authors explore a number of novel loss functions for learning camera pose which are based on geometry and scene reprojection error, and show how to automatically learn an optimal weighting to simultaneously regress position and orientation.
Abstract: Deep learning has been shown to be effective for robust and real-time monocular image relocalisation. In particular, PoseNet [22] is a deep convolutional neural network which learns to regress the 6-DOF camera pose from a single image. It learns to localize using high-level features and is robust to difficult lighting, motion blur and unknown camera intrinsics, where point-based SIFT registration fails. However, it was trained using a naive loss function, with hyper-parameters which require expensive tuning. In this paper, we give the problem a more fundamental theoretical treatment. We explore a number of novel loss functions for learning camera pose which are based on geometry and scene reprojection error. Additionally, we show how to automatically learn an optimal weighting to simultaneously regress position and orientation. By leveraging geometry, we demonstrate that our technique significantly improves PoseNet's performance across datasets ranging from indoor rooms to a small city.
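The abstract does not spell out how the weighting is learned. One formulation consistent with it, assumed here for illustration, attaches a learnable log-variance term to each error: the loss is `pos_err * exp(-s_x) + s_x + rot_err * exp(-s_q) + s_q`. The names `s_x`, `s_q`, `weighted_pose_loss` and `fit_weights` are hypothetical, and the errors are held fixed in this toy sketch, whereas in training they would change along with the network weights.

```python
import math


def weighted_pose_loss(pos_err, rot_err, s_x, s_q):
    """Combine position and orientation errors with learnable weights.

    s_x and s_q are hypothetical free parameters (log-variances). A large
    value down-weights the matching error term, and the additive s_x / s_q
    terms stop the optimiser from pushing both weights to zero.
    """
    return pos_err * math.exp(-s_x) + s_x + rot_err * math.exp(-s_q) + s_q


def weighted_pose_loss_grads(pos_err, rot_err, s_x, s_q):
    """Analytic gradients of the loss with respect to s_x and s_q."""
    return 1.0 - pos_err * math.exp(-s_x), 1.0 - rot_err * math.exp(-s_q)


def fit_weights(pos_err, rot_err, steps=2000, lr=0.05):
    """Plain gradient descent on s_x and s_q for fixed error values."""
    s_x, s_q = 0.0, 0.0
    for _ in range(steps):
        g_x, g_q = weighted_pose_loss_grads(pos_err, rot_err, s_x, s_q)
        s_x -= lr * g_x
        s_q -= lr * g_q
    return s_x, s_q
```

For fixed errors, the minimum lies at `s = log(err)`, so each term ends up scaled by the inverse of its own typical magnitude. That is the balancing behaviour a hand-tuned weight would otherwise have to provide.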
TL;DR: This paper addresses image-based surface reconstruction by computing the exact derivative of the reprojection error functional, which reveals the strong influence of contour generator movement on the reprojection error and allows rigorous minimization of that error via gradient descent surface evolution.
Abstract: This paper addresses the problem of image-based surface reconstruction. The main contribution is the computation of the exact derivative of the reprojection error functional. This allows its rigorous minimization via gradient descent surface evolution. The main difficulty has been to correctly take into account the visibility changes that occur when the surface moves. A geometric and analytical study of these changes is presented and used for the computation of the derivative. Our analysis shows the strong influence that the movement of the contour generators has on the reprojection error. As a consequence, during the proper minimization of the reprojection error, the contour generators of the surface are automatically moved to their correct location in the images. Therefore, current methods that add silhouette or apparent contour constraints to ensure this alignment can now be understood and justified by a single criterion: the reprojection error.
TL;DR: In this article, the authors provide an extensive review of existing models for large field-of-view cameras and propose the Double Sphere camera model, which is computationally inexpensive and has a closed-form inverse.
Abstract: Vision-based motion estimation and 3D reconstruction, which have numerous applications (e.g., autonomous driving, navigation systems for airborne devices and augmented reality), are receiving significant research attention. To increase accuracy and robustness, several researchers have recently demonstrated the benefit of using large field-of-view cameras for such applications. In this paper, we provide an extensive review of existing models for large field-of-view cameras. For each model we provide projection and unprojection functions and the subspace of points that result in a valid projection. Then, we propose the Double Sphere camera model, which fits large field-of-view lenses well, is computationally inexpensive and has a closed-form inverse. We evaluate the model using a calibration dataset with several different lenses and compare the models using the metrics that are relevant for Visual Odometry, i.e., reprojection error, as well as computation time for projection and unprojection functions and their Jacobians. We also provide qualitative results and discuss the performance of all models.
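The abstract gives no formulas. The sketch below shows what a projection function and its closed-form inverse can look like for a double-sphere model, using the commonly stated six-parameter form (`fx`, `fy`, `cx`, `cy`, `xi`, `alpha`). The function names are hypothetical, and the validity checks on the input point are left out to keep the example short.

```python
import math


def ds_project(point, fx, fy, cx, cy, xi, alpha):
    """Project a 3D point in the camera frame to pixel coordinates.

    Assumes the point lies inside the model's valid projection region;
    no check is performed here.
    """
    x, y, z = point
    d1 = math.sqrt(x * x + y * y + z * z)
    w = xi * d1 + z  # z after shifting by the first sphere offset
    d2 = math.sqrt(x * x + y * y + w * w)
    denom = alpha * d2 + (1.0 - alpha) * w
    return fx * x / denom + cx, fy * y / denom + cy


def ds_unproject(pixel, fx, fy, cx, cy, xi, alpha):
    """Closed-form inverse: map a pixel to a unit bearing vector."""
    u, v = pixel
    mx = (u - cx) / fx
    my = (v - cy) / fy
    r2 = mx * mx + my * my
    mz = (1.0 - alpha * alpha * r2) / (
        alpha * math.sqrt(1.0 - (2.0 * alpha - 1.0) * r2) + 1.0 - alpha
    )
    scale = (mz * xi + math.sqrt(mz * mz + (1.0 - xi * xi) * r2)) / (mz * mz + r2)
    return scale * mx, scale * my, scale * mz - xi
```

Projecting a point and unprojecting the resulting pixel should return the point's direction as a unit vector. The inverse needs only square roots, with no iterative solver, which is the property the abstract calls a closed-form inverse.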
TL;DR: The IBIS platform is the first open-source navigation system to provide a complete solution for AR visualization and has been used in the operating room for various types of surgery, including brain tumor resection, vascular neurosurgery, spine surgery and DBS electrode implantation.
Abstract: Navigation systems commonly used in neurosurgery suffer from two main drawbacks: (1) their accuracy degrades over the course of the operation and (2) they require the surgeon to mentally map images from the monitor to the patient. In this paper, we introduce the Intraoperative Brain Imaging System (IBIS), an open-source image-guided neurosurgery research platform that implements a novel workflow where navigation accuracy is improved using tracked intraoperative ultrasound (iUS) and the visualization of navigation information is facilitated through the use of augmented reality (AR). The IBIS platform allows a surgeon to capture tracked iUS images and use them to automatically update preoperative patient models and plans through fast GPU-based reconstruction and registration methods. Navigation, resection and iUS-based brain shift correction can all be performed using an AR view. IBIS has an intuitive graphical user interface for the calibration of a US probe, a surgical pointer as well as video devices used for AR (e.g., a surgical microscope). The components of IBIS have been validated in the laboratory and evaluated in the operating room. Image-to-patient registration accuracy is on the order of 3.72 ± 1.27 mm and can be improved with iUS to a median target registration error of 2.54 mm. The accuracy of the US probe calibration is between 0.49 and 0.82 mm. The average reprojection error of the AR system is 0.37 ± 0.19 mm. The system has been used in the operating room for various types of surgery, including brain tumor resection, vascular neurosurgery, spine surgery and DBS electrode implantation. The IBIS platform is a validated system that allows researchers to quickly bring the results of their work into the operating room for evaluation. It is the first open-source navigation system to provide a complete solution for AR visualization.
TL;DR: The proposed hand–eye calibration technique based on reprojection error minimization is implemented as a pose graph optimization problem, so that it can solve the estimation problem efficiently and robustly, and it can be easily extended for different projection models.
Abstract: This letter describes a novel hand–eye calibration technique based on reprojection error minimization. In contrast to traditional hand–eye calibration methods, the proposed method directly takes images of the calibration pattern and does not require explicitly estimating the camera pose for each input image. The proposed method is implemented as a pose graph optimization problem, so that it can solve the estimation problem efficiently and robustly, and it can be easily extended to different projection models. It can deal with different camera models (e.g., X-ray cameras with a source-detector projection model) by changing the projection model. Through simulations, we validated that the proposed method achieves good estimation accuracy and that it can be applied to hand–eye calibration with a source-detector camera model. The experimental results with real robots show that the proposed method is applicable to real environments and that it improves the quality of a task that requires accurate hand–eye estimation, such as three-dimensional reconstruction.
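A reprojection residual for hand–eye calibration can be sketched as follows. This is not the letter's pose graph implementation. It assumes an eye-in-hand setup with the pattern fixed in the robot base frame, and every name (`reprojection_residuals`, `hand_T_eye`, `project_pinhole`, and so on) is hypothetical. Poses are `(R, t)` pairs of a 3x3 rotation and a translation, and the projection model is passed in as a function so that it can be swapped.

```python
def transform(pose, p):
    """Apply a rigid transform (R, t) to a 3D point."""
    R, t = pose
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]


def invert(pose):
    """Invert a rigid transform: (R, t) -> (R^T, -R^T t)."""
    R, t = pose
    Rt = [[R[j][i] for j in range(3)] for i in range(3)]
    return Rt, [-sum(Rt[i][j] * t[j] for j in range(3)) for i in range(3)]


def project_pinhole(p, fx, fy, cx, cy):
    """Pinhole projection; other models could be substituted for this one."""
    return fx * p[0] / p[2] + cx, fy * p[1] / p[2] + cy


def reprojection_residuals(base_T_hand, hand_T_eye, base_T_pattern,
                           pattern_pts, detections, project):
    """Pixel residuals between projected pattern points and detections.

    The pattern points are chained pattern -> base -> hand -> eye, so no
    per-image camera pose has to be estimated beforehand.
    """
    hand_T_base = invert(base_T_hand)
    eye_T_hand = invert(hand_T_eye)
    residuals = []
    for p, (u_obs, v_obs) in zip(pattern_pts, detections):
        p_base = transform(base_T_pattern, p)
        p_hand = transform(hand_T_base, p_base)
        p_eye = transform(eye_T_hand, p_hand)
        u, v = project(p_eye)
        residuals.append((u - u_obs, v - v_obs))
    return residuals
```

An optimiser would adjust `hand_T_eye` (and, if unknown, `base_T_pattern`) to shrink these residuals over all images. The camera model enters only through `project`, so changing it leaves the rest of the chain untouched.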