Monocular

Papers published on a yearly basis

Papers

Proceedings Article•10.1109/CVPR.2017.700•

Unsupervised Learning of Depth and Ego-Motion from Video

Tinghui Zhou¹, Matthew Brown², Noah Snavely², David G. Lowe²•Institutions (2)

University of California, Berkeley¹, Google²

25 Apr 2017

TL;DR: Presents an unsupervised learning framework for monocular depth and camera motion estimation from unstructured video sequences, using single-view depth and multiview pose networks trained with a loss based on warping nearby views to the target using the computed depth and pose.

Abstract: We present an unsupervised learning framework for the task of monocular depth and camera motion estimation from unstructured video sequences. In common with recent work [10, 14, 16], we use an end-to-end learning approach with view synthesis as the supervisory signal. In contrast to the previous work, our method is completely unsupervised, requiring only monocular video sequences for training. Our method uses single-view depth and multiview pose networks, with a loss based on warping nearby views to the target using the computed depth and pose. The networks are thus coupled by the loss during training, but can be applied independently at test time. Empirical evaluation on the KITTI dataset demonstrates the effectiveness of our approach: 1) monocular depth performs comparably with supervised methods that use either ground-truth pose or depth for training, and 2) pose estimation performs favorably compared to established SLAM systems under comparable input settings.
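The warping step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a pinhole camera with known intrinsics `K` shared by both views and a 4x4 relative pose `T_target_to_source`; the function name and argument layout are hypothetical and are not taken from the paper's code. It only computes where each target pixel lands in the source view; a full loss would also sample the source image at those locations and compare it with the target.

```python
import numpy as np

def project_to_source(depth, K, T_target_to_source):
    """Map every target pixel to its (x, y) location in a source view.

    depth: (H, W) predicted depth of the target frame.
    K: (3, 3) pinhole intrinsics (assumed known and shared by both views).
    T_target_to_source: (4, 4) relative camera pose (hypothetical layout).
    Returns an array of shape (2, H, W) holding source x and y coordinates.
    """
    h, w = depth.shape
    xs, ys = np.meshgrid(np.arange(w), np.arange(h))
    # Homogeneous pixel coordinates, one column per pixel.
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=0).reshape(3, -1).astype(float)
    # Back-project each pixel to a 3D point using the predicted depth.
    cam = (np.linalg.inv(K) @ pix) * depth.reshape(1, -1)
    cam_h = np.vstack([cam, np.ones((1, cam.shape[1]))])
    # Move the points into the source camera frame and project them.
    src = (T_target_to_source @ cam_h)[:3]
    proj = K @ src
    xy = proj[:2] / proj[2:3]
    return xy.reshape(2, h, w)
```

With an identity pose the function returns the original pixel grid, which is a quick sanity check before plugging it into a photometric loss.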

2,774 citations

Proceedings Article•10.1109/ICCV.2019.00393•

Digging Into Self-Supervised Monocular Depth Estimation

Clément Godard¹, Oisin Mac Aodha², Michael Firman, Gabriel J. Brostow¹•Institutions (2)

University College London¹, California Institute of Technology²

1 Oct 2019

TL;DR: The authors propose a set of improvements that together yield quantitatively and qualitatively better depth maps than competing self-supervised methods, demonstrate the effectiveness of each component in isolation, and show high-quality, state-of-the-art results on the KITTI benchmark.

Abstract: Per-pixel ground-truth depth data is challenging to acquire at scale. To overcome this limitation, self-supervised learning has emerged as a promising alternative for training models to perform monocular depth estimation. In this paper, we propose a set of improvements, which together result in both quantitatively and qualitatively improved depth maps compared to competing self-supervised methods. Research on self-supervised monocular training usually explores increasingly complex architectures, loss functions, and image formation models, all of which have recently helped to close the gap with fully-supervised methods. We show that a surprisingly simple model, and associated design choices, lead to superior predictions. In particular, we propose (i) a minimum reprojection loss, designed to robustly handle occlusions, (ii) a full-resolution multi-scale sampling method that reduces visual artifacts, and (iii) an auto-masking loss to ignore training pixels that violate camera motion assumptions. We demonstrate the effectiveness of each component in isolation, and show high quality, state-of-the-art results on the KITTI benchmark.
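Items (i) and (iii) of the abstract can be sketched roughly as follows. This is one reading of the idea, not the authors' implementation: it assumes grayscale images, a plain L1 photometric error, and source frames that have already been warped into the target view. The function and argument names are hypothetical.

```python
import numpy as np

def min_reprojection_loss(target, warped_sources, raw_sources):
    """Per-pixel minimum reprojection loss with a simple auto-mask.

    target: (H, W) target frame.
    warped_sources: (N, H, W) source frames warped into the target view.
    raw_sources: (N, H, W) the same source frames without warping.
    Returns (loss, mask), where mask marks the pixels kept for training.
    """
    # Photometric error of each warped source against the target (L1 here,
    # which is an assumption made to keep the sketch short).
    warped_err = np.abs(warped_sources - target[None])
    # Error of the unwarped sources, used only to build the mask.
    identity_err = np.abs(raw_sources - target[None])
    # Minimum over sources: an occluded pixel in one source can still be
    # explained by another source.
    min_warped = warped_err.min(axis=0)
    min_identity = identity_err.min(axis=0)
    # Keep a pixel only if warping explains it better than doing nothing,
    # which filters out pixels that do not move between frames.
    mask = min_warped < min_identity
    if not mask.any():
        return 0.0, mask
    return float(min_warped[mask].mean()), mask
```

Taking the minimum rather than the mean over sources is the design choice the abstract highlights for handling occlusions.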

1,875 citations

Journal Article•10.1113/JPHYSIOL.1967.SP008360•

The neural mechanism of binocular depth discrimination

Horace Barlow, Colin Blakemore, John D. Pettigrew

01 Nov 1967-The Journal of Physiology

TL;DR: Binocularly driven units in the cat's primary visual cortex respond best when a stimulus is correctly located in both eyes, and different units prefer different retinal disparities, which may be the basic mechanism underlying depth discrimination.

Abstract: 1. Binocularly driven units were investigated in the cat's primary visual cortex. 2. It was found that a stimulus located correctly in the visual fields of both eyes was more effective in driving the units than a monocular stimulus, and much more effective than a binocular stimulus which was correctly positioned in only one eye: the response to the correctly located image in one eye is vetoed if the image is incorrectly located in the other eye. 3. The vertical and horizontal disparities of the paired retinal images that yielded the maximum response were measured in 87 units from seven cats: the range of horizontal disparities was 6·6°, of vertical disparities 2·2°. 4. With fixed convergence, different units will be optimally excited by objects lying at different distances. This may be the basic mechanism underlying depth discrimination in the cat.
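Point 4 of the abstract rests on the geometric link between distance and horizontal disparity under fixed convergence. A minimal sketch of that relation follows, assuming a point on the midline between the eyes and a sign convention in which nearer-than-fixation points give positive disparity. The function name and the example eye separation are illustrative assumptions, not values from the paper.

```python
import math

def horizontal_disparity_deg(distance_m, fixation_m, eye_separation_m):
    """Horizontal disparity (degrees) of a midline point relative to fixation.

    Positive values mean the point is nearer than the fixation distance
    (a sign convention chosen for this sketch).
    """
    def vergence(d):
        # Angle subtended at a midline point at distance d by the two eyes.
        return 2.0 * math.atan(eye_separation_m / (2.0 * d))

    return math.degrees(vergence(distance_m) - vergence(fixation_m))

# Hypothetical example: eyes 3.5 cm apart, fixating at 0.5 m.
nearer = horizontal_disparity_deg(0.4, 0.5, 0.035)
farther = horizontal_disparity_deg(0.6, 0.5, 0.035)
```

A unit tuned to a particular disparity would therefore respond best to objects at one particular distance for a given convergence.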

1,453 citations

Posted Content•

Digging Into Self-Supervised Monocular Depth Estimation

Clément Godard¹, Oisin Mac Aodha², Michael Firman, Gabriel J. Brostow¹•Institutions (2)

University College London¹, California Institute of Technology²

04 Jun 2018-arXiv: Computer Vision and Pattern Recognition

TL;DR: It is shown that a surprisingly simple model, and associated design choices, lead to superior predictions, and together result in both quantitatively and qualitatively improved depth maps compared to competing self-supervised methods.

1,361 citations

Proceedings Article•10.1109/CVPR.2016.236•

Monocular 3D Object Detection for Autonomous Driving

Xiaozhi Chen¹, Kaustav Kundu², Ziyu Zhang², Huimin Ma¹, Sanja Fidler², Raquel Urtasun²•Institutions (2)

Tsinghua University¹, University of Toronto²

27 Jun 2016

TL;DR: This work proposes an energy minimization approach that places object candidates in 3D using the fact that objects should be on the ground-plane, and achieves the best detection performance on the challenging KITTI benchmark, among published monocular competitors.

Abstract: The goal of this paper is to perform 3D object detection from a single monocular image in the domain of autonomous driving. Our method first aims to generate a set of candidate class-specific object proposals, which are then run through a standard CNN pipeline to obtain high-quality object detections. The focus of this paper is on proposal generation. In particular, we propose an energy minimization approach that places object candidates in 3D using the fact that objects should be on the ground-plane. We then score each candidate box projected to the image plane via several intuitive potentials encoding semantic segmentation, contextual information, size and location priors and typical object shape. Our experimental evaluation demonstrates that our object proposal generation approach significantly outperforms all monocular approaches, and achieves the best detection performance on the challenging KITTI benchmark, among published monocular competitors.
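The ground-plane constraint can be illustrated with a short sketch that places one candidate box on a flat ground plane and projects it into the image. It assumes camera coordinates with x right, y down and z forward, a known camera height, and a pinhole intrinsics matrix `K`. All names and values are hypothetical, and the scoring potentials described in the abstract are not modelled.

```python
import numpy as np

def ground_box_to_image(x, z, size, yaw, K, cam_height):
    """Project a 3D box resting on the ground plane into the image.

    x, z: position of the box centre on the ground, in camera coordinates.
    size: (length, height, width) in metres.
    yaw: rotation about the vertical axis, in radians.
    cam_height: camera height above the flat ground plane (an assumption).
    Returns the enclosing 2D box (x_min, y_min, x_max, y_max) in pixels.
    All corners must lie in front of the camera (z > 0).
    """
    l, h, w = size
    # Corners in the object frame; the bottom face lies at y = 0 and the
    # top face at y = -h because y points down.
    xc = np.array([l, l, -l, -l, l, l, -l, -l]) / 2.0
    yc = np.array([0.0, 0.0, 0.0, 0.0, -h, -h, -h, -h])
    zc = np.array([w, -w, -w, w, w, -w, -w, w]) / 2.0
    c, s = np.cos(yaw), np.sin(yaw)
    R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    # Rotate, then translate so the bottom face sits on the ground plane.
    corners = R @ np.vstack([xc, yc, zc]) + np.array([[x], [cam_height], [z]])
    proj = K @ corners
    uv = proj[:2] / proj[2:3]
    return uv[0].min(), uv[1].min(), uv[0].max(), uv[1].max()
```

Because the vertical position is fixed by the ground plane, only the ground position, size and orientation of a candidate remain to be searched.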

1,326 citations

Year	Papers
2026	7
2025	475
2024	890
2023	889
2022	1,329
2021	263

Related Topics (5)

Performance Metrics