Open Access • Posted Content
Unsupervised Adversarial Depth Estimation using Cycled Generative Networks
TL;DR: This paper presents a novel unsupervised deep learning approach for predicting depth maps and shows that the depth estimation task can be effectively tackled within an adversarial learning framework.
Abstract: While recent deep monocular depth estimation approaches based on supervised regression have achieved remarkable performance, they require costly ground truth annotations during training. To cope with this issue, in this paper we present a novel unsupervised deep learning approach for predicting depth maps and show that the depth estimation task can be effectively tackled within an adversarial learning framework. Specifically, we propose a deep generative network that learns to predict the correspondence field (i.e., the disparity map) between two image views in a calibrated stereo camera setting. The proposed architecture consists of two generative sub-networks that are jointly trained with adversarial learning to reconstruct the disparity map and are organized in a cycle so as to provide mutual constraints and supervision to each other. Extensive experiments on the publicly available datasets KITTI and Cityscapes demonstrate the effectiveness of the proposed model and show results competitive with state-of-the-art methods. The code and trained model are available on this https URL.
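To make the role of the disparity map concrete, the sketch below shows how a disparity map can be used to synthesize one stereo view from the other, and how chaining two such warps forms a cycle whose output can be compared with the original image. This is only an illustration under stated assumptions, not the authors' implementation: the function names (`warp_horizontal`, `l1_loss`), the sign convention (a left pixel at column x matches the right pixel at column x - d), and the use of hand-written constant disparities in place of the two generative sub-networks and their discriminators are all hypothetical choices made here.

```python
import numpy as np


def warp_horizontal(image, disparity, sign):
    """Resample `image` (H, W) along x with linear interpolation.

    out[y, x] = image[y, x + sign * disparity[y, x]]
    Sampling positions are clamped to the image border (assumed border policy).
    """
    h, w = image.shape
    xs = np.arange(w, dtype=np.float64)[None, :] + sign * disparity
    xs = np.clip(xs, 0.0, w - 1.0)
    x0 = np.floor(xs).astype(int)          # left neighbour column
    x1 = np.minimum(x0 + 1, w - 1)         # right neighbour column
    frac = xs - x0                         # interpolation weight
    rows = np.arange(h)[:, None]
    return (1.0 - frac) * image[rows, x0] + frac * image[rows, x1]


def l1_loss(target, estimate):
    """Mean absolute photometric difference between two images."""
    return float(np.mean(np.abs(target - estimate)))


# Toy data: a right view whose intensity equals the column index.
right = np.tile(np.arange(8, dtype=np.float64), (2, 1))

# Stand-ins for the outputs of two disparity predictors (hypothetical values).
disp_left = np.full_like(right, 2.0)   # disparity expressed in the left view
disp_right = np.full_like(right, 2.0)  # disparity expressed in the right view

# First half of the cycle: synthesize the left view from the right image.
left_hat = warp_horizontal(right, disp_left, sign=-1.0)

# Second half: map the synthesized left view back to the right view.
right_cycle = warp_horizontal(left_hat, disp_right, sign=+1.0)

# Cycle-consistency term: the twice-warped image should match the input,
# apart from pixels affected by border clamping.
cycle_loss = l1_loss(right, right_cycle)
```

In a learned setting the two constant disparity maps would be replaced by network predictions, and the reconstruction and cycle terms would be combined with adversarial losses; the abstract does not specify those details, so they are left out here.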
Citations
•Posted Content
SAFENet: Self-Supervised Monocular Depth Estimation with Semantic-Aware Feature Extraction
TL;DR: This paper introduces multi-task learning schemes to incorporate semantic awareness into the representation of depth features and proposes SAFENet, which is designed to leverage semantic information to overcome the limitations of the photometric loss.
43
•Posted Content
Unsupervised Learning of Monocular Depth Estimation with Bundle Adjustment, Super-Resolution and Clip Loss.
TL;DR: Experimental results on the KITTI dataset show that the proposed algorithm outperforms the state-of-the-art unsupervised methods using monocular sequences, and achieves comparable or even better results compared to unsupervised methods using stereo sequences.
31
•Posted Content
An Overview of Perception and Decision-Making in Autonomous Systems in the Era of Learning
Yang Tang,Chaoqiang Zhao,Jianrui Wang,Chongzhen Zhang,Qiyu Sun,Weixing Zheng,Wenli Du,Feng Qian,Juergen Kurths +8 more
TL;DR: This review delineates the existing classical simultaneous localization and mapping (SLAM) solutions and reviews the environmental perception and understanding methods based on deep learning, including deep learning-based monocular depth estimation, ego-motion prediction, image enhancement, object detection, semantic segmentation, and their combinations with traditional SLAM frameworks.
24
Simultaneous Semantic Segmentation and Depth Completion with Constraint of Boundary.
TL;DR: The proposed multi-task CNN model can effectively improve the performance of every single task and is implemented end-to-end and evaluated with both RGB and sparse depth input.
22
Object Detection and Depth Estimation Approach Based on Deep Convolutional Neural Networks
TL;DR: This article proposes a real-time object detection and depth estimation approach based on deep convolutional neural networks (CNNs), which incorporates transfer connection blocks (TCBs) to improve object detection, in particular the detection of small objects in real time.
16
References
Generative Adversarial Nets
Ian Goodfellow,Jean Pouget-Abadie,Mehdi Mirza,Bing Xu,David Warde-Farley,Sherjil Ozair,Aaron Courville,Yoshua Bengio +7 more
- 08 Dec 2014
TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are simultaneously trained: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample came from the training data rather than G.
Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks
Jun-Yan Zhu,Taesung Park,Phillip Isola,Alexei A. Efros +3 more
- 01 Oct 2017
TL;DR: CycleGAN, as discussed by the authors, learns a mapping G : X → Y such that, under an adversarial loss, the distribution of images from G(X) is indistinguishable from the distribution Y.
19.5K
Are we ready for autonomous driving? The KITTI vision benchmark suite
Andreas Geiger,Philip Lenz,Raquel Urtasun +2 more
- 16 Jun 2012
TL;DR: The autonomous driving platform is used to develop novel challenging benchmarks for the tasks of stereo, optical flow, visual odometry/SLAM and 3D object detection, revealing that methods ranking high on established datasets such as Middlebury perform below average when being moved outside the laboratory to the real world.
•Posted Content
The Cityscapes Dataset for Semantic Urban Scene Understanding
Marius Cordts,Mohamed Omran,Sebastian Ramos,Timo Rehfeld,Markus Enzweiler,Rodrigo Benenson,Uwe Franke,Stefan Roth,Bernt Schiele +8 more
TL;DR: Cityscapes as discussed by the authors is a large-scale dataset for semantic urban scene understanding, consisting of 5000 images with high quality pixel-level annotations and 200,000 additional images with coarse annotations.
7.8K
Indoor segmentation and support inference from RGBD images
Nathan Silberman,Derek Hoiem,Pushmeet Kohli,Rob Fergus +3 more
- 07 Oct 2012
TL;DR: The goal is to parse typical, often messy, indoor scenes into floor, walls, supporting surfaces, and object regions, and to recover support relationships, to better understand how 3D cues can best inform a structured 3D interpretation.