Journal Article • DOI: 10.1109/cvpr52729.2023.00508
Spatiotemporal Self-Supervised Learning for Point Clouds in the Wild
Yanhao Wu,Tong Zhang,Wei Ke,Sabine Süsstrunk,Mathieu Salzmann +4 more
- 01 Jun 2023
14
TL;DR: Spatiotemporal self-supervised learning (STSSL) for point clouds in the wild achieves superior performance by leveraging positive pairs in both spatial and temporal domains.
Abstract: Self-supervised learning (SSL) has the potential to benefit many applications, particularly those where manually annotating data is cumbersome. One such situation is the semantic segmentation of point clouds. In this context, existing methods employ contrastive learning strategies and define positive pairs by applying various augmentations to point clusters in a single frame. As such, these methods do not exploit the temporal nature of LiDAR data. In this paper, we introduce an SSL strategy that leverages positive pairs in both the spatial and temporal domains. To this end, we design (i) a point-to-cluster learning strategy that aggregates spatial information to distinguish objects; and (ii) a cluster-to-cluster learning strategy based on unsupervised object tracking that exploits temporal correspondences. We demonstrate the benefits of our approach via extensive experiments performed by self-supervised training on two large-scale LiDAR datasets and transferring the resulting models to other point cloud segmentation benchmarks. Our results show that our method outperforms the state-of-the-art point cloud SSL methods. Our code and pretrained models can be found at https://github.com/YanhaoWu/STSSL. Correspondence to Ke Wei.
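To make the point-to-cluster idea in the abstract more concrete, the following is a minimal sketch of a contrastive (InfoNCE-style) objective in which each point feature is pulled toward the pooled feature of its own cluster and pushed away from the other clusters. It is an illustration under stated assumptions, not the authors' implementation: the function names (`cluster_prototypes`, `point_to_cluster_loss`), the mean pooling, the temperature value, and the toy 2D features are all hypothetical, and the cluster IDs are assumed to be given by some unsupervised clustering step.

```python
import math

def normalize(v):
    # L2-normalize a feature vector (assumes a non-zero vector).
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cluster_prototypes(features, cluster_ids):
    # Assumption: a cluster is summarized by the mean of its point features.
    sums, counts = {}, {}
    for f, c in zip(features, cluster_ids):
        acc = sums.setdefault(c, [0.0] * len(f))
        for i, x in enumerate(f):
            acc[i] += x
        counts[c] = counts.get(c, 0) + 1
    return {c: normalize([x / counts[c] for x in s]) for c, s in sums.items()}

def point_to_cluster_loss(features, cluster_ids, prototypes, temperature=0.1):
    # InfoNCE: the positive for a point is its own cluster's prototype;
    # every other prototype acts as a negative.
    total = 0.0
    for f, c in zip(features, cluster_ids):
        q = normalize(f)
        logits = {k: dot(q, p) / temperature for k, p in prototypes.items()}
        m = max(logits.values())  # subtract the max for numerical stability
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits.values()))
        total += log_denominator - logits[c]
    return total / len(features)

# Toy example: two views of the same four points, grouped into two clusters.
view_a = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.8]]
view_b = [[0.95, 0.15], [1.0, 0.0], [0.0, 1.0], [0.15, 0.9]]
ids = [0, 0, 1, 1]

prototypes = cluster_prototypes(view_b, ids)
loss_matched = point_to_cluster_loss(view_a, ids, prototypes)
loss_swapped = point_to_cluster_loss(view_a, [1, 1, 0, 0], prototypes)
```

In this toy setup the loss is low when points are paired with their own cluster and high when the cluster labels are swapped. A cluster-to-cluster variant across frames could reuse the same loss with cluster-level features on both sides, assuming the temporal correspondences are supplied by a tracking step.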
Citations
Revisiting the Distillation of Image Representations into Point Clouds for Autonomous Driving
Gilles Puy,Spyros Gidaris,A. Boulch,Oriane Siméoni,Corentin Sautier,Patrick Pérez,Andrei Bursuc,Renaud Marlet +7 more
TL;DR: This work shows that scaling the 2D and 3D backbones and pretraining on diverse datasets leads to a substantial improvement of the feature quality, and allows us to significantly reduce the gap between the quality of distilled and fully-supervised 3D features, and to improve the robustness of the pretrained backbones to domain gaps and perturbations.
6
BEVContrast: Self-Supervision in BEV Space for Automotive Lidar Point Clouds
Corentin Sautier,Gilles Puy,Alexandre Boulch,Renaud Marlet,Vincent Lepetit +4 more
- 18 Mar 2024
TL;DR: BEVContrast proposes a novel self-supervision method for 3D backbone on automotive Lidar point clouds based on a contrastive loss at the level of 2D cells in the Bird's Eye View plane.
6
Multi-stage Scene-level Constraints for Large-scale Point Cloud Weakly Supervised Semantic Segmentation
Yanfei Su,Ming Cheng,Zhimin Yuan,Weiquan Liu,Wankang Zeng,Cheng Wang +5 more
TL;DR: An effective and generalized weakly supervised semantic segmentation framework, called multistage scene-level constraints (MSCs), is proposed to address the shortage of labeled data, together with an uncertainty-guided adaptive reweighting strategy that reduces the negative impact of erroneous pseudo-labeled data on the model learning process.
5
Pseudo Flow Consistency for Self-Supervised 6D Object Pose Estimation
Hai Yan,Rui Song,Jiaojiao Li,David Ferstl,Yinlin Hu +4 more
- 01 Oct 2023
TL;DR: Self-supervised 6D object pose estimation method that utilizes pseudo flow consistency between training images without additional information.
4
Generating Point Cloud Augmentations via Class-Conditioned Diffusion Model
Gulshan Sharma,Chetan Gupta,Aastha Agarwal,Lalit Sharma,A. Dhall +4 more
- 01 Jan 2024
TL;DR: The findings suggest that the proposed approach effectively generates high-quality synthetic embeddings directly from the Gaussian noise and improves the classification performance of the point cloud classes within limited data settings.
4
References
Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography
TL;DR: New results are derived on the minimum number of landmarks needed to obtain a solution, and algorithms are presented for computing these minimum-landmark solutions in closed form that provide the basis for an automatic system that can solve the Location Determination Problem under difficult viewing.
•Proceedings Article
A density-based algorithm for discovering clusters in large spatial Databases with Noise
Martin Ester,Hans-Peter Kriegel,Jörg Sander,Xiaowei Xu +3 more
- 01 Jan 1996
TL;DR: DBSCAN, a new clustering algorithm that relies on a density-based notion of clusters and is designed to discover clusters of arbitrary shape, is presented; it requires only one input parameter and supports the user in determining an appropriate value for it.
•Posted Content
A Simple Framework for Contrastive Learning of Visual Representations
TL;DR: It is shown that composition of data augmentations plays a critical role in defining effective predictive tasks, and introducing a learnable nonlinear transformation between the representation and the contrastive loss substantially improves the quality of the learned representations, and contrastive learning benefits from larger batch sizes and more training steps compared to supervised learning.
The Hungarian method for the assignment problem
TL;DR: This paper has always been one of my favorite children, combining as it does elements of the duality of linear programming and combinatorial tools from graph theory, and it may be of some interest to tell the story of its origin in this article.
Vision meets robotics: The KITTI dataset
TL;DR: A novel dataset captured from a VW station wagon for use in mobile robotics and autonomous driving research, using a variety of sensor modalities such as high-resolution color and grayscale stereo cameras and a high-precision GPS/IMU inertial navigation system.