(AF)2-S3Net: Attentive Feature Fusion with Adaptive Feature Selection for Sparse Semantic Segmentation Network
Ran Cheng, Ryan Razani, Ehsan Taghavi, Enxu Li, Bingbing Liu
- 20 Jun 2021
- pp 12547-12556
TL;DR: In this paper, an end-to-end encoder-decoder CNN network for 3D LiDAR semantic segmentation is proposed, where a multibranch attentive feature fusion module in the encoder and a unique adaptive feature selection module with feature map re-weighting in the decoder are introduced.
Abstract: Autonomous robotic systems and self-driving cars rely on accurate perception of their surroundings, as the safety of passengers and pedestrians is the top priority. Semantic segmentation is one of the essential components of road scene perception, providing semantic information about the surrounding environment. Recently, several methods have been introduced for 3D LiDAR semantic segmentation. While they can lead to improved performance, they either suffer from high computational complexity, and are therefore inefficient, or they lack fine details of smaller object instances. To alleviate these problems, we propose (AF)2-S3Net, an end-to-end encoder-decoder CNN network for 3D LiDAR semantic segmentation. We present a novel multibranch attentive feature fusion module in the encoder and a unique adaptive feature selection module with feature map re-weighting in the decoder. Our (AF)2-S3Net fuses voxel-based and point-based learning methods into a unified framework to effectively process potentially large 3D scenes. Our experimental results show that the proposed method outperforms state-of-the-art approaches on the large-scale nuScenes-lidarseg and SemanticKITTI benchmarks, ranking 1st on both competitive public leaderboards upon publication.
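To make the idea of combining voxel-based and point-based processing concrete, the following is a minimal sketch and not the method from the paper. It assumes a fixed cubic voxel size, pools per-point features by averaging inside each occupied voxel, and copies the pooled feature back to every point. The function names (`voxelize`, `voxel_mean_features`, `scatter_to_points`) and the toy data are hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict


def voxelize(points, voxel_size):
    """Group point indices by the integer voxel cell each point falls into."""
    cells = defaultdict(list)
    for idx, (x, y, z) in enumerate(points):
        key = (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        cells[key].append(idx)
    return dict(cells)


def voxel_mean_features(points, features, voxel_size):
    """Average the per-point feature vectors inside each occupied voxel."""
    cells = voxelize(points, voxel_size)
    pooled = {}
    for key, idxs in cells.items():
        dim = len(features[idxs[0]])
        pooled[key] = [
            sum(features[i][d] for i in idxs) / len(idxs) for d in range(dim)
        ]
    return cells, pooled


def scatter_to_points(cells, pooled, num_points):
    """Copy each voxel-level feature back to every point inside that voxel."""
    out = [None] * num_points
    for key, idxs in cells.items():
        for i in idxs:
            out[i] = list(pooled[key])
    return out


# Toy data (illustrative only): three points with 2-D features.
points = [(0.1, 0.2, 0.0), (0.3, 0.1, 0.1), (1.2, 0.0, 0.0)]
features = [[1.0, 0.0], [3.0, 2.0], [5.0, 5.0]]

cells, pooled = voxel_mean_features(points, features, voxel_size=0.5)
per_point = scatter_to_points(cells, pooled, len(points))
print(per_point)  # [[2.0, 1.0], [2.0, 1.0], [5.0, 5.0]]
```

Only occupied voxels are stored, which is what keeps a sparse representation tractable for large scenes. A real network would replace the averaging step with learned sparse convolutions.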
Citations
CMDFusion: Bidirectional Fusion Network With Cross-Modality Knowledge Distillation for LiDAR Semantic Segmentation
Jun Cen, Shiwei Zhang, Yixuan Pei, Kun Li, Hongjie Zheng, Min Luo, Yingya Zhang, Qifeng Chen
TL;DR: CMDFusion achieves the best performance among all fusion-based methods on LiDAR semantic segmentation tasks by explicitly and implicitly enhancing 3D features and distilling 2D knowledge from a 2D network to a 3D network.
6
Advancements in point cloud-based 3D defect classification and segmentation for industrial systems: A comprehensive survey
Anju Rani, Daniel Ortíz-Arroyo, Petar Durdevic
6
LENet: Lightweight And Efficient LiDAR Semantic Segmentation Using Multi-Scale Convolution Attention
TL;DR: This paper proposes LENet, a lightweight and efficient projection-based network for LiDAR semantic segmentation with an encoder-decoder structure, built around a novel multi-scale convolutional attention module with varying receptive field sizes to capture features.
6
CMDFusion: Bidirectional Fusion Network with Cross-modality Knowledge Distillation for LIDAR Semantic Segmentation
Jun Cen, Yixuan Pei, Hang Zheng, Yingya Zhang, Qifeng Chen
- 09 Jul 2023
TL;DR: In this paper, a bidirectional fusion network with cross-modality knowledge distillation (CMDFusion) is proposed to enhance the 3D features via 2D-to-3D fusion and 3D-to-2D fusion.
Projection-Based Point Convolution for Efficient Point Cloud Segmentation
TL;DR: Projection-based Point Convolution (PPConv), a point convolutional module that uses 2D convolutions and multi-layer perceptrons (MLPs) as its components, achieves superior efficiency compared to state-of-the-art methods, even with a simple architecture based on PointNet++.
References
Squeeze-and-Excitation Networks
Jie Hu, Li Shen, Samuel Albanie, Gang Sun, Enhua Wu
- 18 Jun 2018
TL;DR: This work proposes a novel architectural unit, termed the "Squeeze-and-Excitation" (SE) block, that adaptively recalibrates channel-wise feature responses by explicitly modelling interdependencies between channels, and finds that SE blocks produce significant performance improvements for existing state-of-the-art deep architectures at minimal additional computational cost.
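The channel recalibration described in this summary can be illustrated with a small sketch. It assumes the commonly described SE structure: global average pooling per channel (squeeze), then two fully connected layers with a ReLU and a sigmoid (excitation), then channel-wise scaling. Bias terms are omitted for brevity, and the weights `W1` and `W2` and the toy feature map are hypothetical values, not trained parameters.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(weights, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def se_block(fmap, W1, W2):
    """Apply squeeze-and-excitation to fmap, a list of C channels (each H x W)."""
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap
    ]
    # Excitation: bottleneck FC + ReLU, then FC + sigmoid gives gates in (0, 1).
    hidden = [max(0.0, h) for h in matvec(W1, squeezed)]
    gates = [sigmoid(g) for g in matvec(W2, hidden)]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[[v * g for v in row] for row in ch] for ch, g in zip(fmap, gates)]
    return scaled, gates


# Toy input (illustrative only): 2 channels of size 2 x 2, bottleneck width 1.
fmap = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[2.0, 2.0], [2.0, 2.0]],
]
W1 = [[0.5, 0.5]]      # shape (1, 2): reduces 2 channels to 1 hidden unit
W2 = [[1.0], [-1.0]]   # shape (2, 1): expands back to 2 channel gates

scaled, gates = se_block(fmap, W1, W2)
print(gates)  # first gate is above 0.5, second gate is below 0.5
```

With these toy weights the first channel is emphasized and the second is suppressed, which is the re-weighting behaviour the summary refers to.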
•Posted Content
Squeeze-and-Excitation Networks
TL;DR: The Squeeze-and-Excitation (SE) block adaptively recalibrates channel-wise feature responses by explicitly modeling interdependencies between channels; SE blocks can be stacked together to form SENet architectures.
18.9K
•Posted Content
YOLOv3: An Incremental Improvement.
Joseph Redmon, Ali Farhadi
TL;DR: The authors present some updates to YOLO!
17.8K
Are we ready for autonomous driving? The KITTI vision benchmark suite
Andreas Geiger, Philip Lenz, Raquel Urtasun
- 16 Jun 2012
TL;DR: The autonomous driving platform is used to develop novel challenging benchmarks for the tasks of stereo, optical flow, visual odometry/SLAM and 3D object detection, revealing that methods ranking high on established datasets such as Middlebury perform below average when being moved outside the laboratory to the real world.
PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation
Charles R. Qi, Hao Su, Kaichun Mo, Leonidas J. Guibas
- 21 Jul 2017
TL;DR: This paper designs a novel type of neural network that directly consumes point clouds, which well respects the permutation invariance of points in the input and provides a unified architecture for applications ranging from object classification, part segmentation, to scene semantic parsing.
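The permutation invariance mentioned in this summary can be sketched as follows, assuming the usual recipe of applying one shared function to every point and then aggregating with a symmetric operation (here, an element-wise max). This is a toy illustration and not the PointNet architecture itself: the single linear layer with ReLU, the weight matrix `W`, and the sample points are all hypothetical.

```python
def shared_layer(point, weights):
    """Apply the same linear layer + ReLU to one point (weights are shared)."""
    return [
        max(0.0, sum(w * x for w, x in zip(row, point))) for row in weights
    ]


def global_feature(points, weights):
    """Encode each point independently, then max-pool over the point axis."""
    per_point = [shared_layer(p, weights) for p in points]
    # Max is symmetric, so the result does not depend on point order.
    return [max(column) for column in zip(*per_point)]


# Toy data (illustrative only): three 3-D points and a 2 x 3 weight matrix.
W = [
    [1.0, 0.0, -1.0],
    [0.5, 0.5, 0.5],
]
cloud = [(1.0, 2.0, 3.0), (4.0, 0.0, 1.0), (-1.0, -1.0, 0.0)]
shuffled = [cloud[2], cloud[0], cloud[1]]

feat_a = global_feature(cloud, W)
feat_b = global_feature(shuffled, W)
print(feat_a == feat_b)  # True
```

Because the per-point function never sees neighbouring points and the pooling step ignores order, reordering the input leaves the global feature unchanged.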