Journal Article, DOI: 10.48550/arxiv.2312.03341
Online Vectorized HD Map Construction using Geometry
Zhixin Zhang,Yiyuan Zhang,Xiaohan Ding,Fusheng Jin,Xiangyu Yue +4 more
12
TL;DR: This work proposes GeMap, which end-to-end learns Euclidean shapes and relations of map instances beyond basic perception and achieves new state-of-the-art performance on the nuScenes and Argoverse 2 datasets.
Abstract: The construction of online vectorized High-Definition (HD) maps is critical for downstream prediction and planning. Recent efforts have built strong baselines for this task; however, the shapes and relations of instances in urban road systems, such as parallelism, perpendicularity, and rectangular shape, remain under-explored. In this work, we propose GeMap (Geometry Map), which learns the Euclidean shapes and relations of map instances end to end, beyond basic perception. Specifically, we design a geometric loss based on angle and distance clues, which is robust to rigid transformations. We also decouple self-attention to handle Euclidean shapes and relations independently. Our method achieves new state-of-the-art performance on the nuScenes and Argoverse 2 datasets. Remarkably, it reaches 71.8% mAP on the large-scale Argoverse 2 dataset, outperforming MapTR V2 by +4.4% and surpassing the 70% mAP threshold for the first time. Code is available at https://github.com/cnzzx/GeMap
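To make the phrase "angle and distance clues, which is robust to rigid transformations" concrete, the sketch below computes segment lengths and signed turning angles for a 2D polyline and compares two polylines on those quantities only. This is an illustrative assumption, not the GeMap loss: the function names (`shape_clues`, `rigid_transform`, `geometric_l1`) and the plain L1 comparison are hypothetical and are not taken from the paper or its repository.

```python
import numpy as np


def shape_clues(polyline):
    """Segment lengths and signed turning angles of a 2D polyline (hypothetical helper)."""
    pts = np.asarray(polyline, dtype=float)
    seg = pts[1:] - pts[:-1]                 # displacement vectors, shape (N-1, 2)
    lengths = np.linalg.norm(seg, axis=1)    # distance clue per segment
    a, b = seg[:-1], seg[1:]                 # consecutive segment pairs
    dot = np.sum(a * b, axis=1)
    cross = a[:, 0] * b[:, 1] - a[:, 1] * b[:, 0]
    angles = np.arctan2(cross, dot)          # angle clue per interior vertex
    return lengths, angles


def rigid_transform(polyline, theta, shift):
    """Rotate by theta (radians) about the origin, then translate by shift."""
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    return np.asarray(polyline, dtype=float) @ rot.T + np.asarray(shift, dtype=float)


def geometric_l1(pred, target):
    """Mean absolute difference of length and angle clues (illustrative, not the paper's loss)."""
    len_p, ang_p = shape_clues(pred)
    len_t, ang_t = shape_clues(target)
    return float(np.abs(len_p - len_t).mean() + np.abs(ang_p - ang_t).mean())


# A closed rectangle, e.g. a pedestrian-crossing outline, and a rotated, shifted copy.
rect = [(0, 0), (4, 0), (4, 2), (0, 2), (0, 0)]
moved = rigid_transform(rect, theta=0.7, shift=(10.0, -3.0))
print(geometric_l1(rect, moved))                 # numerically zero: the clues ignore pose
print(geometric_l1(rect, 2 * np.asarray(rect)))  # positive: scaling is not a rigid motion
```

Because lengths and turning angles depend only on differences between neighbouring points, rotating or translating the whole polyline leaves them unchanged, while a change of shape (here, scaling) does not.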
Citations
UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition
Xiaohan Ding,Yiyuan Zhang,Yixiao Ge,Sijie Zhao,Lin Song,Xiangyu Yue,Ying Shan +6 more
TL;DR: It is discovered that large kernels are the key to unlocking the exceptional performance of ConvNets in domains where they were originally not proficient, and the proposed model achieves state-of-the-art performance on time-series forecasting and audio recognition tasks even without modality-specific customization to the architecture.
66
MapTracker: Tracking with Strided Memory Fusion for Consistent Vector HD Mapping
Jiacheng Chen,Yuefan Wu,Jiaqi Tan,Hang Ma,Yasutaka Furukawa +4 more
TL;DR: MapTracker is an algorithm for consistent vector HD mapping that uses strided memory fusion to ensure consistent reconstructions over time. It accumulates sensor streams into memory buffers of raster and vector latents, and leverages query propagation to associate tracked road elements from the previous frame to the current frame.
8
Map It Anywhere (MIA): Empowering Bird's Eye View Mapping using Large-scale Public Data
Cherie Ho,Jingui Zou,Omar Alama,S. Kumar,Benjamin Chiang,Taneesh Gupta,Chen Wang,Nikhil Varma Keetha,Katia Sycara,Sebastian Scherer +9 more
- 11 Jul 2024
TL;DR: This study introduces Map It Anywhere (MIA), a data engine that leverages large-scale public maps to enable generalizable Bird's Eye View (BEV) map prediction, outperforming baselines by 35% with zero-shot performance, and paving the way for robust autonomous navigation.
LGmap: Local-to-Global Mapping Network for Online Long-Range Vectorized HD Map Construction
Kaihua Wu,Sulei Nian,Chengmin Shen,Yang Chen,Zhanbin Li +4 more
- 20 Jun 2024
TL;DR: LGmap is an online mapping pipeline for constructing long-range HD maps. It utilizes SVT, HTF, and ped-crossing resampling techniques to achieve high stability and accuracy.
LDMapNet-U: An End-to-End System for City-Scale Lane-Level Map Updating
Deguo Xia,Weiming Zhang,Xiyan Liu,Wei Zhang,Chenting Gong,Xiao Tan,Jizhou Huang,Mengmeng Yang,Diange Yang +8 more
- 06 Jan 2025
TL;DR: LDMapNet-U is an end-to-end system for city-scale lane-level map updating, leveraging a Prior-Map Encoding module and Instance Change Prediction module to simultaneously generate vectorized maps and detect changes, significantly reducing update cycles and improving map accuracy.
References
•Posted Content
Focal Loss for Dense Object Detection
TL;DR: This paper proposes to address the extreme foreground-background class imbalance encountered during training of dense detectors by reshaping the standard cross entropy loss such that it down-weights the loss assigned to well-classified examples, and develops a novel Focal Loss, which focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training.
16.7K
•Posted Content
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
Mingxing Tan,Quoc V. Le +1 more
TL;DR: A new scaling method is proposed that uniformly scales all dimensions of depth/width/resolution using a simple yet highly effective compound coefficient, and its effectiveness is demonstrated on scaling up MobileNets and ResNet.
12K
•Posted Content
SGDR: Stochastic Gradient Descent with Warm Restarts
Ilya Loshchilov,Frank Hutter +1 more
TL;DR: This paper proposes a simple warm restart technique for stochastic gradient descent to improve its anytime performance when training deep neural networks, achieving state-of-the-art results on both the CIFAR-10 and CIFAR-100 datasets.
5.9K
•Posted Content
Deformable DETR: Deformable Transformers for End-to-End Object Detection
TL;DR: Deformable DETR, whose attention modules only attend to a small set of key sampling points around a reference, can achieve better performance than DETR (especially on small objects) with 10× fewer training epochs.
4.5K
•Posted Content
nuScenes: A multimodal dataset for autonomous driving
Holger Caesar,Varun Bankiti,Alex H. Lang,Sourabh Vora,Venice Erin Liong,Qiang Xu,Anush Krishnan,Yu Pan,Giancarlo Baldan,Oscar Beijbom +9 more
TL;DR: nuScenes is the first dataset to carry the full autonomous vehicle sensor suite: 6 cameras, 5 radars and 1 lidar, all with full 360 degree field of view.
3.7K