Enhancing Event-based Structured Light Imaging with a Single Frame

doi:10.1109/MFI55806.2022.9913845

Proceedings Article

Huijiao Wang, +5 more

- 20 Sep 2022

pp 1-7

TL;DR: This paper proposes a Multi-Modal Feature Fusion Network (MFFN), consisting of a feature fusion module and an upscale module, that simultaneously fuses events with a single intensity frame, suppresses event perturbations, and reconstructs a high-quality depth surface.

Abstract: Thanks to their extremely low latency, event cameras have been used in Structured Light Imaging (SLI) to estimate depth surfaces. However, existing methods focus only on improving scanning speed and neglect the perturbations that event noise and timestamp jitter introduce into depth estimation. In this paper, we build a hybrid SLI system equipped with an event camera, a high-resolution frame camera, and a digital light projector, in which a single intensity frame serves as guidance to enhance the quality of event-based SLI. To this end, we propose a Multi-Modal Feature Fusion Network (MFFN), consisting of a feature fusion module and an upscale module, that simultaneously fuses events with a single intensity frame, suppresses event perturbations, and reconstructs a high-quality depth surface. To train MFFN, we build a new dataset, Structured Light Imaging based on Event and Frame cameras (EF-SLI), collected with the hybrid SLI system. It contains paired inputs, each composed of a set of synchronized events and a single corresponding frame, together with ground-truth references obtained by a high-quality SLI approach. Experiments demonstrate that the proposed MFFN outperforms state-of-the-art event-based SLI approaches in accuracy at different scanning speeds.
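The paired input described in the abstract (a set of events plus one intensity frame, with a ground-truth depth reference) can be pictured with a small sketch. This is not the authors' MFFN, which is a learned network whose layers the abstract does not specify. It only illustrates, under stated assumptions, how a low-resolution event stream might be aligned with a higher-resolution frame before any fusion. The `EFSLISample` container, its field names, the `(x, y, t, polarity)` event layout, and the nearest-neighbour upsampling are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed event layout: (x, y, timestamp in seconds, polarity).
Event = Tuple[int, int, float, int]


@dataclass
class EFSLISample:
    """Hypothetical container for one paired sample; field names are assumptions."""
    events: List[Event]           # event stream at event-camera resolution
    frame: List[List[float]]      # single intensity frame at higher resolution
    depth_gt: List[List[float]]   # ground-truth depth at frame resolution


def timestamp_map(events, width, height):
    """Keep the latest timestamp per pixel; pixels without events stay at 0.0."""
    tmap = [[0.0] * width for _ in range(height)]
    for x, y, t, _polarity in events:
        if 0 <= x < width and 0 <= y < height:
            tmap[y][x] = max(tmap[y][x], t)
    return tmap


def upsample_nearest(grid, scale):
    """Nearest-neighbour upsampling by an integer factor (a stand-in, not a learned upscale module)."""
    out = []
    for row in grid:
        wide = [v for v in row for _ in range(scale)]
        out.extend([list(wide) for _ in range(scale)])
    return out


def stack_inputs(sample, ev_width, ev_height, scale):
    """Return a grid of (timestamp, intensity) pairs at frame resolution."""
    up = upsample_nearest(timestamp_map(sample.events, ev_width, ev_height), scale)
    if len(up) != len(sample.frame) or len(up[0]) != len(sample.frame[0]):
        raise ValueError("frame size must equal event size times scale")
    return [
        [(t, i) for t, i in zip(t_row, f_row)]
        for t_row, f_row in zip(up, sample.frame)
    ]


# Tiny usage example: a 2x1 event sensor paired with a 4x2 frame (scale 2).
sample = EFSLISample(
    events=[(0, 0, 0.001, 1), (1, 0, 0.002, 1), (0, 0, 0.003, -1)],
    frame=[[0.1] * 4 for _ in range(2)],
    depth_gt=[[1.0] * 4 for _ in range(2)],
)
stacked = stack_inputs(sample, ev_width=2, ev_height=1, scale=2)
```

In a learned pipeline the stacked grid would be replaced by feature maps from each modality. The sketch only makes the resolution mismatch between the two sensors concrete.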

Citations

Journal Article • doi:10.48550/arxiv.2403.07326

SGE: Structured Light System Based on Gray Code with an Event Camera

Xingyu Lu, +4 more

- 12 Mar 2024

- arXiv.org

TL;DR: A high-speed structured light system based on Gray code and an event camera achieves accurate depth estimation with minimal data redundancy and high acquisition speed.
