Open Access, Posted Content
Learnable Triangulation of Human Pose
TL;DR: Presents two novel solutions for multi-view 3D human pose estimation based on new learnable triangulation methods that combine 3D information from multiple 2D views. Both are end-to-end differentiable, which allows the target metric to be optimized directly.
Abstract: We present two novel solutions for multi-view 3D human pose estimation based on new learnable triangulation methods that combine 3D information from multiple 2D views. The first (baseline) solution is a basic differentiable algebraic triangulation with an addition of confidence weights estimated from the input images. The second solution is based on a novel method of volumetric aggregation from intermediate 2D backbone feature maps. The aggregated volume is then refined via 3D convolutions that produce final 3D joint heatmaps and allow modelling a human pose prior. Crucially, both approaches are end-to-end differentiable, which allows us to directly optimize the target metric. We demonstrate transferability of the solutions across datasets and considerably improve the multi-view state of the art on the Human3.6M dataset. Video demonstration, annotations and additional materials will be posted on our project page (this https URL).
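The first (baseline) solution in the abstract, algebraic triangulation with per-view confidence weights, can be illustrated with a short NumPy sketch. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function names `triangulate_weighted`, `project`, and `camera`, the toy camera setup, and the hand-set weights are all hypothetical. In the paper the weights are estimated from the input images and the solver runs inside a differentiable framework; here a standard weighted direct linear transform (DLT) solved by SVD stands in for that step.

```python
import numpy as np

def triangulate_weighted(proj_mats, points_2d, weights=None):
    """Weighted DLT triangulation of one 3D point (illustrative sketch).

    proj_mats: (C, 3, 4) camera projection matrices
    points_2d: (C, 2) pixel coordinates of the same joint in each view
    weights:   (C,) per-view confidences; defaults to all ones
    """
    proj_mats = np.asarray(proj_mats, dtype=float)
    points_2d = np.asarray(points_2d, dtype=float)
    n_views = proj_mats.shape[0]
    if weights is None:
        weights = np.ones(n_views)
    weights = np.asarray(weights, dtype=float)

    # Each view gives two homogeneous equations in the unknown point y:
    #   (u * P[2] - P[0]) . y = 0   and   (v * P[2] - P[1]) . y = 0
    A = points_2d[:, :, None] * proj_mats[:, 2:3, :] - proj_mats[:, :2, :]
    # Scale both rows of each view by that view's confidence.
    A = (A * weights[:, None, None]).reshape(2 * n_views, 4)

    # The minimizer of ||A y|| subject to ||y|| = 1 is the right singular
    # vector associated with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    y = vt[-1]
    return y[:3] / y[3]  # homogeneous -> Euclidean

def project(P, X):
    """Project a 3D point X with a 3x4 projection matrix P."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Toy setup (assumed values): shared intrinsics, three translated cameras.
K = np.array([[1000.0, 0.0, 500.0],
              [0.0, 1000.0, 500.0],
              [0.0, 0.0, 1.0]])

def camera(t):
    return K @ np.hstack([np.eye(3), np.reshape(t, (3, 1))])

P = np.stack([camera([0.0, 0.0, 5.0]),
              camera([1.0, 0.0, 5.0]),
              camera([0.0, 1.0, 5.0])])
X_true = np.array([0.2, -0.1, 1.0])
pts = np.stack([project(p, X_true) for p in P])

# With exact detections the point is recovered up to numerical error.
X_hat = triangulate_weighted(P, pts)

# Corrupt the third view, then compare uniform weights against a low
# confidence for the corrupted view.
noisy = pts.copy()
noisy[2] += 50.0
X_uniform = triangulate_weighted(P, noisy)
X_downweighted = triangulate_weighted(P, noisy, weights=[1.0, 1.0, 0.01])
```

Lowering the weight of an unreliable view shrinks its rows in the linear system, so the solution is driven by the remaining views. Because every step is a linear-algebra operation, the same computation can be expressed in an automatic-differentiation framework, which is what makes training the weights end to end possible.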
Citations
•Posted Content
I2L-MeshNet: Image-to-Lixel Prediction Network for Accurate 3D Human Pose and Mesh Estimation from a Single RGB Image
Gyeongsik Moon, Kyoung Mu Lee
TL;DR: I2L-MeshNet predicts the per-lixel likelihood on 1D heatmaps for each mesh vertex coordinate instead of directly regressing the parameters, which preserves the spatial relationship in the input image and models the prediction uncertainty.
•Posted Content
TetraTSDF: 3D human reconstruction from a single image with a tetrahedral outer shell
Hayato Onizuka, Zehra Hayirci, Diego Thomas, Akihiro Sugimoto, Hideaki Uchiyama, Rin-ichiro Taniguchi
TL;DR: Proposes the tetrahedral outer shell volumetric truncated signed distance function (TetraTSDF) model for the human body and a corresponding part connection network (PCN) for 3D human body shape regression; the model is compact, dense, and accurate, yet well suited to CNN-based regression.
Unified End-to-End YOLOv5-HR-TCM Framework for Automatic 2D/3D Human Pose Estimation for Real-Time Applications
TL;DR: A fast, unified end-to-end model for 3D human pose estimation, called YOLOv5-HR-TCM (YOLO v5-HRet-Temporal Convolution Model), based on the 2D-to-3D lifting approach; it handles each step of the estimation process and achieves high accuracy without sacrificing processing speed.
Pixels2Pose: Super-resolution time-of-flight imaging for 3D pose estimation
Alice Ruget, Max Tyler, Germán Mora Martín, Stirling Scholes, Feng Zhou, Istvan Gyongy, Brent Hearn, Stephen McLaughlin, Abderrahim Halimi, Jonathan Leach
TL;DR: A temporal-to-spatial mapping is proposed to increase the resolution of a simple time-of-flight (ToF) depth sensor from 4 × 4 pixels to depth images of 32 × 32 pixels.
PhaseMP: Robust 3D Pose Estimation via Phase-conditioned Human Motion Prior
Mingyi Shi, Sebastian Starke, Yuting Ye, Taku Komura, Jung Won
- 01 Oct 2023
TL;DR: A novel motion prior, called PhaseMP, modeling a probability distribution on pose transitions conditioned by a frequency domain feature extracted from a periodic autoencoder, which can be useful for accurately estimating 3D human motions in the presence of challenging input data, as well as noisy sensor measurements.