Shape-aware Text-driven Layered Video Editing

Journal Article•10.48550/arXiv.2301.13173

Yao-Chih Lee, +3 more

- 30 Jan 2023

- arXiv.org

- Vol. abs/2301.13173

30

TL;DR: The authors propagate the deformation field between the input and edited keyframe to all frames and leverage a pre-trained text-conditioned diffusion model as guidance for refining shape distortion and completing unseen regions.

Citations

Journal Article•10.48550/arXiv.2303.13439

Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators

Levon Khachatryan, +6 more

- 23 Mar 2023

- arXiv.org

TL;DR: Text2Video-Zero is a low-cost approach that leverages existing text-to-image synthesis methods (e.g., Stable Diffusion) and makes them suitable for the video domain.

290

Journal Article•10.48550/arXiv.2303.09535

FateZero: Fusing Attentions for Zero-shot Text-based Video Editing

Chenyang Qi, +6 more

- 16 Mar 2023

- arXiv.org

TL;DR: FateZero is a zero-shot text-based editing method for real-world videos that requires no per-prompt training or use-specific mask; it captures intermediate attention maps during inversion, which effectively retain both structural and motion information.

182

TokenFlow: Consistent Diffusion Features for Consistent Video Editing

19 Jul 2023

TL;DR: TokenFlow harnesses a text-to-image diffusion model for text-driven video editing, generating a high-quality video that adheres to the target text while preserving the spatial layout and motion of the input video.

116

Journal Article•10.48550/arxiv.2310.07204

State of the Art on Diffusion Models for Visual Computing

Ryan Po, +17 more

- 11 Oct 2023

- arXiv.org

TL;DR: This report introduces the basic mathematical concepts of diffusion models and the implementation details and design choices of the popular Stable Diffusion model, and gives an overview of important aspects of these generative AI tools, including personalization, conditioning, and inversion, among others.

63

Journal Article•10.48550/arxiv.2310.10647

A Survey on Video Diffusion Models

Zhen Xing, +7 more

- 16 Oct 2023

- arXiv.org

TL;DR: This paper presents a comprehensive review of video diffusion models in the AIGC era, with a concise introduction to the fundamentals and evolution of diffusion models and an overview of research on diffusion models in the video domain.

41

References

•Posted Content

Denoising Diffusion Probabilistic Models

Jonathan Ho, +2 more

- 19 Jun 2020

- arXiv: Learning

TL;DR: High quality image synthesis results are presented using diffusion probabilistic models, a class of latent variable models inspired by considerations from nonequilibrium thermodynamics, which naturally admit a progressive lossy decompression scheme that can be interpreted as a generalization of autoregressive decoding.

11.7K

•Proceedings Article

Spatial transformer networks

Max Jaderberg, +3 more

- 07 Dec 2015

TL;DR: This work introduces a new learnable module, the Spatial Transformer, which explicitly allows the spatial manipulation of data within the network, and can be inserted into existing convolutional architectures, giving neural networks the ability to actively spatially transform feature maps.

8.5K

Journal Article•10.1109/34.24792

Principal warps: thin-plate splines and the decomposition of deformations

Fred L. Bookstein

- 01 Jun 1989

- IEEE Transactions on Pattern Analysis an...

TL;DR: The decomposition of deformations by principal warps is demonstrated and the method is extended to deal with curving edges between landmarks to aid the extraction of features for analysis, comparison, and diagnosis of biological and medical images.

5.5K

•Posted Content

Denoising Diffusion Implicit Models

Jiaming Song, +2 more

- 06 Oct 2020

- arXiv: Learning

TL;DR: Denoising diffusion implicit models (DDIMs) are presented, a more efficient class of iterative implicit probabilistic models with the same training procedure as DDPMs that can produce high quality samples faster and perform semantically meaningful image interpolation directly in the latent space.

3.8K

•Proceedings Article•10.1109/ICCV.2017.629

StackGAN: Text to Photo-Realistic Image Synthesis with Stacked Generative Adversarial Networks

Han Zhang, +2 more

- 01 Oct 2017

TL;DR: This paper proposes Stacked Generative Adversarial Networks (StackGAN) to generate 256×256 photo-realistic images conditioned on text descriptions and introduces a novel Conditioning Augmentation technique that encourages smoothness in the latent conditioning manifold.

3.6K
