PixelSNAIL: an Improved Autoregressive Generative Model.
Xi Chen, Nikhil Mishra, Mostafa Rohaninejad, Pieter Abbeel
TL;DR: PixelSNAIL is an autoregressive generative model that combines causal convolutions with self-attention. It reports state-of-the-art log-likelihood results of 2.85 bits per dim on CIFAR-10 and 3.80 bits per dim on $32 \times 32$ ImageNet.
Abstract: Autoregressive generative models consistently achieve the best results in density estimation tasks involving high dimensional data, such as images or audio. They pose density estimation as a sequence modeling task, where a recurrent neural network (RNN) models the conditional distribution over the next element conditioned on all previous elements. In this paradigm, the bottleneck is the extent to which the RNN can model long-range dependencies, and the most successful approaches rely on causal convolutions, which offer better access to earlier parts of the sequence than conventional RNNs. Taking inspiration from recent work in meta reinforcement learning, where dealing with long-range dependencies is also essential, we introduce a new generative model architecture that combines causal convolutions with self attention. In this note, we describe the resulting model and present state-of-the-art log-likelihood results on CIFAR-10 (2.85 bits per dim) and $32 \times 32$ ImageNet (3.80 bits per dim). Our implementation is available at this https URL
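The two ingredients named in the abstract, causal access to earlier elements and self-attention, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`causal_mask`, `causal_self_attention`, `bits_per_dim`), the single-head layout, and the random weight matrices are all assumptions made for illustration. The sketch also assumes the pixels have been flattened in raster-scan order and that the input at position i has already been shifted so it carries no information about element i itself.

```python
import numpy as np

def causal_mask(n):
    """Boolean (n, n) mask: position i may attend to positions j <= i."""
    return np.tril(np.ones((n, n), dtype=bool))

def causal_self_attention(x, wq, wk, wv):
    """Single-head masked self-attention over a flattened sequence.

    x: (n, d) features in raster-scan order (hypothetical layout).
    wq, wk, wv: (d, d) projection matrices (hypothetical parameters).
    """
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Block attention to later positions so each output depends only on
    # the current and earlier elements of the sequence.
    scores = np.where(causal_mask(x.shape[0]), scores, -np.inf)
    # Numerically stable softmax over the allowed positions.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v

def bits_per_dim(nll_nats, num_dims):
    """Convert a total negative log-likelihood in nats to bits per dimension."""
    return nll_nats / (num_dims * np.log(2.0))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 6, 4
    x = rng.normal(size=(n, d))
    wq, wk, wv = (rng.normal(size=(d, d)) for _ in range(3))
    out = causal_self_attention(x, wq, wk, wv)
    print(out.shape)
```

For scale, a CIFAR-10 image has 32 × 32 × 3 = 3072 dimensions, so the reported 2.85 bits per dim corresponds to about 8,755 bits per image.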
Citations
Video Diffusion Models
07 Apr 2022
TL;DR: The authors propose a diffusion model for video generation that naturally extends the standard image diffusion architecture and enables joint training on image and video data, which they find reduces the variance of minibatch gradients and speeds up optimization.
Unsupervised Pansharpening Based on Self-Attention Mechanism
TL;DR: The proposed unsupervised pansharpening method in a deep-learning framework is able to reconstruct sharper MSI of different types, with more details and less spectral distortion compared with the state-of-the-art.
An Introduction to Neural Data Compression
TL;DR: This introduction hopes to fill in the necessary background by reviewing basic coding topics such as entropy coding and rate-distortion theory, related machine learning ideas such as bits-back coding and perceptual metrics, and providing a guide through the representative works in the literature so far.
ShapeFormer: Transformer-based Shape Completion via Sparse Representation
Xingguang Yan, Liqiang Lin, Niloy J. Mitra, Dani Lischinski, Daniel Cohen-Or, Hui Huang
25 Jan 2022
TL;DR: A compact 3D representation, vector quantized deep implicit function (VQDIF), that utilizes spatial sparsity to represent a close approximation of a 3D shape by a short sequence of discrete variables is introduced.
CANet: Co-attention Network for RGB-D Semantic Segmentation
TL;DR: This paper proposes a co-attention network (CANet) to build sound interaction between RGB and depth features. It includes three modules, among them position and channel co-attention fusion modules that adaptively fuse RGB and depth features in the spatial and channel dimensions.