TL;DR: A physics-informed neural network for cardiac activation mapping that accounts for the underlying wave propagation dynamics and quantifies the epistemic uncertainty associated with these predictions to open the door toward physics-based electro-anatomic mapping.
Abstract: A critical procedure in diagnosing atrial fibrillation is the creation of electro-anatomic activation maps. Current methods generate these mappings from interpolation using a few sparse data points recorded inside the atria; they neither include prior knowledge of the underlying physics nor uncertainty of these recordings. Here we propose a physics-informed neural network for cardiac activation mapping that accounts for the underlying wave propagation dynamics and we quantify the epistemic uncertainty associated with these predictions. These uncertainty estimates not only allow us to quantify the predictive error of the neural network, but also help to reduce it by judiciously selecting new informative measurement locations via active learning. We illustrate the potential of our approach using a synthetic benchmark problem and a personalized electrophysiology model of the left atrium. We show that our new method outperforms linear interpolation and Gaussian process regression for the benchmark problem and linear interpolation at clinical densities for the left atrium. In both cases, the active learning algorithm achieves lower error levels than random allocation. Our findings open the door towards physics-based electro-anatomic mapping with the ultimate goals to reduce procedural time and improve diagnostic predictability for patients affected by atrial fibrillation. Open source code is available at https://github.com/fsahli/EikonalNet.
TL;DR: A new warping module named Adaptive Collaboration of Flows (AdaCoF), which estimates both kernel weights and offset vectors for each target pixel to synthesize the output frame and introduces dual-frame adversarial loss which is applicable only to video frame interpolation tasks.
Abstract: Video frame interpolation is one of the most challenging tasks in video processing research. Recently, many studies based on deep learning have been suggested. Most of these methods focus on finding locations with useful information to estimate each output pixel using their own frame warping operations. However, many of them have Degrees of Freedom (DoF) limitations and fail to deal with the complex motions found in real world videos. To solve this problem, we propose a new warping module named Adaptive Collaboration of Flows (AdaCoF). Our method estimates both kernel weights and offset vectors for each target pixel to synthesize the output frame. AdaCoF is one of the most generalized warping modules compared to other approaches, and covers most of them as special cases of it. Therefore, it can deal with a significantly wide domain of complex motions. To further improve our framework and synthesize more realistic outputs, we introduce dual-frame adversarial loss which is applicable only to video frame interpolation tasks. The experimental results show that our method outperforms the state-of-the-art methods for both fixed training set environments and the Middlebury benchmark. Our source code is available at https://github.com/HyeongminLEE/AdaCoF-pytorch
TL;DR: Softmax splatting is proposed to resolve, in a differentiable way, the conflict of mapping multiple source pixels to the same target location during forward warping; a synthesis network then predicts the interpolation result from the warped representations.
Abstract: Differentiable image sampling in the form of backward warping has seen broad adoption in tasks like depth estimation and optical flow prediction. In contrast, how to perform forward warping has seen less attention, partly due to additional challenges such as resolving the conflict of mapping multiple pixels to the same target location in a differentiable way. We propose softmax splatting to address this paradigm shift and show its effectiveness on the application of frame interpolation. Specifically, given two input frames, we forward-warp the frames and their feature pyramid representations based on an optical flow estimate using softmax splatting. In doing so, the softmax splatting seamlessly handles cases where multiple source pixels map to the same target location. We then use a synthesis network to predict the interpolation result from the warped representations. Our softmax splatting allows us to not only interpolate frames at an arbitrary time but also to fine tune the feature pyramid and the optical flow. We show that our synthesis approach, empowered by softmax splatting, achieves new state-of-the-art results for video frame interpolation.
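The collision handling described in this abstract can be illustrated with a minimal 1D sketch. It is not the authors' implementation: the function name `softmax_splat_1d`, the per-pixel `importance` scores, and the linear (rather than 2D bilinear) splatting kernel are all assumptions made for illustration. Each source pixel is pushed to its flow target, and pixels landing on the same target are blended with softmax weights derived from `importance`.

```python
import numpy as np

def softmax_splat_1d(values, flow, importance, eps=1e-12):
    """Forward-warp a 1D signal; colliding pixels are blended by softmax weights.

    All names here are hypothetical; this is a 1D toy, not the paper's 2D operator.
    """
    values = np.asarray(values, dtype=float)
    flow = np.asarray(flow, dtype=float)
    importance = np.asarray(importance, dtype=float)
    n = len(values)
    numer = np.zeros(n)
    denom = np.zeros(n)
    # Subtracting the maximum keeps exp() stable; the shift cancels in the ratio.
    weights = np.exp(importance - importance.max())
    targets = np.arange(n) + flow
    left = np.floor(targets).astype(int)
    frac = targets - left
    # Linear splatting: each source pixel contributes to its two nearest targets.
    for offset, kernel in ((0, 1.0 - frac), (1, frac)):
        idx = left + offset
        valid = (idx >= 0) & (idx < n)
        np.add.at(numer, idx[valid], (weights * kernel * values)[valid])
        np.add.at(denom, idx[valid], (weights * kernel)[valid])
    out = np.zeros(n)
    hit = denom > eps          # targets that received no pixel stay at zero (holes)
    out[hit] = numer[hit] / denom[hit]
    return out

# Pixels 0 and 1 both land on target 1; pixel 1 has the higher importance.
warped = softmax_splat_1d([10.0, 20.0, 30.0, 40.0],
                          flow=[1.0, 0.0, 0.0, 0.0],
                          importance=[0.0, 2.0, 0.0, 0.0])
```

In this toy case the value at target 1 is pulled toward the more important source pixel, and target 0, which no pixel maps to, is left as a hole.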
TL;DR: A one-stage space-time video super-resolution framework is proposed that directly synthesizes an HR slow-motion video from an LFR, LR video and uses a deformable ConvLSTM to align and aggregate temporal information simultaneously, to better leverage global temporal contexts.
Abstract: In this paper, we explore the space-time video super-resolution task, which aims to generate a high-resolution (HR) slow-motion video from a low frame rate (LFR), low-resolution (LR) video. A simple solution is to split it into two sub-tasks: video frame interpolation (VFI) and video super-resolution (VSR). However, temporal interpolation and spatial super-resolution are interrelated in this task, and two-stage methods cannot fully exploit this natural property. In addition, state-of-the-art VFI or VSR networks require a large frame-synthesis or reconstruction module for predicting high-quality video frames, which makes two-stage methods large and therefore slow. To overcome these problems, we propose a one-stage space-time video super-resolution framework, which directly synthesizes an HR slow-motion video from an LFR, LR video. Rather than synthesizing missing LR video frames as VFI networks do, we first temporally interpolate LR frame features for the missing LR video frames, capturing local temporal contexts, with the proposed feature temporal interpolation network. Then, we propose a deformable ConvLSTM to align and aggregate temporal information simultaneously to better leverage global temporal contexts. Finally, a deep reconstruction network is adopted to predict HR slow-motion video frames. Extensive experiments on benchmark datasets demonstrate that the proposed method not only achieves better quantitative and qualitative performance but also is more than three times faster than recent two-stage state-of-the-art methods, e.g., DAIN+EDVR and DAIN+RBPN.
TL;DR: In the synthetic case study, RFSI performed similarly to inverse distance weighting and RFsp while kriging was the most accurate technique; in the precipitation and temperature case studies, RFSI mostly outperformed regression kriging, inverse distance weighting, random forest, and RFsp, and it was substantially faster than RFsp.
Abstract: For many decades, kriging and deterministic interpolation techniques, such as inverse distance weighting and nearest neighbour interpolation, have been the most popular spatial interpolation techniques. Kriging with external drift and regression kriging have become basic techniques that benefit both from spatial autocorrelation and covariate information. More recently, machine learning techniques, such as random forest and gradient boosting, have become increasingly popular and are now often used for spatial interpolation. Some attempts have been made to explicitly take the spatial component into account in machine learning, but so far, none of these approaches have taken the natural route of incorporating the nearest observations and their distances to the prediction location as covariates. In this research, we explored the value of including observations at the nearest locations and their distances from the prediction location by introducing Random Forest Spatial Interpolation (RFSI). We compared RFSI with deterministic interpolation methods, ordinary kriging, regression kriging, Random Forest and Random Forest for spatial prediction (RFsp) in three case studies. The first case study made use of synthetic data, i.e., simulations from normally distributed stationary random fields with a known semivariogram, for which ordinary kriging is known to be optimal. The second and third case studies evaluated the performance of the various interpolation methods using daily precipitation data for the 2016–2018 period in Catalonia, Spain, and mean daily temperature for the year 2008 in Croatia. Results of the synthetic case study showed that RFSI outperformed most simple deterministic interpolation techniques and had similar performance as inverse distance weighting and RFsp. As expected, kriging was the most accurate technique in the synthetic case study. In the precipitation and temperature case studies, RFSI mostly outperformed regression kriging, inverse distance weighting, random forest, and RFsp. Moreover, RFSI was substantially faster than RFsp, particularly when the training dataset was large and high-resolution prediction maps were made.
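The central idea of RFSI, using the nearest observations and their distances as covariates, can be sketched as a feature-building step. This is only an illustration under assumptions: the function name `rfsi_features`, the column layout, and the `exclude_self` flag are hypothetical, and the resulting matrix would still need to be passed to a random forest implementation of the reader's choice.

```python
import numpy as np

def rfsi_features(obs_xy, obs_values, query_xy, k=3, exclude_self=False):
    """Build rows [value_1, dist_1, ..., value_k, dist_k] for each query location.

    Hypothetical helper: the paper's exact covariate layout is not given here.
    """
    obs_xy = np.asarray(obs_xy, dtype=float)
    obs_values = np.asarray(obs_values, dtype=float)
    query_xy = np.asarray(query_xy, dtype=float)
    diffs = query_xy[:, None, :] - obs_xy[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))        # shape (n_query, n_obs)
    if exclude_self:
        # Training case: queries are the observations, so skip each point itself.
        np.fill_diagonal(dists, np.inf)
    order = np.argsort(dists, axis=1)[:, :k]         # indices of the k nearest
    near_d = np.take_along_axis(dists, order, axis=1)
    near_v = obs_values[order]
    features = np.empty((len(query_xy), 2 * k))
    features[:, 0::2] = near_v                       # even columns: observed values
    features[:, 1::2] = near_d                       # odd columns: distances
    return features

stations = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
rainfall = [1.0, 2.0, 3.0, 4.0]
train_X = rfsi_features(stations, rainfall, stations, k=2, exclude_self=True)
query_X = rfsi_features(stations, rainfall, [[0.0, 0.5]], k=2)
```

Excluding each training point from its own neighbour list is a design choice assumed here so that the target value does not leak into its own covariates.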
TL;DR: It is proved that PointMixup finds the shortest path between two point clouds and that the interpolation is assignment invariant and linear, which allows strong interpolation-based regularizers such as mixup and manifold mixup to be introduced to the point cloud domain.
Abstract: This paper introduces data augmentation for point clouds by interpolation between examples. Data augmentation by interpolation has been shown to be a simple and effective approach in the image domain. Such a mixup is, however, not directly transferable to point clouds, as we do not have a one-to-one correspondence between the points of two different objects. In this paper, we define data augmentation between point clouds as a shortest path linear interpolation. To that end, we introduce PointMixup, an interpolation method that generates new examples through an optimal assignment of the path function between two point clouds. We prove that our PointMixup finds the shortest path between two point clouds and that the interpolation is assignment invariant and linear. With the definition of interpolation, PointMixup allows strong interpolation-based regularizers such as mixup and manifold mixup to be introduced to the point cloud domain. Experimentally, we show the potential of PointMixup for point cloud classification, especially when examples are scarce, as well as increased robustness to noise and geometric transformations to points. The code for PointMixup and the experimental details are publicly available at https://github.com/yunlu-chen/PointMixup/.
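A rough sketch of interpolation through an optimal assignment is given below. It assumes equally sized clouds and a Euclidean cost solved with SciPy's `linear_sum_assignment`; the function name `point_mixup` and these choices are illustrative assumptions, not the released PointMixup code.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def point_mixup(cloud_a, cloud_b, lam):
    """Interpolate two equally sized point clouds along an optimal one-to-one matching.

    Hypothetical sketch: Euclidean cost and the Hungarian-style solver are assumptions.
    """
    cloud_a = np.asarray(cloud_a, dtype=float)
    cloud_b = np.asarray(cloud_b, dtype=float)
    # Pairwise distances between every point of A and every point of B.
    cost = np.linalg.norm(cloud_a[:, None, :] - cloud_b[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)   # minimum-cost one-to-one assignment
    # Linear interpolation between matched pairs; lam=0 returns cloud A.
    return (1.0 - lam) * cloud_a[rows] + lam * cloud_b[cols]

a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
b = np.array([[1.0, 0.0, 0.1], [0.0, 0.0, 0.1]])
mixed = point_mixup(a, b, lam=0.5)
```

Because the matching is computed before interpolating, the order in which points are stored in either cloud does not change the mixed cloud as a set, which is the practical meaning of assignment invariance in this sketch.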
TL;DR: A deep-learning-based video interpolation algorithm based on bilateral motion estimation is proposed, which warps the two input frames using the estimated bilateral motions and combines the warped frames using dynamic blending filters to generate intermediate frames.
Abstract: Video interpolation increases the temporal resolution of a video sequence by synthesizing intermediate frames between two consecutive frames. We propose a novel deep-learning-based video interpolation algorithm based on bilateral motion estimation. First, we develop the bilateral motion network with the bilateral cost volume to estimate bilateral motions accurately. Then, we approximate bi-directional motions to predict a different kind of bilateral motions. We then warp the two input frames using the estimated bilateral motions. Next, we develop the dynamic filter generation network to yield dynamic blending filters. Finally, we combine the warped frames using the dynamic blending filters to generate intermediate frames. Experimental results show that the proposed algorithm outperforms the state-of-the-art video interpolation algorithms on several benchmark datasets. The source codes and pre-trained models are available at https://github.com/JunHeum/BMBC.
TL;DR: Biomedisa can drastically reduce both the time and human effort required to segment large images, and achieves a significant improvement over the conventional approach of densely pre-segmented slices with subsequent morphological interpolation as well as compared to segmentation tools that also consider the underlying image data.
Abstract: We present Biomedisa, a free and easy-to-use open-source online platform developed for semi-automatic segmentation of large volumetric images. The segmentation is based on a smart interpolation of sparsely pre-segmented slices taking into account the complete underlying image data. Biomedisa is particularly valuable when little a priori knowledge is available, e.g. for the dense annotation of the training data for a deep neural network. The platform is accessible through a web browser and requires no complex and tedious configuration of software and model parameters, thus addressing the needs of scientists without substantial computational expertise. We demonstrate that Biomedisa can drastically reduce both the time and human effort required to segment large images. It achieves a significant improvement over the conventional approach of densely pre-segmented slices with subsequent morphological interpolation as well as compared to segmentation tools that also consider the underlying image data. Biomedisa can be used for different 3D imaging modalities and various biomedical applications.
TL;DR: Empirical Bayesian kriging (EBK) is a fast and reliable solution for both automatic and interactive data interpolation and can be used for interpolation of very large datasets, up to billions of points.
TL;DR: A lower bound on the fundamental generalization (mean-squared) error of any interpolating solution in the presence of noise is shown to decay to zero with the number of features, so overparameterization can be beneficial in ensuring harmless interpolation of noise.
Abstract: A continuing mystery in understanding the empirical success of deep neural networks is their ability to achieve zero training error and generalize well, even when the training data is noisy and there are more parameters than data points. We investigate this overparameterized regime in linear regression, where all solutions that minimize training error interpolate the data, including noise. We lower-bound the fundamental generalization (mean-squared) error of any interpolating solution in the presence of noise, and show that this bound decays to zero with the number of features. Thus, overparameterization can be beneficial in ensuring harmless interpolation of noise. We discuss two root causes for poor generalization that are complementary in nature – signal “bleeding” into a large number of alias features, and overfitting of noise by parsimonious feature selectors. For the sparse linear model with noise, we provide a hybrid interpolating scheme that mitigates both these issues and achieves order-optimal MSE over all possible interpolating solutions.
TL;DR: An effective event-driven video deblurring and interpolation algorithm based on deep convolutional neural networks (CNNs) that achieves superior performance against state-of-the-art methods on both synthetic and real datasets is proposed.
Abstract: Event-based sensors, which have a response if the change of pixel intensity exceeds a triggering threshold, can capture high-speed motion with microsecond accuracy. Assisted by an event camera, we can generate high frame-rate sharp videos from low frame-rate blurry ones captured by an intensity camera. In this paper, we propose an effective event-driven video deblurring and interpolation algorithm based on deep convolutional neural networks (CNNs). Motivated by the physical model that the residuals between a blurry image and sharp frames are the integrals of events, the proposed network uses events to estimate the residuals for the sharp frame restoration. As the triggering threshold varies spatially, we develop an effective method to estimate dynamic filters to solve this problem. To utilize the temporal information, the sharp frames restored from the previous blurry frame are also considered. The proposed algorithm achieves superior performance against state-of-the-art methods on both synthetic and real datasets.
TL;DR: The proposed architecture makes use of 3D space-time convolutions to enable end to end learning and inference for the task of video frame interpolation and can serve as a useful self-supervised pretext task for action recognition, optical flow estimation, and motion magnification.
Abstract: A majority of methods for video frame interpolation compute bidirectional optical flow between adjacent frames of a video, followed by a suitable warping algorithm to generate the output frames. However, approaches relying on optical flow often fail to model occlusions and complex non-linear motions directly from the video and introduce additional bottlenecks unsuitable for widespread deployment. We address these limitations with FLAVR, a flexible and efficient architecture that uses 3D space-time convolutions to enable end-to-end learning and inference for video frame interpolation. Our method efficiently learns to reason about non-linear motions, complex occlusions and temporal abstractions, resulting in improved performance on video interpolation, while requiring no additional inputs in the form of optical flow or depth maps. Due to its simplicity, FLAVR can deliver 3x faster inference speed compared to the current most accurate method on multi-frame interpolation without losing interpolation accuracy. In addition, we evaluate FLAVR on a wide range of challenging settings and consistently demonstrate superior qualitative and quantitative results compared with prior methods on various popular benchmarks including Vimeo-90K, UCF101, DAVIS, Adobe, and GoPro. Finally, we demonstrate that FLAVR for video frame interpolation can serve as a useful self-supervised pretext task for action recognition, optical flow estimation, and motion magnification.
TL;DR: An efficient approach for estimating the directions-of-arrival (DOAs) of coherent signals using coprime arrays is devised: coprime array interpolation yields an augmented ULA, and a Toeplitz matrix formed from the correlation information of the interpolated ULA theoretical outputs is recovered by solving a nuclear norm minimization problem.
Abstract: In this letter, we devise an efficient approach for estimating the directions-of-arrival (DOAs) of coherent signals using coprime arrays. Specifically, we firstly derive an augmented uniform linear array (ULA) via coprime array interpolation. Subsequently, we define a Toeplitz matrix that is formed using the correlation information of the interpolated ULA theoretical outputs, and recover the Toeplitz matrix by solving a nuclear norm minimization problem. After the low-rank Toeplitz matrix is recovered, the coherent signals are well resolved by the MUSIC algorithm. Simulation results demonstrate the advantages of the proposed approach over various existing methods when dealing with coherent signals.
TL;DR: This work performs seismic trace interpolation using a convolutional autoencoder (CAE); because complete shot gathers are rare in field data applications, the network trained on synthetic data is used to initialize the network training on field data, a transfer learning strategy.
Abstract: Seismic trace interpolation is an important technique because irregular or insufficient sampling data along the spatial direction may lead to inevitable errors in multiple suppression, imag...
TL;DR: In this article, the authors consider the estimation of a signal from the knowledge of its noisy linear random Gaussian projections and show that approximate message-passing is optimal outside of the so-called hard phase, and that for spatially coupled systems it always reaches the minimal-mean-square error.
Abstract: We consider the estimation of a signal from the knowledge of its noisy linear random Gaussian projections. A few examples where this problem is relevant are compressed sensing, sparse superposition codes, and code division multiple access. There have been a number of works considering the mutual information for this problem using the replica method from statistical physics. Here we put these considerations on a firm rigorous basis. First, we show, using a Guerra-Toninelli type interpolation, that the replica formula yields an upper bound to the exact mutual information. Secondly, for many relevant practical cases, we present a converse lower bound via a method that uses spatial coupling, state evolution analysis and the I-MMSE theorem. This yields a single letter formula for the mutual information and the minimal-mean-square error for random Gaussian linear estimation of all discrete bounded signals. In addition, we prove that the low complexity approximate message-passing algorithm is optimal outside of the so-called hard phase, in the sense that it asymptotically reaches the minimal-mean-square error. In this work spatial coupling is used primarily as a proof technique. However, our results also prove two important features of spatially coupled noisy linear random Gaussian estimation. First, there is no algorithmically hard phase. This means that for such systems approximate message-passing always reaches the minimal-mean-square error. Secondly, in the limit of an infinitely long coupled chain, the mutual information associated with spatially coupled systems is the same as that of uncoupled linear random Gaussian estimation.
TL;DR: An approach to anomalous sound detection is proposed in which the model takes as input multiple frames of a spectrogram with the center frame removed and predicts an interpolation of the removed frame as output.
Abstract: As the labor force decreases, the demand for labor-saving automatic anomalous sound detection technology for the maintenance of industrial equipment has grown. Conventional approaches detect anomalies based on the reconstruction errors of an autoencoder. However, when the target machine sound is non-stationary, the reconstruction error tends to be large regardless of whether an anomaly is present, and its variation increases because of the difficulty of predicting the edge frames. To solve this issue, we propose an approach to anomalous sound detection in which the model takes as input multiple frames of a spectrogram with the center frame removed and predicts an interpolation of the removed frame as output. Because it avoids predicting the edge frames, the proposed approach makes the reconstruction error consistent with the anomaly. Experimental results showed that the proposed approach achieved a 27% improvement based on the standard AUC score, especially against non-stationary machinery sounds.
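The input and target construction described in this abstract can be sketched as follows. The helper names `center_frame_pairs` and `anomaly_score`, the flattening of the context frames, and the mean-squared-error score are assumptions for illustration; the abstract does not specify the network or the exact scoring.

```python
import numpy as np

def center_frame_pairs(spectrogram, context=2):
    """Slide a window over a (n_frames, n_bins) spectrogram and remove its center frame.

    Returns flattened context frames as inputs and the removed frames as targets.
    """
    spectrogram = np.asarray(spectrogram, dtype=float)
    width = 2 * context + 1
    inputs, targets = [], []
    for start in range(spectrogram.shape[0] - width + 1):
        window = spectrogram[start:start + width]
        targets.append(window[context])                  # frame the model must predict
        neighbours = np.delete(window, context, axis=0)  # frames on both sides
        inputs.append(neighbours.ravel())
    return np.array(inputs), np.array(targets)

def anomaly_score(predicted, target):
    """Per-window interpolation error; a larger error suggests an anomaly (assumed MSE)."""
    return np.mean((predicted - target) ** 2, axis=1)

spec = np.arange(12, dtype=float).reshape(6, 2)   # toy spectrogram: 6 frames, 2 bins
X, y = center_frame_pairs(spec, context=1)
# Stand-in for a trained model: average the two neighbouring frames.
baseline = 0.5 * (X[:, :2] + X[:, 2:])
scores = anomaly_score(baseline, y)
```

The averaging baseline only stands in for a trained network so that the scoring step can be run end to end.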
TL;DR: This work established a model architecture with randomly sampled data as input and corresponding complete data as output, which was based on an encoder-decoder-style U-Net convolutional neural network and successfully applied the model to regularly missing data reconstruction, although it was trained with irregularly sampled data only.
Abstract: Deep learning (DL) is a powerful tool for mining features from data, which can theoretically avoid assumptions (e.g., linear events) constraining conventional interpolation methods. Motivated by this and inspired by image-to-image translation, we applied DL to irregularly and regularly missing data reconstruction with the aim of transforming incomplete data into corresponding complete data. To accomplish this, we established a model architecture with randomly sampled data as input and corresponding complete data as output, which was based on an encoder-decoder-style U-Net convolutional neural network. We carefully prepared the training data using synthetic and field seismic data. We used a mean-squared-error loss function and an Adam optimizer to train the network. We displayed the feature maps for a randomly sampled data set going through the trained model with the aim of explaining how the missing data are reconstructed. We benchmarked the method on several typical datasets for irregularly missing data reconstruction, where it achieved better performance than a peer-reviewed Fourier transform interpolation method, verifying the effectiveness, superiority, and generalization capability of our approach. Because regularly missing data are a special case of irregularly missing data, we successfully applied the model to regularly missing data reconstruction, although it was trained with irregularly sampled data only.
TL;DR: A quantitative theory for the double descent of test error in the so-called lazy learning regime of neural networks is developed by considering the problem of learning a high-dimensional function with random features regression, and it is shown that the bias displays a phase transition at the interpolation threshold, beyond which it remains constant.
Abstract: Deep neural networks can achieve remarkable generalization performances while interpolating the training data perfectly. Rather than the U-curve emblematic of the bias-variance trade-off, their test error often follows a "double descent" - a mark of the beneficial role of overparametrization. In this work, we develop a quantitative theory for this phenomenon in the so-called lazy learning regime of neural networks, by considering the problem of learning a high-dimensional function with random features regression. We obtain a precise asymptotic expression for the bias-variance decomposition of the test error, and show that the bias displays a phase transition at the interpolation threshold, beyond which it remains constant. We disentangle the variances stemming from the sampling of the dataset, from the additive noise corrupting the labels, and from the initialization of the weights. Following up on Geiger et al. 2019, we first show that the latter two contributions are the crux of the double descent: they lead to the overfitting peak at the interpolation threshold and to the decay of the test error upon overparametrization. We then quantify how they are suppressed by ensemble averaging the outputs of K independently initialized estimators. When K is sent to infinity, the test error remains constant beyond the interpolation threshold. We further compare the effects of overparametrizing, ensembling and regularizing. Finally, we present numerical experiments on classic deep learning setups to show that our results hold qualitatively in realistic lazy learning scenarios.
TL;DR: In this paper, an interpolation method based on the denoising convolutional neural network (CNN) is developed as a simple and efficient way to address the problem of seismic data interpolation.
Abstract: We have developed an interpolation method based on the denoising convolutional neural network (CNN) for seismic data. It provides a simple and efficient way to break through the problem of ...
TL;DR: This paper presents the optimization of the inverse distance weighting method (IDW) in the process of creating a digital terrain model (DTM) of the seabed based on bathymetric data collected using a multibeam echosounder (MBES).
Abstract: This paper presents the optimization of the inverse distance weighting method (IDW) in the process of creating a digital terrain model (DTM) of the seabed based on bathymetric data collected using a multibeam echosounder (MBES). There are many different methods for processing irregular measurement data into a grid-based DTM, and the most popular of these methods are inverse distance weighting (IDW), nearest neighbour (NN), moving average (MA) and kriging (K). Kriging is often considered one of the best methods in interpolation of heterogeneous spatial data, but its use is burdened by a significantly long calculation time. In contrast, the MA method is the fastest, but the calculated models are less accurate. Between them is the IDW method, which gives satisfactory accuracy with a reasonable calculation time. In this study, the author optimized the IDW method used in the process of creating a DTM seabed based on measurement points from MBES. The goal of this optimization was to significantly accelerate the calculations, with a possible additional increase in the accuracy of the created model. Several variants of IDW methods were analysed (dependent on the search radius, number of points in the interpolation, power of the interpolation and applied smoothing method). Finally, the author proposed an optimization of the IDW method, which uses a new technique of choosing the nearest points during the interpolation process (named the growing radius). The experiments presented in the paper and the results obtained show the true potential of the IDW optimized method in the case of DTM estimation.
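As a rough illustration of inverse distance weighting with a search radius that grows until enough points are found, consider the sketch below. The abstract does not describe the growing-radius technique in detail, so the growth rule, the parameter names (`min_points`, `radius`, `growth`, `max_radius`), and the function name `idw_growing_radius` are all assumptions rather than the author's algorithm.

```python
import numpy as np

def idw_growing_radius(points, depths, query, power=2.0, min_points=3,
                       radius=1.0, growth=2.0, max_radius=1e6):
    """Estimate a depth at `query` by IDW over points inside a growing search radius.

    Hypothetical sketch: the radius is multiplied by `growth` until at least
    `min_points` soundings fall inside it (or `max_radius` is reached).
    """
    points = np.asarray(points, dtype=float)
    depths = np.asarray(depths, dtype=float)
    dists = np.linalg.norm(points - np.asarray(query, dtype=float), axis=1)
    exact = dists == 0.0
    if exact.any():
        return float(depths[exact][0])     # query coincides with a sounding
    needed = min(min_points, len(points))
    while np.count_nonzero(dists <= radius) < needed and radius < max_radius:
        radius *= growth
    mask = dists <= radius
    if not mask.any():
        return float("nan")                # nothing within the maximum radius
    weights = 1.0 / dists[mask] ** power   # closer soundings weigh more
    return float(np.sum(weights * depths[mask]) / np.sum(weights))

soundings = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [10.0, 10.0]]
depth_values = [10.0, 20.0, 30.0, 100.0]
estimate = idw_growing_radius(soundings, depth_values, query=[1.0, 0.0])
```

Starting with a small radius keeps the neighbour set local in dense areas, while the growth step guarantees an estimate in sparse areas; a production version would use a spatial index instead of computing all distances for every grid node.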
TL;DR: Results show that the proposed evolutionary paradigm CSM-GASQP is an effective, alternate, accurate, and reliable stochastic numerical solver for stiff nonlinear singular Thomas–Fermi systems.
Abstract: In the present work, a new stochastic computing technique based on the evolutionary cubic spline method (CSM) is introduced for solving the nonlinear singular Thomas–Fermi system arising in atomic physics. Cubic spline interpolation is combined with an evolutionary optimization technique based on genetic algorithms (GAs) hybridized with sequential quadratic programming (SQP) to develop the proposed methodology, CSM-GASQP, which can solve nonlinear differential equations: the GA produces optimized values for the coefficients of the cubic splines, while SQP is used for rapid local refinement. The developed method CSM-GASQP is applied effectively, for different lengths of the splines, to solve the Thomas–Fermi equation for a number of scenarios. Results show that the proposed evolutionary paradigm CSM-GASQP is an effective, alternate, accurate, and reliable stochastic numerical solver for stiff nonlinear singular Thomas–Fermi systems.
TL;DR: This paper proposes to compute features only at sparsely sampled locations, which are probabilistically chosen according to activation responses, and then densely reconstruct the feature map with an efficient interpolation procedure.
Abstract: In the feature maps of CNNs, there commonly exists considerable spatial redundancy that leads to much repetitive processing. Towards reducing this superfluous computation, we propose to compute features only at sparsely sampled locations, which are probabilistically chosen according to activation responses, and then densely reconstruct the feature map with an efficient interpolation procedure. With this sampling-interpolation scheme, our network avoids expending computation on spatial locations that can be effectively interpolated, while being robust to activation prediction errors through broadly distributed sampling. A technical challenge of this sampling-based approach is that the binary decision variables for representing discrete sampling locations are non-differentiable, making them incompatible with backpropagation. To circumvent this issue, we make use of a reparameterization trick based on the Gumbel-Softmax distribution, with which backpropagation can iterate these variables towards binary values. The presented network is experimentally shown to save substantial computation while maintaining accuracy over a variety of computer vision tasks.
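To make the reparameterization step above concrete, here is a minimal sketch of drawing a relaxed (soft) sample from a categorical distribution with the Gumbel-Softmax trick. It uses plain Python floats rather than an autodiff framework, and the function name `gumbel_softmax`, the temperature parameter `tau` and the two-class "sample / skip" usage are assumptions for illustration, not the paper's network code.

```python
import math
import random

def gumbel_softmax(logits, tau=1.0, rng=None):
    """Return a relaxed one-hot sample: softmax((logits + Gumbel noise) / tau).

    Lower `tau` pushes the output closer to a hard one-hot vector.
    """
    rng = rng or random.Random()
    noisy = []
    for logit in logits:
        # Clamp away from 0 so log(u) is finite; random() never returns 1.
        u = max(rng.random(), 1e-12)
        g = -math.log(-math.log(u))  # Gumbel(0, 1) noise
        noisy.append((logit + g) / tau)
    # Numerically stable softmax.
    m = max(noisy)
    exps = [math.exp(v - m) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical usage: a soft decision between "sample" and "skip" for one location.
soft_mask = gumbel_softmax([2.0, 0.0], tau=0.5, rng=random.Random(0))
```

In a differentiable framework the same expression lets gradients flow through the logits, which is what makes the discrete sampling decision trainable.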
TL;DR: The results indicate that the multiscale cellular structures obtained by the proposed method show higher natural frequency compared with the monoscale macrostructural and microstructural designs.
TL;DR: A time-series data augmentation method based on interpolation is proposed; it is robust against impairment of the trend information of the original time series and has low complexity.
TL;DR: This paper proposes a novel approach that is referred to as enhanced deformable separable convolution (EDSC) to estimate not only adaptive kernels, but also offsets, masks and biases to make the network obtain information from non-local neighborhood.
Abstract: Generating non-existing frames from a consecutive video sequence has been an interesting and challenging problem in the video processing field. Typical kernel-based interpolation methods predict pixels with a single convolution process that convolves source frames with spatially adaptive local kernels, which circumvents the time-consuming, explicit motion estimation in the form of optical flow. However, when scene motion is larger than the pre-defined kernel size, these methods are prone to yield less plausible results. In addition, they cannot directly generate a frame at an arbitrary temporal position because the learned kernels are tied to the midpoint in time between the input frames. In this paper, we try to solve these problems and propose a novel non-flow kernel-based approach that we refer to as enhanced deformable separable convolution (EDSC) to estimate not only adaptive kernels, but also offsets, masks and biases to make the network obtain information from non-local neighborhood. During the learning process, different intermediate time steps can be involved as a control variable by means of an extension of the coord-conv trick, allowing the estimated components to vary with different input temporal information. This makes our method capable of producing multiple in-between frames. Furthermore, we investigate the relationships between our method and other typical kernel- and flow-based methods. Experimental results show that our method performs favorably against the state-of-the-art methods across a broad range of datasets. Code will be publicly available.
TL;DR: This work investigates the stability of (discrete) empirical interpolation for nonlinear model reduction and state field approximation from measurements and presents a deterministic sampling strategy that aims to achieve lower approximation errors with fewer points than randomized sampling by taking information about the low-dimensional spaces into account.
Abstract: This work investigates the stability of (discrete) empirical interpolation for nonlinear model reduction and state field approximation from measurements. Empirical interpolation derives approximati...
TL;DR: This paper proposes a pyramid module to cyclically synthesize clear intermediate frames, and an inter-pyramid recurrent module to connect sequential models to exploit the temporal relationship.
Abstract: Existing works reduce motion blur and up-convert frame rate in two separate ways: frame deblurring and frame interpolation. However, few studies have approached the joint video enhancement problem, namely synthesizing high-frame-rate clear results from low-frame-rate blurry inputs. In this paper, we propose a blurry video frame interpolation method to reduce motion blur and up-convert frame rate simultaneously. Specifically, we develop a pyramid module to cyclically synthesize clear intermediate frames. The pyramid module features adjustable spatial receptive field and temporal scope, thus contributing to controllable computational complexity and restoration ability. Besides, we propose an inter-pyramid recurrent module to connect sequential models to exploit the temporal relationship. The pyramid module integrates a recurrent module and can thus iteratively synthesize temporally smooth results without significantly increasing the model size. Extensive experimental results demonstrate that our method performs favorably against state-of-the-art methods. The source code and pre-trained model are available at https://github.com/laomao0/BIN.
TL;DR: The key idea to make this workable is a NN that already knows the "basic tricks" of graphics in a hard-coded and differentiable form, leading to a compact set of trainable parameters and hence real-time navigation in view, time and illumination.
Abstract: We suggest representing an X-Field, a set of 2D images taken across different view, time or illumination conditions (i.e., video, lightfield, reflectance fields or combinations thereof), by learning a neural network (NN) to map their view, time or light coordinates to 2D images. Executing this NN at new coordinates results in joint view, time or light interpolation. The key idea to make this workable is a NN that already knows the "basic tricks" of graphics (lighting, 3D projection, occlusion) in a hard-coded and differentiable form. The NN represents the input to that rendering as an implicit map which, for any view, time, or light coordinate and for any pixel, can quantify how it will move if view, time or light coordinates change (Jacobian of pixel position with respect to view, time, illumination, etc.). Our X-Field representation is trained for one scene within minutes, leading to a compact set of trainable parameters and hence real-time navigation in view, time and illumination.