TL;DR: ENDscript 2, a major upgrade of the Web server, has been fully re-engineered to enhance speed, accuracy and usability with interactive 3D visualization, and it builds on ESPript 3, which handles large amounts of data with reduced computation time.
Abstract: ENDscript 2 is a friendly Web server for extracting and rendering a comprehensive analysis of primary to quaternary protein structure information in an automated way. This major upgrade has been fully re-engineered to enhance speed, accuracy and usability with interactive 3D visualization. It takes advantage of the new version 3 of ESPript, our well-known sequence alignment renderer, improved to handle large amounts of data with reduced computation time. From a single PDB entry or file, ENDscript produces high quality figures displaying multiple sequence alignment of proteins homologous to the query, colored according to residue conservation. Furthermore, the experimental secondary structure elements and a detailed set of relevant biophysical and structural data are depicted. All this information and more are now mapped on interactive 3D PyMOL representations. Thanks to its adaptive and rigorous algorithm, beginner to expert users can modify settings to fine-tune ENDscript to their needs. ENDscript has also been upgraded as an open platform for the visualization of multiple biochemical and structural data coming from external biotool Web servers, with both 2D and 3D representations. ENDscript 2 and ESPript 3 are freely available at http://endscript.ibcp.fr and http://espript.ibcp.fr, respectively.
TL;DR: This work describes a publicly available OpenDR framework that makes it easy to express a forward graphics model and then automatically obtain derivatives with respect to the model parameters and to optimize over them and demonstrates the power and simplicity of programming with OpenDR by using it to solve the problem of estimating human body shape from Kinect depth and RGB data.
Abstract: Inverse graphics attempts to take sensor data and infer 3D geometry, illumination, materials, and motions such that a graphics renderer could realistically reproduce the observed scene. Renderers, however, are designed to solve the forward process of image synthesis. To go in the other direction, we propose an approximate differentiable renderer (DR) that explicitly models the relationship between changes in model parameters and image observations. We describe a publicly available OpenDR framework that makes it easy to express a forward graphics model and then automatically obtain derivatives with respect to the model parameters and to optimize over them. Built on a new auto-differentiation package and OpenGL, OpenDR provides a local optimization method that can be incorporated into probabilistic programming frameworks. We demonstrate the power and simplicity of programming with OpenDR by using it to solve the problem of estimating human body shape from Kinect depth and RGB data.
TL;DR: This paper proposes using depth maps for object detection and designs a 3D detector to overcome the major difficulties for recognition, namely variations in texture, illumination, shape, viewpoint, clutter, occlusion, self-occlusion and sensor noise.
Abstract: The depth information of RGB-D sensors has greatly simplified some common challenges in computer vision and enabled breakthroughs for several tasks. In this paper, we propose to use depth maps for object detection and design a 3D detector to overcome the major difficulties for recognition, namely the variations of texture, illumination, shape, viewpoint, clutter, occlusion, self-occlusion and sensor noise. We take a collection of 3D CAD models and render each CAD model from hundreds of viewpoints to obtain synthetic depth maps. For each depth rendering, we extract features from the 3D point cloud and train an Exemplar-SVM classifier. During testing and hard-negative mining, we slide a 3D detection window in 3D space. Experimental results show that our 3D detector significantly outperforms the state-of-the-art algorithms for both RGB and RGB-D images, and achieves about a 1.7× improvement in average precision compared to DPM and R-CNN. All source code and data are available online.
TL;DR: The design goals and software architecture of Embree are described, and it is shown that for secondary rays in particular, the performance is competitive with (and often higher than) existing state-of-the-art methods on CPUs and GPUs.
Abstract: We describe Embree, an open source ray tracing framework for x86 CPUs. Embree is explicitly designed to achieve high performance in professional rendering environments in which complex geometry and incoherent ray distributions are common. Embree consists of a set of low-level kernels that maximize utilization of modern CPU architectures, and an API which enables these kernels to be used in existing renderers with minimal programmer effort. In this paper, we describe the design goals and software architecture of Embree, and show that for secondary rays in particular, the performance of Embree is competitive with (and often higher than) existing state-of-the-art methods on CPUs and GPUs.
TL;DR: A 3D reconstruction and visualization system for automatically producing clean and well-regularized texture-mapped 3D models for large indoor scenes, from ground-level photographs and 3D laser points, with a new algorithm called “inverse constructive solid geometry (CSG)” for reconstructing a scene with a CSG representation consisting of volumetric primitives.
Abstract: Virtual exploration tools for large indoor environments (e.g. museums) have so far been limited to either blueprint-style 2D maps that lack photo-realistic views of scenes, or ground-level image-to-image transitions, which are immersive but ill-suited for navigation. On the other hand, photorealistic aerial maps would be a useful navigational guide for large indoor environments, but it is impossible to directly acquire photographs covering a large indoor environment from aerial viewpoints. This paper presents a 3D reconstruction and visualization system for automatically producing clean and well-regularized texture-mapped 3D models for large indoor scenes, from ground-level photographs and 3D laser points. The key component is a new algorithm called "inverse constructive solid geometry (CSG)" for reconstructing a scene with a CSG representation consisting of volumetric primitives, which imposes powerful regularization constraints. We also propose several novel techniques to adjust the 3D model to make it suitable for rendering the 3D maps from aerial viewpoints. The visualization system enables users to easily browse a large-scale indoor environment from a bird's-eye view, locate specific room interiors, fly into a place of interest, view immersive ground-level panorama views, and zoom out again, all with seamless 3D transitions. We demonstrate our system on various museums, including the Metropolitan Museum of Art in New York City--one of the largest art galleries in the world.
TL;DR: MVE is an end-to-end multi-view geometry reconstruction software which takes photos of a scene as input and produces a surface triangle mesh as result, and provides a graphical user interface for structure-from-motion reconstruction, visual inspection of images, depth maps, and rendering of scenes and meshes.
Abstract: We present MVE, the Multi-View Environment. MVE is an end-to-end multi-view geometry reconstruction software which takes photos of a scene as input and produces a surface triangle mesh as result. The system covers a structure-from-motion algorithm, multi-view stereo reconstruction, generation of extremely dense point clouds, and reconstruction of surfaces from point clouds. In contrast to most image-based geometry reconstruction approaches, our system is focused on reconstruction of multi-scale scenes, an important aspect in many areas such as cultural heritage. It allows reconstructing large datasets containing some detailed regions with much higher resolution than the rest of the scene. Our system provides a graphical user interface for structure-from-motion reconstruction, visual inspection of images, depth maps, and rendering of scenes and meshes.
TL;DR: This work presents a method for converting first-person videos, for example, captured with a helmet camera during activities such as rock climbing or bicycling, into hyper-lapse videos, i.e., time-lapse videos with a smoothly moving camera.
Abstract: We present a method for converting first-person videos, for example, captured with a helmet camera during activities such as rock climbing or bicycling, into hyper-lapse videos, i.e., time-lapse videos with a smoothly moving camera. At high speed-up rates, simple frame sub-sampling coupled with existing video stabilization methods does not work, because the erratic camera shake present in first-person videos is amplified by the speed-up. Our algorithm first reconstructs the 3D input camera path as well as dense, per-frame proxy geometries. We then optimize a novel camera path for the output video that passes near the input cameras while ensuring that the virtual camera looks in directions that can be rendered well from the input. Finally, we generate the novel smoothed, time-lapse video by rendering, stitching, and blending appropriately selected source frames for each output frame. We present a number of results for challenging videos that cannot be processed using traditional techniques.
TL;DR: This paper describes the design and implementation of MixFab, a mixed-reality environment for personal fabrication that lowers the barrier for users to engage in personal fabrication, and describes a user study evaluating the system's prototype.
Abstract: Personal fabrication machines, such as 3D printers and laser cutters, are becoming increasingly ubiquitous. However, designing objects for fabrication still requires 3D modeling skills, thereby rendering such technologies inaccessible to a wide user-group. In this paper, we introduce MixFab, a mixed-reality environment for personal fabrication that lowers the barrier for users to engage in personal fabrication. Users design objects in an immersive augmented reality environment, interact with virtual objects in a direct gestural manner and can introduce existing physical objects effortlessly into their designs. We describe the design and implementation of MixFab, a user-defined gesture study that informed this design, show artifacts designed with the system and describe a user study evaluating the system's prototype.
TL;DR: This work introduces the transient path integral framework, formally describing light transport in transient state, proposes a novel density estimation technique that allows reusing sampled paths to reconstruct time-resolved radiance, and devises new sampling strategies that take into account the distribution of radiance along time in participating media.
Abstract: Recent advances in ultra-fast imaging have triggered many promising applications in graphics and vision, such as capturing transparent objects, estimating hidden geometry and materials, or visualizing light in motion. There is, however, very little work regarding the effective simulation and analysis of transient light transport, where the speed of light can no longer be considered infinite. We first introduce the transient path integral framework, formally describing light transport in transient state. We then analyze the difficulties arising when considering the light's time-of-flight in the simulation (rendering) of images and videos. We propose a novel density estimation technique that allows reusing sampled paths to reconstruct time-resolved radiance, and devise new sampling strategies that take into account the distribution of radiance along time in participating media. We then efficiently simulate time-resolved phenomena (such as caustic propagation, fluorescence or temporal chromatic dispersion), which can help design future ultra-fast imaging devices using an analysis-by-synthesis approach, as well as to achieve a better understanding of the nature of light transport.
TL;DR: In this paper, a method for rendering images on a head mounted display (HMD) is presented, which includes operations for tracking, with one or more first cameras inside the HMD, the gaze of a user and for tracking motion of the head-mounted display.
Abstract: Methods, systems, and computer programs are presented for rendering images on a head mounted display (HMD). One method includes operations for tracking, with one or more first cameras inside the HMD, the gaze of a user and for tracking motion of the HMD. The motion of the HMD is tracked by analyzing images of the HMD taken with a second camera that is not in the HMD. Further, the method includes an operation for predicting the motion of the gaze of the user based on the gaze and the motion of the HMD. Rendering policies for a plurality of regions, defined on a view rendered by the HMD, are determined based on the predicted motion of the gaze. The images are rendered on the view based on the rendering policies.
TL;DR: In this article, a method for in-vehicle dynamic virtual reality includes receiving vehicle data from one or more vehicle systems of a vehicle, wherein the vehicle data includes vehicle dynamics data and receiving user data from a virtual reality device.
Abstract: A method for in-vehicle dynamic virtual reality includes receiving vehicle data from one or more vehicle systems of a vehicle, wherein the vehicle data includes vehicle dynamics data and receiving user data from a virtual reality device. The method includes generating a virtual view based on the vehicle data, the user data and a virtual world model, the virtual world model including one or more components that define the virtual view, wherein generating the virtual view includes augmenting one or more components of the virtual world model according to at least one of the vehicle data and the user data and rendering the virtual view to an output device by controlling the output device to update display of the virtual view according to the vehicle dynamics data.
TL;DR: Experiments show that the method learns to make multiple predictions that are marginally relevant and can effectively select an accurate prediction, and outperforms the state-of-the-art discriminative approach for camera relocalization.
Abstract: We address the problem of estimating the pose of a camera relative to a known 3D scene from a single RGB-D frame. We formulate this problem as inversion of the generative rendering procedure, i.e., we want to find the camera pose corresponding to a rendering of the 3D scene model that is most similar to the observed input. This is a non-convex optimization problem with many local optima. We propose a hybrid discriminative-generative learning architecture that consists of: (i) a set of M predictors which generate M camera pose hypotheses, and (ii) a 'selector' or 'aggregator' that infers the best pose from the multiple pose hypotheses based on a similarity function. We are interested in predictors that not only produce good hypotheses but also hypotheses that are different from each other. Thus, we propose and study methods for learning 'marginally relevant' predictors, and compare their performance when used with different selection procedures. We evaluate our method on a recently released 3D reconstruction dataset with challenging camera poses and scene variability. Experiments show that our method learns to make multiple predictions that are marginally relevant and can effectively select an accurate prediction. Furthermore, our method outperforms the state-of-the-art discriminative approach for camera relocalization.
TL;DR: A method for resampling the texture models so they can be rendered at a sampling rate other than the 10 kHz used when recording data, to increase the adaptability and utility of HaTT.
Abstract: This paper introduces the Penn Haptic Texture Toolkit (HaTT), a publicly available repository of haptic texture models for use by the research community. HaTT includes 100 haptic texture and friction models, the recorded data from which the models were made, images of the textures, and the code and methods necessary to render these textures using an impedance-type haptic interface such as a SensAble Phantom Omni. This paper reviews our previously developed methods for modeling haptic virtual textures, describes our technique for modeling Coulomb friction between a tooltip and a surface, discusses the adaptation of our rendering methods for display using an impedance-type haptic device, and provides an overview of the information included in the toolkit. Each texture and friction model was based on a ten-second recording of the force, speed, and high-frequency acceleration experienced by a handheld tool moved by an experimenter against the surface in a natural manner. We modeled each texture's recorded acceleration signal as a piecewise autoregressive (AR) process and stored the individual AR models in a Delaunay triangulation as a function of the force and speed used when recording the data. To increase the adaptability and utility of HaTT, we developed a method for resampling the texture models so they can be rendered at a sampling rate other than the 10 kHz used when recording data. Measurements of the user's instantaneous normal force and tangential speed are used to synthesize texture vibrations in real time. These vibrations are transformed into a texture force vector that is added to the friction and normal force vectors for display to the user.
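The AR-based vibration synthesis mentioned in this abstract can be illustrated with a minimal sketch. It assumes a single fixed AR model driven by white Gaussian noise, rather than HaTT's set of models indexed by force and speed, and the function name `synthesize_ar`, its parameters, and the sign convention x[t] = sum_k a[k] * x[t-1-k] + e[t] are hypothetical choices for illustration, not the toolkit's code.

```python
import random


def synthesize_ar(coeffs, noise_std, n_samples, seed=0):
    """Generate samples from x[t] = sum_k coeffs[k] * x[t-1-k] + e[t].

    coeffs    -- AR coefficients, most recent lag first (assumed convention)
    noise_std -- standard deviation of the white Gaussian excitation e[t]
    """
    rng = random.Random(seed)
    history = [0.0] * len(coeffs)  # past outputs, most recent first
    out = []
    for _ in range(n_samples):
        excitation = rng.gauss(0.0, noise_std)
        sample = sum(a * x for a, x in zip(coeffs, history)) + excitation
        out.append(sample)
        # Shift the history so the new sample becomes lag 1.
        history = [sample] + history[:-1]
    return out


if __name__ == "__main__":
    vibration = synthesize_ar([0.5, -0.2], noise_std=1.0, n_samples=5)
    print(vibration)
```

In a real-time renderer the coefficients and noise level would be updated as the measured force and speed change; here they stay constant for the whole signal.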
TL;DR: In this article, a method for receiving location data of a monitoring device when carried by a user and receiving motion data of the monitoring device is described. The method includes processing the received motion data to identify a group of motion data having a substantially common characteristic and processing the location data for the group of the motion data.
Abstract: A method includes receiving location data of a monitoring device when carried by a user and receiving motion data of the monitoring device. The motion data is associated with a time of occurrence and the location data. The method includes processing the received motion data to identify a group of the motion data having a substantially common characteristic and processing the location data for the group of the motion data. The group of motion data, by way of processing the location data, provides an activity identifier. The motion data includes metric data that identifies characteristics of the motion data. The method includes transferring the activity identifier and the characteristics of the motion data to a screen of a device for display. The activity identifier is a graphical user interface that receives an input for rendering more or less of the characteristics of the motion data.
TL;DR: 4D Video Textures introduce a novel representation for rendering video‐realistic interactive character animation from a database of 4D actor performance captured in a multiple camera studio that achieves >90% reduction in size and halves the rendering cost.
Abstract: 4D Video Textures (4DVT) introduce a novel representation for rendering video-realistic interactive character animation from a database of 4D actor performance captured in a multiple camera studio. 4D performance capture reconstructs dynamic shape and appearance over time but is limited to free-viewpoint video replay of the same motion. Interactive animation from 4D performance capture has so far been limited to surface shape only. 4DVT is the final piece in the puzzle enabling video-realistic interactive animation through two contributions: a layered view-dependent texture map representation which supports efficient storage, transmission and rendering from multiple view video capture; and a rendering approach that combines multiple 4DVT sequences in a parametric motion space, maintaining video quality rendering of dynamic surface appearance whilst allowing high-level interactive control of character motion and viewpoint. 4DVT is demonstrated for multiple characters and evaluated both quantitatively and through a user-study which confirms that the visual quality of captured video is maintained. The 4DVT representation achieves >90% reduction in size and halves the rendering cost.
TL;DR: An in-depth security analysis of GPUs is performed to detect security vulnerabilities, and attack methods are proposed for revealing a victim program's data kept in GPU memory both during its execution and right after its termination.
Abstract: Graphics processing units (GPUs) are important components of modern computing devices for not only graphics rendering, but also efficient parallel computations. However, their security problems are ignored despite their importance and popularity. In this paper, we first perform an in-depth security analysis on GPUs to detect security vulnerabilities. We observe that contemporary, widely-used GPUs, both NVIDIA's and AMD's, do not initialize newly allocated GPU memory pages which may contain sensitive user data. By exploiting such vulnerabilities, we propose attack methods for revealing a victim program's data kept in GPU memory both during its execution and right after its termination. We further show the high applicability of the proposed attacks by applying them to the Chromium and Firefox web browsers which use GPUs for accelerating webpage rendering. We detect that both browsers leave rendered webpage textures in GPU memory, so that we can infer which web pages a victim user has visited by analyzing the remaining textures. The accuracy of our advanced inference attack that uses both pixel sequence matching and RGB histogram matching is up to 95.4%.
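To make the RGB histogram matching idea in this abstract concrete, here is a minimal sketch. The abstract does not say which similarity measure the authors use, so histogram intersection is an assumption, and the names `rgb_histogram`, `histogram_intersection`, and `best_match` are hypothetical. Pixels are plain `(r, g, b)` tuples with 8-bit channels.

```python
def rgb_histogram(pixels, bins_per_channel=4):
    """Return a normalized joint RGB histogram as a flat list."""
    if not pixels:
        raise ValueError("pixels must not be empty")
    n = bins_per_channel
    counts = [0] * (n ** 3)
    for r, g, b in pixels:
        # Map each 0..255 channel value to a bin index in 0..n-1.
        rb, gb, bb = r * n // 256, g * n // 256, b * n // 256
        counts[(rb * n + gb) * n + bb] += 1
    total = len(pixels)
    return [c / total for c in counts]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def best_match(leftover_pixels, candidates):
    """Return the name of the candidate whose histogram is most similar."""
    target = rgb_histogram(leftover_pixels)
    return max(
        candidates,
        key=lambda name: histogram_intersection(
            target, rgb_histogram(candidates[name])
        ),
    )


if __name__ == "__main__":
    white = [(255, 255, 255)] * 4
    black = [(0, 0, 0)] * 4
    leftover = white[:3] + black[:1]  # mostly white texture remnant
    print(best_match(leftover, {"page_a": white, "page_b": black}))
```

A histogram ignores pixel order, which is presumably why the abstract pairs it with pixel sequence matching; this sketch covers only the histogram half.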
TL;DR: In this article, a recently patented rendering based on silica-aerogels is used for exterior thermal insulation applications, which can be used for retrofitting existing buildings since it has a high insulation performance and its application is easy, compatible with the traditional masonry facades, and using the well-known ordinary techniques.
TL;DR: This survey reviews and classifies the existing techniques for advanced volumetric illumination based on their technical realization, their performance behaviour and their perceptual capabilities, and defines future challenges in the area of interactive advanced volumetric illumination.
Abstract: Interactive volume rendering in its standard formulation has become an increasingly important tool in many application domains. In recent years several advanced volumetric illumination techniques to be used in interactive scenarios have been proposed. These techniques claim to have perceptual benefits as well as being capable of producing more realistic volume rendered images. Naturally, they cover a wide spectrum of illumination effects, including varying shading and scattering effects. In this survey, we review and classify the existing techniques for advanced volumetric illumination. The classification will be conducted based on their technical realization, their performance behaviour as well as their perceptual capabilities. Based on the limitations revealed in this review, we will define future challenges in the area of interactive advanced volumetric illumination.
TL;DR: In this article, techniques for specifying audio rendering information in a bitstream are described, and a device configured to generate the bitstream may perform various aspects of the techniques, such as identifying an audio renderer used when generating the multi-channel audio content.
Abstract: In general, techniques are described for specifying audio rendering information in a bitstream. A device configured to generate the bitstream may perform various aspects of the techniques. The bitstream generation device may comprise one or more processors configured to specify audio rendering information that includes a signal value identifying an audio renderer used when generating the multi-channel audio content. A device configured to render multi-channel audio content from a bitstream may also perform various aspects of the techniques. The rendering device may comprise one or more processors configured to determine audio rendering information that includes a signal value identifying an audio renderer used when generating the multi-channel audio content, and render a plurality of speaker feeds based on the audio rendering information.
TL;DR: This work presents a real-time system for six-degrees-of-freedom ego localization that uses only a single monocular camera and a previously computed visual map, and describes a process to automatically extract the ingredients of this map from stereoscopic image sequences.
Abstract: Autonomous and intelligent vehicles will undoubtedly depend on an accurate ego localization solution. Global navigation satellite systems suffer from multipath propagation, rendering this solution insufficient. Herein, we present a real-time system for six-degrees-of-freedom ego localization that uses only a single monocular camera. The camera image is harnessed to yield an ego pose relative to a previously computed visual map. We describe a process to automatically extract the ingredients of this map from stereoscopic image sequences. These include a mapping trajectory relative to the first pose, global scene signatures and local landmark descriptors. The localization algorithm then consists of a topological localization step that completely obviates the need for any global positioning sensors such as GNSS. A metric refinement step that recovers an accurate metric pose is subsequently applied. Metric localization recovers the ego pose in a factor graph optimization process based on local landmarks. We demonstrate centimeter-level accuracy by a set of experiments in an urban environment. To this end, two localization estimates are computed for two independent cameras mounted on the same vehicle. These two independent trajectories are thereafter compared for consistency. Finally, we present qualitative experiments of an augmented reality (AR) system that depends on the aforementioned localization solution. Several screenshots of the AR system are shown confirming centimeter-level accuracy and subdegree angular precision.
TL;DR: An easy‐to‐follow, introductory tutorial of the many‐light theory is given; a comprehensive, unified survey of the topic is provided with a comparison of the main algorithms; limitations regarding materials and light transport phenomena are discussed and a vision to motivate and guide future research is presented.
Abstract: Recent years have seen increasing attention and significant progress in many-light rendering, a class of methods for efficient computation of global illumination. The many-light formulation offers a unified mathematical framework for the problem reducing the full lighting transport simulation to the calculation of the direct illumination from many virtual light sources. These methods are unrivaled in their scalability: they are able to produce plausible images in a fraction of a second but also converge to the full solution over time. In this state-of-the-art report, we give an easy-to-follow, introductory tutorial of the many-light theory; provide a comprehensive, unified survey of the topic with a comparison of the main algorithms; discuss limitations regarding materials and light transport phenomena and present a vision to motivate and guide future research. We will cover both the fundamental concepts as well as improvements, extensions and applications of many-light rendering.
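The core reduction described in this abstract, replacing full light transport with direct illumination from many virtual light sources, can be sketched as a simple gather loop. This is a simplified illustration under stated assumptions: a Lambertian surface, omnidirectional virtual point lights, no visibility (shadow) test, and a clamped squared distance to tame the singularity near a light. The function name `shade_point` and its parameters are hypothetical.

```python
import math


def shade_point(position, normal, albedo, lights, min_dist2=1e-4):
    """Sum the direct contribution of every virtual point light.

    position, normal -- 3-tuples; normal is assumed to be unit length
    lights           -- iterable of (light_position, intensity) pairs
    min_dist2        -- clamp on squared distance (assumed safeguard)
    """
    total = 0.0
    for light_pos, intensity in lights:
        to_light = [l - p for l, p in zip(light_pos, position)]
        dist2 = max(sum(c * c for c in to_light), min_dist2)
        dist = math.sqrt(dist2)
        # Cosine between the surface normal and the direction to the light.
        cos_theta = max(0.0, sum(n * c for n, c in zip(normal, to_light)) / dist)
        total += intensity * cos_theta / dist2
    # Lambertian BRDF is albedo / pi.
    return albedo / math.pi * total


if __name__ == "__main__":
    lights = [((0.0, 0.0, 1.0), 1.0), ((0.0, 0.0, 2.0), 4.0)]
    print(shade_point((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), 1.0, lights))
```

The cost is linear in the number of lights per shaded point, which is the bottleneck that scalable many-light algorithms aim to reduce.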
TL;DR: This report reviews the existing compressed GPU volume rendering approaches, covering sampling grid layouts, compact representation models, compression techniques, GPU rendering architectures and fast decoding techniques.
Abstract: Great advancements in commodity graphics hardware have favoured graphics processing unit (GPU)-based volume rendering as the main adopted solution for interactive exploration of rectilinear scalar volumes on commodity platforms. Nevertheless, long data transfer times and GPU memory size limitations are often the main limiting factors, especially for massive, time-varying or multi-volume visualization, as well as for networked visualization on the emerging mobile devices. To address this issue, a variety of level-of-detail (LOD) data representations and compression techniques have been introduced. In order to improve capabilities and performance over the entire storage, distribution and rendering pipeline, the encoding/decoding process is typically highly asymmetric, and systems should ideally compress at data production time and decompress on demand at rendering time. Compression and LOD pre-computation do not have to adhere to real-time constraints and can be performed off-line for high-quality results. In contrast, adaptive real-time rendering from compressed representations requires fast, transient and spatially independent decompression. In this report, we review the existing compressed GPU volume rendering approaches, covering sampling grid layouts, compact representation models, compression techniques, GPU rendering architectures and fast decoding techniques.
TL;DR: Modeling and experimental results on both ultrasonic and electrostatic surface haptic devices are presented, characterizing their dynamics and their bandwidth for rendering haptic effects.
Abstract: Surface haptic devices modulate the friction between the surface and the fingertip, and can thus be used to create a tactile perception of surface features or textures. We present modeling and experimental results on both ultrasonic and electrostatic surface haptic devices, characterizing their dynamics and their bandwidth for rendering haptic effects.
TL;DR: In this article, synthetic multi-string musical instruments are described in which visual cues on a multi-touch sensitive display guide the user in a performance based on a musical score; alternatively, or in addition, uncued freestyle modes of operation may be provided.
Abstract: Synthetic multi-string musical instruments have been developed for capturing and rendering musical performances on handheld or other portable devices in which a multi-touch sensitive display provides one of the input vectors for an expressive performance by a user or musician. Visual cues may be provided on the multi-touch sensitive display to guide the user in a performance based on a musical score. Alternatively, or in addition, uncued freestyle modes of operation may be provided. In either case, it is not the musical score that drives digital synthesis and audible rendering of the synthetic multi-string musical instrument. Rather, it is the stream of user gestures captured at least in part using the multi-touch sensitive display that drives the digital synthesis and audible rendering.
TL;DR: For search and spatial judgment tasks with isosurface visualization, a stereoscopic display provides better performance, but for tasks with 3D texture-based rendering, displays with higher field of regard were more effective, independent of the levels of the other display components.
Abstract: Volume visualization is an important technique for analyzing datasets from a variety of different scientific domains. Volume data analysis is inherently difficult because volumes are three-dimensional, dense, and unfamiliar, requiring scientists to precisely control the viewpoint and to make precise spatial judgments. Researchers have proposed that more immersive (higher fidelity) VR systems might improve task performance with volume datasets, and significant results tied to different components of display fidelity have been reported. However, more information is needed to generalize these results to different task types, domains, and rendering styles. We visualized isosurfaces extracted from synchrotron microscopic computed tomography (SR-μCT) scans of beetles, in a CAVE-like display. We ran a controlled experiment evaluating the effects of three components of system fidelity (field of regard, stereoscopy, and head tracking) on a variety of abstract task categories that are applicable to various scientific domains, and also compared our results with those from our prior experiment using 3D texture-based rendering. We report many significant findings. For example, for search and spatial judgment tasks with isosurface visualization, a stereoscopic display provides better performance, but for tasks with 3D texture-based rendering, displays with higher field of regard were more effective, independent of the levels of the other display components. We also found that systems with high field of regard and head tracking improve performance in spatial judgment tasks. Our results extend existing knowledge and produce new guidelines for designing VR systems to improve the effectiveness of volume data analysis.
TL;DR: Techniques are described for determining the renderers used to render spherical harmonic coefficients into one or more loudspeaker signals, in which a device determines the local speaker geometry of the playback speakers and configures itself to operate based on that geometry.
Abstract: In general, techniques are described for determining renderers used for rendering spherical harmonic coefficients to generate one or more loudspeaker signals. A device comprising one or more processors may perform the techniques. The one or more processors may be configured to determine a local speaker geometry of one or more speakers used for playback of spherical harmonic coefficients representative of a sound field, and configure the device to operate based on the local speaker geometry.
TL;DR: A novel image-plane adaptive sampling and reconstruction method based on local regression theory that produces more accurate and visually pleasing results over the state-of-the-art techniques across a wide range of rendering effects.
Abstract: Monte Carlo ray tracing is considered one of the most effective techniques for rendering photo-realistic imagery, but requires a large number of ray samples to produce converged or even visually pleasing images. We develop a novel image-plane adaptive sampling and reconstruction method based on local regression theory. A novel local space estimation process is proposed for employing the local regression, by robustly addressing noisy high-dimensional features. Given the local regression on estimated local space, we provide a novel two-step optimization process for selecting bandwidths of features locally in a data-driven way. Local weighted regression is then applied using the computed bandwidths to produce a smooth image reconstruction with well-preserved details. We derive an error analysis to guide our adaptive sampling process at the local space. We demonstrate that our method produces more accurate and visually pleasing results over the state-of-the-art techniques across a wide range of rendering effects. Our method also allows users to employ an arbitrary set of features, including noisy features, and robustly computes a subset of them by ignoring noisy features and decorrelating them for higher quality.
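As an illustration of the local weighted regression step mentioned in the abstract above, here is a minimal one-dimensional sketch. It is a generic locally weighted linear fit with a Gaussian kernel, not the authors' method: the function name `local_linear_fit`, the single scalar feature, and the fixed `bandwidth` are all assumptions made for the example, whereas the paper works in an estimated high-dimensional feature space and selects bandwidths in a data-driven way.

```python
import math


def local_linear_fit(xs, ys, x0, bandwidth):
    """Estimate y at x0 by a Gaussian-weighted least-squares line fit.

    Hypothetical helper for illustration only: samples near x0 get large
    weights, distant samples get small ones, and the intercept of the
    fitted line (centered at x0) is returned as the reconstructed value.
    """
    s0 = s1 = s2 = t0 = t1 = 0.0
    for x, y in zip(xs, ys):
        d = x - x0
        w = math.exp(-0.5 * (d / bandwidth) ** 2)  # Gaussian kernel weight
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * y
        t1 += w * d * y
    if s0 == 0.0:
        raise ValueError("no sample has non-zero weight at x0")
    det = s0 * s2 - s1 * s1
    if abs(det) < 1e-12:
        # Degenerate case (e.g. a single effective sample): weighted mean.
        return t0 / s0
    # Intercept of the weighted normal equations [s0 s1; s1 s2][a b]^T = [t0 t1]^T.
    return (s2 * t0 - s1 * t1) / det


# Usage: noisy-free samples of y = 2x + 1 are reproduced by the local fit.
xs = [float(i) for i in range(11)]
ys = [2.0 * x + 1.0 for x in xs]
estimate = local_linear_fit(xs, ys, 3.5, bandwidth=1.0)
```

A smaller bandwidth preserves detail but passes more noise through; a larger one smooths more. That trade-off is what a per-pixel bandwidth selection would tune.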
TL;DR: An image encoder with a human-operated image grading unit and an automatic grading unit is proposed to let graders produce optimal-looking content of HDR scenes for various rendering displays, together with a new saturation processing strategy for the emerging high dynamic range image handling technology.
Abstract: To allow graders to make optimally looking content of HDR scenes for various rendering displays, we invented an image encoder (202) comprising: an input (240) for a high dynamic range input image (M_HDR); an image grading unit (201) arranged to allow a human color grader to specify a color mapping from a representation (HDR_REP) of the high dynamic range input image defined according to a predefined accuracy, to a low dynamic range image (Im_LDR) by means of a human-determined color mapping algorithm, and arranged to output data specifying the color mapping (Fi(MP_DH)); and an automatic grading unit (203) arranged to derive a second low dynamic range image (GT_IDR) by applying an automatic color mapping algorithm to one of the high dynamic range input image (M_HDR) or the low dynamic range image (Im_LDR). We also describe an interesting new saturation processing strategy useful in the newly emerging high dynamic range image handling technology.
TL;DR: An efficient and scalable method for convolutionally rendering acoustic parameters that generates artifact-free audio even for fast motion and sudden changes in reverberance is introduced, integrated with Unreal Engine 3™.
Abstract: The acoustic wave field in a complex scene is a chaotic 7D function of time and the positions of source and listener, making it difficult to compress and interpolate. This hampers precomputed approaches which tabulate impulse responses (IRs) to allow immersive, real-time sound propagation in static scenes. We code the field of time-varying IRs in terms of a few perceptual parameters derived from the IR's energy decay. The resulting parameter fields are spatially smooth and compressed using a lossless scheme similar to PNG. We show that this encoding removes two of the seven dimensions, making it possible to handle large scenes such as entire game maps within 100MB of memory. Run-time decoding is fast, taking 100μs per source. We introduce an efficient and scalable method for convolutionally rendering acoustic parameters that generates artifact-free audio even for fast motion and sudden changes in reverberance. We demonstrate convincing spatially-varying effects in complex scenes including occlusion/obstruction and reverberation, in our system integrated with Unreal Engine 3™.
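To make the idea of parameters "derived from the IR's energy decay" in the abstract above more concrete, the sketch below computes an energy decay curve by Schroeder backward integration and reads a decay time off it. This is a standard room-acoustics technique chosen for illustration; the abstract does not state which parameters or estimators the authors use, and the function names `energy_decay_curve_db` and `decay_time` are hypothetical.

```python
import math


def energy_decay_curve_db(ir):
    """Schroeder backward integration of an impulse response.

    Returns, for each sample index, the energy remaining from that sample
    to the end of the response, in dB relative to the total energy.
    Trailing samples with zero remaining energy are dropped.
    """
    remaining = []
    total = 0.0
    for sample in reversed(ir):
        total += sample * sample  # accumulate energy from the tail backwards
        remaining.append(total)
    remaining.reverse()
    if not remaining or remaining[0] == 0.0:
        return []  # empty or silent response: no decay curve
    full = remaining[0]
    return [10.0 * math.log10(r / full) for r in remaining if r > 0.0]


def decay_time(ir, sample_rate, drop_db=60.0):
    """Seconds until the decay curve has fallen by drop_db, or None."""
    for i, level in enumerate(energy_decay_curve_db(ir)):
        if level <= -drop_db:
            return i / sample_rate
    return None


# Usage: a synthetic exponentially decaying response (assumed test signal),
# sampled at an assumed 1000 Hz with a 100-sample amplitude time constant.
ir = [math.exp(-n / 100.0) for n in range(5000)]
curve = energy_decay_curve_db(ir)
t60 = decay_time(ir, sample_rate=1000.0)
```

A scalar such as this decay time varies smoothly with listener position in a way the raw impulse response does not, which is the property the abstract relies on for compression.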
TL;DR: In this paper, a multi dynamic environment and location based active augmented reality (AR) system is described, which uses dynamic scanning, active reference marker positioning, inertial measurement, imaging, mapping and rendering to generate an AR for a physical environment.
Abstract: A multi dynamic environment and location based active augmented reality (AR) system is described. The system uses dynamic scanning, active reference marker positioning, inertial measurement, imaging, mapping and rendering to generate an AR for a physical environment. The scanning and imaging are performed from the perspective of a user wearing a head mounted or wearable display in the physical environment.