TL;DR: This paper proposes ARCH (Animatable Reconstruction of Clothed Humans), a novel end-to-end framework for accurate reconstruction of animation-ready 3D clothed humans from a monocular image and shows numerous qualitative examples of animated, high-quality reconstructed avatars unseen in the literature so far.
Abstract: In this paper, we propose ARCH (Animatable Reconstruction of Clothed Humans), a novel end-to-end framework for accurate reconstruction of animation-ready 3D clothed humans from a monocular image. Existing approaches to digitize 3D humans struggle to handle pose variations and recover details. Also, they do not produce models that are animation ready. In contrast, ARCH is a learned pose-aware model that produces detailed 3D rigged full-body human avatars from a single unconstrained RGB image. A Semantic Space and a Semantic Deformation Field are created using a parametric 3D body estimator. They allow the transformation of 2D/3D clothed humans into a canonical space, reducing ambiguities in geometry caused by pose variations and occlusions in training data. Detailed surface geometry and appearance are learned using an implicit function representation with spatial local features. Furthermore, we propose additional per-pixel supervision on the 3D reconstruction using opacity-aware differentiable rendering. Our experiments indicate that ARCH increases the fidelity of the reconstructed humans. We obtain more than 50% lower reconstruction errors for standard metrics compared to state-of-the-art methods on public datasets. We also show numerous qualitative examples of animated, high-quality reconstructed avatars unseen in the literature so far.
TL;DR: This work uses deep reinforcement learning to learn controllers that achieve goal-directed movements in data-driven generative models of human movement using autoregressive conditional variational autoencoders, or Motion VAEs.
Abstract: A fundamental problem in computer animation is that of realizing purposeful and realistic human movement given a sufficiently-rich set of motion capture clips. We learn data-driven generative models of human movement using autoregressive conditional variational autoencoders, or Motion VAEs. The latent variables of the learned autoencoder define the action space for the movement and thereby govern its evolution over time. Planning or control algorithms can then use this action space to generate desired motions. In particular, we use deep reinforcement learning to learn controllers that achieve goal-directed movements. We demonstrate the effectiveness of the approach on multiple tasks. We further evaluate system-design choices and describe the current limitations of Motion VAEs.
TL;DR: The authors embed real portrait images in the latent space of StyleGAN, which allows for intuitive editing of the head pose, facial expression, and scene illumination in the image, while maintaining facial integrity.
Abstract: Editing of portrait images is a very popular and important research topic with a large variety of applications. For ease of use, control should be provided via a semantically meaningful parameterization that is akin to computer animation controls. The vast majority of existing techniques do not provide such intuitive and fine-grained control, or only enable coarse editing of a single isolated control parameter. Very recently, high-quality semantically controlled editing has been demonstrated, however only on synthetically created StyleGAN images. We present the first approach for embedding real portrait images in the latent space of StyleGAN, which allows for intuitive editing of the head pose, facial expression, and scene illumination in the image. Semantic editing in parameter space is achieved based on StyleRig, a pretrained neural network that maps the control space of a 3D morphable face model to the latent space of the GAN. We design a novel hierarchical non-linear optimization problem to obtain the embedding. An identity preservation energy term allows spatially coherent edits while maintaining facial integrity. Our approach runs at interactive frame rates and thus allows the user to explore the space of possible edits. We evaluate our approach on a wide set of portrait photos, compare it to the current state of the art, and validate the effectiveness of its components in an ablation study.
TL;DR: This work proposes a 3D-aware generative network along with a hybrid embedding module and a non-linear composition module that achieves controllable, photo-realistic, and temporally coherent talking-head videos with natural head movements.
Abstract: When people deliver a speech, they naturally move their heads, and this rhythmic head motion conveys prosodic information. However, generating a lip-synced video with natural head movement is challenging. While remarkably successful, existing works either generate still talking-face videos or rely on landmark/video frames as sparse/dense mapping guidance to generate head movements, which leads to unrealistic or uncontrollable video synthesis. To overcome these limitations, we propose a 3D-aware generative network along with a hybrid embedding module and a non-linear composition module. Through modeling the head motion and facial expressions explicitly (in our setting, facial expression means facial movement, e.g., blinks and lip and chin movements), manipulating 3D animation carefully, and embedding reference images dynamically, our approach achieves controllable, photo-realistic, and temporally coherent talking-head videos with natural head movements. Thoughtful experiments on several standard benchmarks demonstrate that our method achieves significantly better results than the state-of-the-art methods in both quantitative and qualitative comparisons. The code is available at https://github.com/lelechen63/Talking-head-Generation-with-Rhythmic-Head-Motion.
TL;DR: In this article, a non-linear morphable face model is proposed to generate multifarious face geometry of pore-level resolution, coupled with material attributes for use in physically-based rendering.
Abstract: Based on a combined data set of 4000 high resolution facial scans, we introduce a non-linear morphable face model, capable of producing multifarious face geometry of pore-level resolution, coupled with material attributes for use in physically-based rendering. We aim to maximize the variety of the participant’s face identities, while increasing the robustness of correspondence between unique components, including middle-frequency geometry, albedo maps, specular intensity maps and high-frequency displacement details. Our deep learning based generative model learns to correlate albedo and geometry, which ensures the anatomical correctness of the generated assets. We demonstrate potential use of our generative model for novel identity generation, model fitting, interpolation, animation, high fidelity data visualization, and low-to-high resolution data domain transferring. We hope the release of this generative model will encourage further cooperation between all graphics, vision, and data focused professionals, while demonstrating the cumulative value of every individual’s complete biometric profile.
TL;DR: This work proposes a simple model that can animate any outfit, independently of its topology, vertex order or connectivity, and proposes a methodology to complement supervised learning with an unsupervised physically based learning that implicitly solves collisions and enhances cloth quality.
Abstract: We present a novel solution to the garment animation problem through deep learning. Our contribution allows animating any template outfit with arbitrary topology and geometric complexity. Recent works develop models for garment editing, resizing and animation at the same time by leveraging the support body model (encoding garments as body homotopies). This leads to complex engineering solutions that suffer from limited scalability, applicability and compatibility. By limiting our scope to garment animation only, we are able to propose a simple model that can animate any outfit, independently of its topology, vertex order or connectivity. Our proposed architecture maps outfits to animated 3D models in the standard format for 3D animation (blend weights and blend shapes matrices), automatically providing compatibility with any graphics engine. We also propose a methodology to complement supervised learning with an unsupervised physically based learning that implicitly solves collisions and enhances cloth quality.
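The "standard format" mentioned above (blend weights and blend shapes) is commonly evaluated as blend-shape offsets followed by linear blend skinning. The following is a minimal 2D sketch of that evaluation for a single vertex, not the paper's architecture; the function names, the 2D simplification, and the bone representation (a rotation angle plus a translation) are illustrative assumptions.

```python
import math


def rotate(point, angle):
    """Rotate a 2D point counter-clockwise by `angle` radians about the origin."""
    c, s = math.cos(angle), math.sin(angle)
    x, y = point
    return (c * x - s * y, s * x + c * y)


def skin_vertex(rest, offsets, coeffs, weights, bones):
    """Deform one vertex with blend shapes followed by linear blend skinning.

    rest    -- rest-pose position (x, y)
    offsets -- per-blend-shape displacement (dx, dy) for this vertex
    coeffs  -- one activation coefficient per blend shape
    weights -- one blend weight per bone, expected to sum to 1
    bones   -- per-bone rigid transform as (angle, (tx, ty)); hypothetical format
    """
    # Step 1: add the weighted blend-shape displacements to the rest position.
    x = rest[0] + sum(c * d[0] for c, d in zip(coeffs, offsets))
    y = rest[1] + sum(c * d[1] for c, d in zip(coeffs, offsets))

    # Step 2: transform by every bone and average with the blend weights.
    out_x = out_y = 0.0
    for w, (angle, (tx, ty)) in zip(weights, bones):
        rx, ry = rotate((x, y), angle)
        out_x += w * (rx + tx)
        out_y += w * (ry + ty)
    return (out_x, out_y)


# One blend shape lifts the vertex by 0.5; two bones share it equally.
posed = skin_vertex(
    rest=(1.0, 0.0),
    offsets=[(0.0, 0.5)],
    coeffs=[1.0],
    weights=[0.5, 0.5],
    bones=[(0.0, (0.0, 0.0)), (math.pi / 2, (0.0, 0.0))],
)
```

Because the output is just per-vertex weights and offsets, any engine that implements skinning can play back the result, which is the compatibility argument the abstract makes.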
TL;DR: ScalarFlow is presented, a first large-scale data set of reconstructions of real-world smoke plumes and a framework for accurate physics-based reconstructions from a small number of video streams, using a novel estimation of unseen inflow regions and an efficient regularization scheme.
Abstract: In this paper, we present ScalarFlow, a first large-scale data set of reconstructions of real-world smoke plumes. We additionally propose a framework for accurate physics-based reconstructions from a small number of video streams. Central components of our algorithm are a novel estimation of unseen inflow regions and an efficient regularization scheme. Our data set includes a large number of complex and natural buoyancy-driven flows. The flows transition to turbulent flows and contain observable scalar transport processes. As such, the ScalarFlow data set is tailored towards computer graphics, vision, and learning applications. The published data set will contain volumetric reconstructions of velocity and density, input image sequences, together with calibration data, code, and instructions on how to recreate the commodity hardware capture setup. We further demonstrate one of the many potential application areas: a first perceptual evaluation study, which reveals that the complexity of the captured flows requires a huge simulation resolution for regular solvers in order to recreate at least parts of the natural complexity contained in the captured data.
TL;DR: This article distinguishes the synthesis of isolated signs deprived of any contextual inflections from the generation of full sign language utterances, and constitutes a survey of the challenges specific to sign language avatars.
TL;DR: A system for generating and visualizing interactive 3D Augmented Reality tutorials based on 2D video input, which allows viewpoint control at runtime and develops a presentation system that uses commonly available hardware to make the results accessible for home use.
Abstract: We present a system for generating and visualizing interactive 3D Augmented Reality tutorials based on 2D video input, which allows viewpoint control at runtime. Inspired by assembly planning, we analyze the input video using a 3D CAD model of the object to determine an assembly graph that encodes blocking relationships between parts. Using an assembly graph enables us to detect assembly steps that are otherwise difficult to extract from the video, and generally improves object detection and tracking by providing prior knowledge about movable parts. To avoid information loss, we combine the 3D animation with relevant parts of the 2D video so that we can show detailed manipulations and tool usage that cannot be easily extracted from the video. To further support user orientation, we visually align the 3D animation with the real-world object by using texture information from the input video. We developed a presentation system that uses commonly available hardware to make our results accessible for home use and demonstrate the effectiveness of our approach by comparing it to traditional video tutorials.
TL;DR: This article focuses on the development of a novel machine learning based proof of concept for real-time human pose estimation using data collected from a sparse inertial measurement unit (IMU) system, which is cost-effective and minimally intrusive, in the scope of the skilled crafts domain.
Abstract: With recent advances in various hardware technologies, human motion capturing (MoCap) has gained importance in fields such as computer vision, computer animation, gesture recognition in gaming, and most importantly in bio-mechanical analysis. In this direction, human motion is being captured using various kinds of sensors. Correspondingly, many model-based and data-based techniques have been developed in order to decode sensor readings into information understandable by a person. Current technologies still lack applicability in real-world scenarios with respect to cost and ease of information gathering, which leaves substantial room for improvement. This article focuses on the development of a novel machine learning based proof of concept for real-time human pose estimation using data collected from a sparse inertial measurement unit (IMU) system, which is cost-effective and minimally intrusive, in the scope of the skilled crafts domain. We discuss training diverse bi-directional recurrent neural networks (bi-RNN) with variable window sizes and building an ensemble of these models to estimate human pose, in terms of human joint angles, more accurately and robustly.
TL;DR: It was found that foot tracking, followed by mouth animation and finger tracking, were the features that added the most to the sense of control of a self-representing avatar.
Abstract: We present two experiments to assess the relative impact of different levels of body animation fidelity on plausibility illusion (Psi). The first experiment presents a virtual character that is not controlled by the user (n = 13) while the second experiment presents a user-controlled virtual avatar (n = 24, all male). Psi concerns how realistic and coherent the events in a virtual environment look and feel and is part of Slater's proposition of two orthogonal components of presence in virtual reality (VR). In the experiments, the face, hands, upper body and lower body of the character or self-avatar were manipulated to present different degrees of animation fidelity, such as no animation, procedural animation, and motion captured animation. Participants started the experiment experiencing the best animation configuration. Then, animation features were reduced to limit the amount of captured information made available to the system. Participants had to move from this basic animation configuration towards a more complete one, and declare when the avatar animation realism felt equivalent to the initial and most complete configuration, which could happen before all animation features were maxed out. Participants in the self-avatar experiment were also asked to rate how each animation feature affected their sense of control of the virtual body. We found that a virtual body with upper and lower body animated using eight tracked rigid bodies and inverse kinematics (IK) was often perceived as equivalent to a professional capture pipeline relying on 53 markers. Compared to what standard VR kits in the market are offering, i.e. a tracked headset and two hand controllers, we found that foot tracking, followed by mouth animation and finger tracking, were the features that added the most to the sense of control of a self-representing avatar. In addition, these features were often among the first to be improved in both experiments.
TL;DR: In this article, the authors focused on illustrating potential applications of animations in language learning and education, identifying evidence-based principles for their design and use, and proposing possible research works.
Abstract: The use of computer animation is beneficial in many sectors today. The development of animations has changed over time and with the context of use, and animations can illustrate phenomena and concepts that are difficult to understand. Animations may not always be useful, however, and teachers using animations need to understand their importance. This paper focuses on illustrating potential applications of animations in language learning and education, on identifying evidence-based principles for their design and use, and on proposing possible research works. Animation involves the use of compelling graphics, including images, audio and video, in the form of technology. The use of animations or computers is not limited to education; it extends to the economic, business, and medical sectors. However, there is a great deal of debate about the effectiveness of computer animation in various fields. The general finding is that the role of computer animation is mostly constructive in language learning.
TL;DR: Qualitative evaluation and Online Turing tests demonstrate the efficacy of OneShotA2V, a novel approach to synthesize a talking person video of arbitrary length using as input: an audio signal and a single unseen image of a person.
Abstract: Audio to Video generation is an interesting problem that has numerous applications across industry verticals including film making, multi-media, marketing, education and others. High-quality video generation with expressive facial movements is a challenging problem that involves complex learning steps for generative adversarial networks. Further, enabling one-shot learning for an unseen single image increases the complexity of the problem while simultaneously making it more applicable to practical scenarios. In the paper, we propose a novel approach, OneShotA2V, to synthesize a talking person video of arbitrary length using as input an audio signal and a single unseen image of a person. OneShotA2V leverages curriculum learning to learn movements of expressive facial components and hence generates a high-quality talking head video of the given person. Further, it feeds the features generated from the audio input directly into a generative adversarial network, and it adapts to any given unseen selfie by applying few-shot learning with only a few output update epochs. OneShotA2V leverages an architecture based on a spatially adaptive normalization based multi-level generator and multiple multi-level discriminators. The input audio clip is not restricted to any specific language, which gives the method multilingual applicability. Experimental evaluation demonstrates superior performance of OneShotA2V as compared to Realistic Speech-Driven Facial Animation with GANs (RSDGAN) [43], Speech2Vid [8], and other approaches, on multiple quantitative metrics including SSIM (structural similarity index), PSNR (peak signal to noise ratio) and CPBD (image sharpness). Further, qualitative evaluation and Online Turing tests demonstrate the efficacy of our approach.
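Of the metrics listed above, PSNR has the simplest closed form: 10 * log10(MAX^2 / MSE), where MAX is the largest possible pixel value and MSE is the mean squared error between the two images. The sketch below illustrates the formula on flat lists of 8-bit pixel values; it is not the evaluation code used in the paper, and the function names and toy pixel values are assumptions.

```python
import math


def mse(a, b):
    """Mean squared error between two images given as flat lists of pixel values."""
    if len(a) != len(b) or not a:
        raise ValueError("images must be non-empty and of equal size")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def psnr(a, b, max_value=255.0):
    """Peak signal-to-noise ratio in decibels; infinite for identical images."""
    err = mse(a, b)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)


# Toy 4-pixel "images": every generated pixel is off by 2, so the MSE is 4.
reference = [0, 64, 128, 255]
generated = [2, 62, 130, 253]
score = psnr(reference, generated)
```

Higher values mean the generated frame is closer to the reference. SSIM and CPBD are more involved and are usually taken from an image-processing library rather than reimplemented.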
TL;DR: This paper presents an image animating method for enhancing a single still image in social media with virtual realistic and animated motions without prior information that produces visually natural results while guaranteeing motion harmony between active objects and passive objects.
TL;DR: By mainly focusing on bottle-mouth defect detection, the detection system dedicates more attention to the user and the task, and will eventually yield much better results as a training tool for image-processing education.
Abstract: Background: Machine learning-based beer bottle-defect detection is a complex technology that runs automatically; however, it consumes considerable memory, is expensive, and poses a certain danger when training novice operators. Moreover, some topics are difficult to learn from experimental lectures, such as digital image processing and computer vision. However, virtual simulation experiments have been widely used to good effect within education. A virtual simulation of the design and manufacture of a beer bottle-defect detection system will not only help the students to increase their image-processing knowledge, but also improve their ability to solve complex engineering problems and design complex systems. Methods: The hardware models for the experiment (camera, light source, conveyor belt, power supply, manipulator, and computer) were built using the 3DS MAX modeling and animation software. The Unreal Engine 4 (UE4) game engine was utilized to build a virtual design room, design the interactive operations, and simulate the system operation. Results: The results showed that the virtual-simulation system received much better experimental feedback, which facilitated the design and manufacture of a beer bottle-defect detection system. The specialized functions of the functional modules in the detection system, including a basic experimental operation menu, power switch, image shooting, image processing, and manipulator grasping, allowed students (or virtual designers) to easily build a detection system by retrieving basic models from the model library, and creating the beer-bottle transportation, image shooting, image processing, defect detection, and defective-product removal. The virtual simulation experiment was completed with image processing as the main body. Conclusions: By mainly focusing on bottle-mouth defect detection, the detection system dedicates more attention to the user and the task. With more detailed tasks available, the virtual system will eventually yield much better results as a training tool for image-processing education. In addition, a novel visual perception-thinking pedagogical framework enables better comprehension than the traditional lecture-tutorial style.
TL;DR: This paper examines how the performance of forgery detectors depends on the presence of artefacts that the human eye can see and introduces a new family of detectors that examine combinations of spatial and temporal features and outperform existing approaches both in terms of detection accuracy and generalization.
Abstract: New approaches to synthesize and manipulate face videos at very high quality have paved the way for new applications in computer animation, virtual and augmented reality, or face video analysis. However, there are concerns that they may be used in a malicious way, e.g. to manipulate videos of public figures, politicians or reporters, to spread false information. The research community therefore developed techniques for automated detection of modified imagery, and assembled benchmark datasets showing manipulations by state-of-the-art techniques. In this paper, we contribute to this initiative in two ways: First, we present a new audio-visual benchmark dataset. It shows some of the highest quality visual manipulations available today. Human observers find them significantly harder to identify as forged than videos from other benchmarks. Furthermore, we propose a new family of deep-learning-based fake detectors, demonstrating that existing detectors are not well-suited for detecting fakes of a quality as high as presented in our dataset. Our detectors examine spatial and temporal features. This allows them to outperform existing approaches both in terms of high detection accuracy and generalization to unseen fake generation methods and unseen identities.
TL;DR: An implicit linearized contact model is derived based on a predictor‐corrector approach that leads to consistent behavior with higher‐order integrators as predictors and is well suited for the simulation of stiff, nonlinear materials with the integration methods presented in this paper.
Abstract: Visually appealing and vivid simulations of deformable solids represent an important aspect of physically based computer animation. For the temporal discretization, it is customary in computer animation to use first-order accurate integration methods, such as Backward Euler, due to their simplicity and robustness. Although there is notable research on second-order methods, their use is not widespread. Many of these well-known methods have significant drawbacks such as severe numerical damping or scene-dependent time step restrictions to ensure stability. In this paper, we discuss the most relevant requirements on such methods in computer animation and motivate the interest beyond first-order accuracy. Keeping these requirements in mind, we investigate several promising methods from the families of diagonally implicit Runge-Kutta (DIRK) and Rosenbrock methods which currently do not appear to have considerable popularity in this field. We show that the usage of such methods improves the visual quality of physical animations. In addition, we demonstrate that they allow distinctly more control over damping at lower computational cost than classical methods. As part of our theoretical contribution, we review aspects of simulations that are often considered more intricate with higher-order methods, such as contact handling. To this end, we derive an implicit linearized contact model based on a predictor-corrector approach that leads to consistent behavior with higher-order integrators as predictors. Our contact model is well suited for the simulation of stiff, nonlinear materials with the integration methods presented in this paper and more common methods such as Backward Euler alike.
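The numerical damping attributed above to first-order methods such as Backward Euler can be seen on a unit-mass linear spring, x'' = -k x. The sketch below compares Backward Euler with the implicit midpoint rule, which can be viewed as a one-stage, second-order diagonally implicit Runge-Kutta scheme. The test problem, step size, and choice of the midpoint rule are assumptions made for illustration and are not necessarily among the methods the paper evaluates. Both implicit steps are solved in closed form because the system is linear.

```python
def backward_euler_step(x, v, h, k):
    """One Backward Euler step for x' = v, v' = -k*x (unit mass)."""
    # Solve v_new = v - h*k*(x + h*v_new) for v_new, then update x.
    v_new = (v - h * k * x) / (1.0 + h * h * k)
    x_new = x + h * v_new
    return x_new, v_new


def implicit_midpoint_step(x, v, h, k):
    """One implicit midpoint step for the same system."""
    # Both right-hand sides are evaluated at the average of old and new state.
    a = h * h * k / 4.0
    v_new = (v * (1.0 - a) - h * k * x) / (1.0 + a)
    x_new = x + 0.5 * h * (v + v_new)
    return x_new, v_new


def energy(x, v, k):
    """Total mechanical energy of the unit-mass spring."""
    return 0.5 * v * v + 0.5 * k * x * x


def simulate(step, steps, h=0.1, k=1.0, x=1.0, v=0.0):
    """Advance the oscillator `steps` times and return the final energy."""
    for _ in range(steps):
        x, v = step(x, v, h, k)
    return energy(x, v, k)


e0 = energy(1.0, 0.0, 1.0)
e_backward_euler = simulate(backward_euler_step, 100)
e_midpoint = simulate(implicit_midpoint_step, 100)
```

With these settings Backward Euler loses well over half of the initial energy within 100 steps, while the midpoint rule preserves it up to round-off on this linear problem. That artificial energy loss is the kind of damping that motivates looking beyond first-order accuracy.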
TL;DR: A pipeline for creating a synthetic thermal image dataset is developed and the effectiveness of the approach is evaluated using a number of deep learning algorithms that may enable human-machine interaction such as head pose estimation and face detection.
Abstract: Thermal infrared imaging holds promise for human-machine interaction in vehicles owing to superior performance in low-light and low-visibility conditions, and the potential for monitoring human psycho-physiological state. However, the shortage of large-scale 2D thermal image datasets and public benchmarks has hindered progress of deep-learning-based solutions. To tackle this problem, we develop a pipeline for creating a synthetic thermal image dataset. Firstly, 3D models of human heads are generated from uncalibrated TIR images (without additional visible or depth images) using photogrammetry techniques. A synthetic dataset of 100k images of 640×480 resolution is then generated by rendering each of the five 3D models for a range of head poses, camera positions and backgrounds using commercial animation software. The effectiveness of the approach is evaluated using a number of deep learning algorithms that may enable human-machine interaction such as head pose estimation and face detection. The neural networks are trained on the new synthetic thermal dataset, before fine tuning on real world data where possible.
TL;DR: In this article, a hybrid geometry- and video-based animation approach is proposed to combine the flexibility of classical CG animation with the realism of real captured data, where coarse movements and poses are modeled in the geometry only, while very fine and subtle details in the face, often lacking in purely geometric methods, are captured in video-based textures.
Abstract: In this paper, we present an end-to-end pipeline for the creation of high-quality animatable volumetric video content of human performances. Going beyond the application of free-viewpoint volumetric video, we allow re-animation and alteration of an actor's performance through (i) the enrichment of the captured data with semantics and animation properties and (ii) applying hybrid geometry- and video-based animation methods that allow a direct animation of the high-quality data itself instead of creating an animatable model that resembles the captured data. Semantic enrichment and geometric animation ability are achieved by establishing temporal consistency in the 3D data, followed by an automatic rigging of each frame using a parametric shape-adaptive full human body model. Our hybrid geometry- and video-based animation approaches combine the flexibility of classical CG animation with the realism of real captured data. For pose editing, we exploit the captured data as much as possible and kinematically deform the captured frames to fit a desired pose. Further, we treat the face differently from the body in a hybrid geometry- and video-based animation approach where coarse movements and poses are modeled in the geometry only, while very fine and subtle details in the face, often lacking in purely geometric methods, are captured in video-based textures. These are processed to be interactively combined to form new facial expressions. On top of that, we learn the appearance of regions that are challenging to synthesize, such as the teeth or the eyes, and fill in missing regions realistically in an autoencoder-based approach. This paper covers the full pipeline from capturing and producing high-quality video content, over the enrichment with semantics and deformation properties for re-animation and processing of the data for the final hybrid animation.
TL;DR: This article develops and demonstrates the complete algorithm including the gradient descent algorithm, the density estimation algorithm, image acquisition/processing and physical motion on a simulator built on a 3-D animation software and an experimental testbed.
Abstract: In this article, we address visual surveillance of human activities for a network of cameras with controllable orientations based on gradient-based coverage control techniques. We first formulate the problem as an optimization problem on the matrix manifold $SO(3)$ and then derive the gradient for the cost function using a density function defined on the image plane of each camera. We then develop a real-time density estimation algorithm using computer vision techniques including a real-time pedestrian detection algorithm and examine its real-time feasibility through simulation. We finally demonstrate the complete algorithm including the gradient descent algorithm, the density estimation algorithm, image acquisition/processing and physical motion on a simulator built on a 3-D animation software and an experimental testbed.
TL;DR: An attitude estimation algorithm suited to embedded implementation, with adaptive adjustment based on fuzzy logic, is proposed, and a two-class model of daily human behaviors is designed for experiments on human motion recognition.
Abstract: Motion pose capture technology can effectively solve the problem of difficulty in defining character motion in the process of 3D animation production and greatly reduce the workload of character motion control, thereby improving the efficiency of animation development and the fidelity of character motion. Motion pose capture technology is widely used in virtual reality systems, virtual training grounds, and real-time tracking of the motion trajectories of general objects. This paper proposes an attitude estimation algorithm suited to embedded implementation. The previous centralized Kalman filter is divided into a two-step Kalman filter. The sensors are processed separately according to their different characteristics, to isolate the cross-influence between sensors. An adaptive adjustment method based on fuzzy logic is proposed. The acceleration, angular velocity, and geomagnetic field strength of the environment are used as the inputs of the fuzzy logic to judge the motion state of the carrier and then adjust the covariance matrix of the filter. The adaptive adjustment of the sensor is thus converted into the recognition of the motion state. For the study of human motion pose capture, this paper designs a verification experiment based on the existing robotic arm in the laboratory. The experiment shows that the studied motion pose capture method has better performance. Capture experiments are also designed for human body motion, and the capture results show that the obtained pose angle information can restore the human body motion well. A visual model of human motion pose capture was established, and comparison with the real situation showed that the simulation reproduced the human motion process well. For the study of human motion recognition, this paper designs a two-class model and daily human behaviors for experiments. Experiments show that the two-class human motion capture and recognition achieves good accuracy. SVC performs very well on the two-class recognition task. When all optimization algorithms are used, the accuracy rate is higher than 90%, and the final recognition accuracy rate is also higher than 90%. In terms of recognition time, the time required for human motion capture and recognition is less than 2 s.
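The idea of adjusting the filter covariance according to the recognised motion state can be illustrated with a scalar Kalman filter whose measurement-noise variance depends on a motion membership value. This is a minimal sketch under strong assumptions and not the paper's two-step filter. The random-walk state model, the single membership value in [0, 1], the linear interpolation standing in for the fuzzy rules, and all names and constants are hypothetical.

```python
def kalman_step(x, p, z, q, r):
    """One predict/update cycle of a scalar Kalman filter (random-walk model).

    x, p -- prior state estimate and its variance
    z    -- new measurement
    q, r -- process-noise and measurement-noise variances
    """
    # Predict: the state is assumed constant, so only the uncertainty grows.
    p = p + q
    # Update: blend prediction and measurement using the Kalman gain.
    gain = p / (p + r)
    x = x + gain * (z - x)
    p = (1.0 - gain) * p
    return x, p


def measurement_noise(motion_level, r_still=0.01, r_moving=1.0):
    """Hypothetical stand-in for the fuzzy adjustment of the covariance.

    `motion_level` is a membership degree in [0, 1]: 0 means the carrier is
    at rest (trust the measurement), 1 means strong motion (distrust it).
    """
    m = min(max(motion_level, 0.0), 1.0)
    return (1.0 - m) * r_still + m * r_moving


# The same measurement pulls the estimate much harder when the carrier is still.
x_still, _ = kalman_step(0.0, 1.0, 1.0, 0.0, measurement_noise(0.0))
x_moving, _ = kalman_step(0.0, 1.0, 1.0, 0.0, measurement_noise(1.0))
```

In a full attitude estimator the same principle would apply per sensor: for example, an accelerometer-based tilt measurement would be given a larger variance while the carrier is accelerating.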
TL;DR: A gesture-based natural user interface is a preferred method to control a 3D animation compared to a cursor-based interface; it not only proved to be more efficient but also resulted in a more engaging and enjoyable user experience.
Abstract: This article presents a new natural user interface to control and manipulate a 3D animation using the Kinect. The researchers design a number of gestures that allow the user to play, pause, forward, rewind, scale, and rotate the 3D animation. They also implement a traditional cursor-based interface and compare it with the natural user interface. Both interfaces are extensively evaluated via a user study in terms of both usability and user experience. Through both quantitative and qualitative evaluation, they show that a gesture-based natural user interface is a preferred method to control a 3D animation compared to a cursor-based interface. The natural user interface not only proved to be more efficient but also resulted in a more engaging and enjoyable user experience.
TL;DR: This paper explores the use of 3D model and animation techniques, combined with narrative techniques, for recreating event-based information to aid understanding and indicates that both forms of 3D graphical techniques positively supported users in terms of cognitive load, recall, and engagement over reading text.
Abstract: Three-dimensional rendering technologies have long been utilized for explanatory purposes in scientific visualization and related areas. Their applications to wider fields, however, have often been limited. In this paper, we explore the use of 3D model and animation techniques, combined with narrative techniques, for recreating event-based information to aid understanding. An empirical experiment was conducted which examined the effectiveness of 3D model images and 3D animation videos compared to reading narratives in textual form. The results indicated that both forms of 3D graphical techniques positively supported users in terms of cognitive load, recall, and engagement over reading text.
TL;DR: Using multidimensional scaling, this study analyzed the differences in the facial features of 332 three-dimensional characters from the top-100 grossing animation films at the international box office and indicated that face aspect ratio, nose length, and distance from mouth corner to face edge are three crucial features in character design.
Abstract: The facial features of animation characters convey information about the characters to audiences and make the characters more believable. Using multidimensional scaling, this study analyzed the dif...
TL;DR: This paper deals with 2D image transformations from a perspective of a 3D heterogeneous shape modeling and computer animation.
Abstract: This paper deals with 2D image transformations from a perspective of a 3D heterogeneous shape modeling and computer animation. Shape and image morphing techniques have attracted a lot of attention ...
TL;DR: A method to rapidly convert 3D dynamic graphics produced by 3D animation software into stereoscopic display suitable for the Android platform is presented, with details of an algorithm to generate double-viewpoint image sequences from single-viewpoint 3D dynamic graphics, and a method for compositing stereoscopic displays from double-viewpoint image sequences.
Abstract: With the widespread use of smart terminals, the convenient use of stereoscopic video display on mobile platforms is urgently needed by more and more people. This study presents a method to rapidly convert 3D dynamic graphics produced by 3D animation software into stereoscopic display suitable for the Android platform, with details of an algorithm to generate double-viewpoint image sequences from single-viewpoint 3D dynamic graphics, and a method for compositing stereoscopic display from double-viewpoint image sequences. It develops a program on the basis of popular animation software to implement this method, realizing automatic generation of dynamic 3D graphics and outputting composite images that conform to the binocular characteristics of stereoscopic displays. As shown by experiments, the methods presented in this study produce better results at a faster speed and provide stronger support for the production of high-quality stereo videos.
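The two steps named in the abstract, deriving two viewpoints from one and compositing the resulting image pair, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the viewpoint-generation algorithm or the composite layout, so the symmetric camera offset, the side-by-side layout, and both function names are hypothetical. Images are plain lists of pixel rows.

```python
def stereo_camera_positions(center, right_dir, eye_separation):
    """Derive left and right viewpoints from a single camera position.

    The camera is shifted by half the eye separation along its unit
    right vector in each direction (an assumed, symmetric setup).
    """
    half = eye_separation / 2.0
    left = tuple(c - half * r for c, r in zip(center, right_dir))
    right = tuple(c + half * r for c, r in zip(center, right_dir))
    return left, right


def composite_side_by_side(left_img, right_img):
    """Join two equal-height images (lists of pixel rows) into one frame."""
    if len(left_img) != len(right_img):
        raise ValueError("left and right images must have the same height")
    return [l_row + r_row for l_row, r_row in zip(left_img, right_img)]


# Two viewpoints 64 mm apart around a camera at the origin.
left_cam, right_cam = stereo_camera_positions(
    (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), 0.064
)

# Composite one rendered frame pair; repeating this per frame yields
# a stereoscopic image sequence.
frame = composite_side_by_side([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

Other composite layouts, such as top-and-bottom or column-interleaved, would change only the second function.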
TL;DR: A survey on the role of digital human-like characters in virtual worlds, both as counterparts of real human users and as embodied agents driven by artificial intelligence is presented.
Abstract: As human beings, we are so used to interacting with each other that any world without humans would feel alien to us, including digital ones. In this article, we present a survey on the role of digital human-like characters in virtual worlds, both as counterparts of real human users and as embodied agents driven by artificial intelligence. The main issues related to 3-D graphics, physics, animation, and behavioral modeling are introduced, suggesting, wherever available, different alternatives and related development pipelines. A sizeable list of examples illustrating the use of virtual humans in different application sectors is then presented, focusing in particular on four domains: environmental design, training, cultural heritage, and healthcare.