TL;DR: A predicted hole mapping (PHM) algorithm is presented that requires neither a filling priority nor a smoothing operation and allows parallel computation, which facilitates a real-time 3D conversion system.
Abstract: Three-dimensional (3D) display technologies have made great progress in recent years. View synthesis for 3D content requires hole filling, which is a challenging task. Increases in resolution and in the number of synthesized views bring new challenges in memory and processing speed. A predicted hole mapping (PHM) algorithm is presented that requires neither a filling priority nor a smoothing operation and allows parallel computation, which facilitates a real-time 3D conversion system. In experiments, the proposed PHM is evaluated against several other methods in terms of peak signal-to-noise ratio (PSNR) and the structural similarity index measure (SSIM), and the results show a quantitative advantage. The method can drive a 32-view display with 4K × 2K resolution in real time on a GPU.
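The PSNR metric named in this abstract can be illustrated with a short sketch. This is the standard textbook definition, not code from the paper; the function name `psnr`, the list-of-rows image representation, and the 8-bit peak value of 255 are assumptions made for the example.

```python
import math


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two grayscale images.

    Images are lists of rows of numbers (a hypothetical representation
    chosen to keep the example dependency-free).
    """
    if len(reference) != len(test) or any(
        len(r) != len(t) for r, t in zip(reference, test)
    ):
        raise ValueError("images must have the same shape")
    n = sum(len(r) for r in reference)
    # Mean squared error over all pixels.
    sq_err = sum(
        (a - b) ** 2 for r, t in zip(reference, test) for a, b in zip(r, t)
    )
    mse = sq_err / n
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

A higher PSNR means the synthesized view is closer to the reference view; identical images give an infinite value.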
TL;DR: This thesis proposes a multi-camera depth map estimation ASIC implemented in 28 nm, which is capable of computing in real-time up to 2K resolution depth maps at 32 fps with up to 256-pixel disparity range using two/three cameras.
Abstract: The capability to process high-resolution videos in real time is becoming more important in a wide variety of applications such as autonomous vehicles, virtual reality, and intelligent surveillance systems. The high-accuracy, complex video processing algorithms needed in these applications have made system design more challenging because of the amount of computation that must be performed instantly. Furthermore, video processing algorithms operate on large amounts of data, but storing this data in dense off-chip memories makes it difficult to meet bandwidth requirements. Hence, high-density embedded memories are usually required to temporarily store data on-chip, close to the processing units. However, on-chip embedded memories often dominate the silicon area and power budget of modern video processing systems-on-chip. Considering the current trend toward higher video resolutions and faster frame rates, these challenges are expected to increase dramatically in the future. One of the most important kernels required in modern video processing systems is depth perception, since depth information is needed as input for many advanced video processing algorithms. Depth maps can be created using stereo matching, which denotes the problem of finding dense correspondences in pairs of images. However, computing high-quality dense depth maps in real time, on high-resolution images, and at a high frame rate is challenging because of the computational complexity of stereo-matching algorithms. Furthermore, their need for large memory sizes and bandwidth limits the performance of depth estimation units, increases their power consumption, and makes them difficult to integrate into systems. In this thesis, we develop task-specific solutions from the algorithmic level to the circuit level that accelerate the computation operations and data transfers, and optimize the on-chip data storage of such depth estimation units.
First, we present hardware-oriented stereo-matching algorithms and their hardware implementations, which are tailored to increase parallelism for high throughput while using only on-chip memory to produce high-quality, high-resolution depth maps. Based on that, we propose a multi-camera depth map estimation ASIC implemented in 28 nm, which is capable of computing depth maps of up to 2K resolution in real time at 32 fps with a disparity range of up to 256 pixels using two or three cameras. Our design achieves the highest disparity range at the lowest power consumption and highest frame rate compared to other designs reported in the literature, while computing high-quality depth maps. It also features a stream-in/out interface so that it can be easily integrated into existing vision systems without additional overhead.
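Stereo matching as described in this abstract (finding dense correspondences between a pair of images) can be sketched with a generic window-based search. The sketch below uses a plain sum-of-absolute-differences (SAD) cost on rectified grayscale images; it is a textbook baseline, not the hardware-oriented algorithm of the thesis. The function name `sad_disparity`, the window radius, and the border clamping are assumptions for illustration.

```python
def sad_disparity(left, right, max_disp, radius=1):
    """Per-pixel disparity of `left` relative to `right` via SAD matching.

    Both images are rectified grayscale images given as lists of rows.
    For each left pixel, the candidate d in [0, max_disp] with the lowest
    window cost is selected (winner-takes-all).
    """
    h, w = len(left), len(left[0])
    disp = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            best_cost, best_d = None, 0
            for d in range(min(max_disp, x) + 1):
                cost = 0
                for dy in range(-radius, radius + 1):
                    for dx in range(-radius, radius + 1):
                        # Clamp window coordinates at the image borders.
                        yy = min(max(y + dy, 0), h - 1)
                        xl = min(max(x + dx, 0), w - 1)
                        xr = min(max(x + dx - d, 0), w - 1)
                        cost += abs(left[yy][xl] - right[yy][xr])
                if best_cost is None or cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y][x] = best_d
    return disp
```

The innermost cost evaluations are independent across pixels and disparity candidates, which is the kind of parallelism a hardware implementation can exploit. The nested loops also show why the work grows with resolution, disparity range, and window size.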
TL;DR: Two- and three-color versions of an OLED microdisplay for electronic viewfinders in digital vision systems are developed; they allow high-resolution monochrome images to be merged with lower-resolution images, e.g., from an infrared sensor for image fusion, or combined with colored graphical overlays.
Abstract: We developed a 0.61'' diagonal OLED microdisplay dedicated to electronic viewfinders for digital vision systems, e.g., for security or other professional applications. The microdisplay has a very high resolution of 5.4 million subpixels and combines excellent image quality with low power consumption and a 10-bit-per-color digital input. The subpixel pitch is 4.7 × 4.7 μm². Thanks to the versatile architecture of the underlying ASIC, the device can be easily adapted to different applications and image formats. In the standard full-color version, the resulting resolution is 1300 by 1044 pixels (SXGA). In a monochrome version, the resolution is 2600 by 2088 independent pixels, enabling, e.g., digital night vision at full 2K by 2K resolution. In addition, we developed two- and three-color versions of the display that allow high-resolution monochrome images, e.g., in 2K by 2K resolution, to be merged with lower-resolution images, e.g., from an infrared sensor for image fusion, or combined with colored graphical overlays.
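The figures in this abstract can be cross-checked with a few lines of arithmetic. The sketch assumes that the active area is exactly the subpixel grid times the pitch (no border) and that the full-color mode groups subpixels 2 × 2 per pixel, which is inferred from the halved resolution rather than stated in the abstract. All names are illustrative.

```python
import math

SUBPIXEL_PITCH_UM = 4.7          # from the abstract
MONO_COLS, MONO_ROWS = 2600, 2088  # monochrome resolution from the abstract

# Total subpixel count ("5.4 million subpixels").
subpixels = MONO_COLS * MONO_ROWS

# Active-area size, assuming no border around the subpixel grid.
width_mm = MONO_COLS * SUBPIXEL_PITCH_UM / 1000.0
height_mm = MONO_ROWS * SUBPIXEL_PITCH_UM / 1000.0
diagonal_in = math.hypot(width_mm, height_mm) / 25.4

# Full-color resolution, assuming a 2 x 2 subpixel group per pixel.
full_color = (MONO_COLS // 2, MONO_ROWS // 2)

print(subpixels, round(width_mm, 2), round(height_mm, 2),
      round(diagonal_in, 3), full_color)
```

Under these assumptions the subpixel count is 5,428,800 and the diagonal comes out at roughly 0.62 inch, in line with the stated 5.4 million subpixels and the nominal 0.61'' diagonal.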
TL;DR: A single-chip trinocular disparity estimation processor, capable of computing in real-time up to 2048×1080 resolution depth maps at 32fps with up to 256-pixel disparity range using two/three CMOS camera sensors, that provides the highest reported disparity range capability at the lowest power consumption and highest frame rate.
Abstract: This paper presents a single-chip trinocular disparity estimation processor capable of computing depth maps of up to 2048×1080 resolution in real time at 32 fps with a disparity range of up to 256 pixels using two or three CMOS camera sensors. The most important feature of the presented design is that the chip is based on a trinocular adaptive window matching process that requires very limited on-chip memory and completely avoids the use of external memory. Moreover, it provides the highest reported disparity range at the lowest power consumption and highest frame rate, while computing high-quality disparity results. It features a stream-in/out interface so that it can be easily integrated into existing vision systems without additional overhead, and it offers a dynamically scalable tradeoff between throughput, resolution, and disparity range. The chip is fabricated in 28 nm CMOS technology, has a die area of 5.96 mm², and consumes 380 mW at a clock frequency of 300 MHz.
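To give a feel for the reported numbers, the following back-of-the-envelope sketch derives the output pixel rate and two per-pixel figures. It assumes that the peak resolution, frame rate, clock frequency, and power figures all apply to the same operating point; the abstract does not state this, and it mentions a scalable tradeoff between throughput, resolution, and disparity range, so the derived values are only indicative.

```python
WIDTH, HEIGHT, FPS = 2048, 1080, 32   # peak figures from the abstract
CLOCK_HZ = 300e6                      # 300 MHz
POWER_W = 0.380                       # 380 mW

# Output disparity pixels per second at peak resolution and frame rate.
pixel_rate = WIDTH * HEIGHT * FPS

# Clock cycles available per output pixel (assumes one shared operating point).
cycles_per_pixel = CLOCK_HZ / pixel_rate

# Energy per output pixel in nanojoules (same assumption).
energy_nj_per_pixel = POWER_W / pixel_rate * 1e9

print(pixel_rate, round(cycles_per_pixel, 2), round(energy_nj_per_pixel, 2))
```

Under that assumption the chip would have a little over four clock cycles per output pixel, which suggests that the disparity candidates for each pixel must be evaluated largely in parallel.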
TL;DR: With state-of-the-art hardware and an efficient algorithm, a naked-eye 3D display system with an LED screen size of 6 m × 1.8 m is achieved, and a vivid 3D experience is perceived.
Abstract: Three-dimensional (3D) display technologies have made great progress in recent years, and the lenticular-array-based 3D display is a relatively mature technology that is the most likely to be commercialized. In a naked-eye 3D display, the screen size is one of the most important factors affecting the viewing experience. To construct a large naked-eye 3D display system, an LED display is used. However, pixel misalignment is an inherent defect of LED screens and affects the rendering quality. To address this issue, an efficient image synthesis algorithm is proposed. The Texture-Plus-Depth (T+D) format is chosen for the display content, and a modified Depth Image Based Rendering (DIBR) method is proposed to synthesize new views. To achieve real-time performance, the whole algorithm is implemented on a GPU. With state-of-the-art hardware and the efficient algorithm, a naked-eye 3D display system with an LED screen size of 6 m × 1.8 m is achieved. Experiments show that the algorithm can process 43-view 3D video with 4K × 2K resolution in real time on a GPU, and a vivid 3D experience is perceived.
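The basic idea of DIBR view synthesis from a texture-plus-depth input can be sketched for a single scanline. This is the generic forward-warping step with a depth test, not the modified method of the paper, and it does not model LED pixel misalignment. The function name `warp_row`, the use of per-pixel disparity in place of depth, and the `view_offset` scale factor are assumptions for illustration.

```python
HOLE = None  # marker for target pixels that receive no source pixel


def warp_row(texture, disparity, view_offset):
    """Forward-warp one scanline to a virtual view.

    Each source pixel moves horizontally by disparity * view_offset.
    When several source pixels land on the same target, the one with the
    larger disparity (assumed nearer to the camera) wins.
    """
    w = len(texture)
    out = [HOLE] * w
    nearest = [-1.0] * w  # largest disparity written so far per target pixel
    for x in range(w):
        xt = x + int(round(disparity[x] * view_offset))
        if 0 <= xt < w and disparity[x] > nearest[xt]:
            out[xt] = texture[x]
            nearest[xt] = disparity[x]
    return out


row = warp_row([1, 2, 3, 4, 5, 6], [0, 0, 0, 2, 2, 0], view_offset=1)
print(row)  # target pixels left as HOLE are disocclusions that need filling
```

Each source pixel is processed independently apart from the depth test, and each virtual view uses its own `view_offset`, so the work parallelizes naturally across pixels and views. The pixels left as `HOLE` are the regions that a hole-filling step has to reconstruct.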