TL;DR: An attempt to implement a keyframe-based SLAM system on a camera phone (specifically, the Apple iPhone 3G) is described and early results demonstrate a system capable of generating and augmenting small maps, albeit with reduced accuracy and robustness compared to SLAM on a PC.
Abstract: Camera phones are a promising platform for hand-held augmented reality. As their computational resources grow, they are becoming increasingly suitable for visual tracking tasks. At the same time, they still offer considerable challenges: their cameras offer a narrow field of view that is not well suited to robust tracking; images are often received at less than 15 Hz; long exposure times result in significant motion blur; and finally, a rolling shutter causes severe smearing effects. This paper describes an attempt to implement a keyframe-based SLAM system on a camera phone (specifically, the Apple iPhone 3G). We describe a series of adaptations to the Parallel Tracking and Mapping system to mitigate the impact of the device's imaging deficiencies. Early results demonstrate a system capable of generating and augmenting small maps, albeit with reduced accuracy and robustness compared to SLAM on a PC.
TL;DR: Indoor positioning with unmodified smartphones and slightly-modified LED luminaires is explored: an analytical model assesses the feasibility of the design, a prototype system demonstrates its viability, the challenges to a practical deployment (including usability and scalability) are discussed, and decimeter-level accuracy is demonstrated in both carefully controlled and more realistic human mobility scenarios.
Abstract: We explore the indoor positioning problem with unmodified smartphones and slightly-modified commercial LED luminaires. The luminaires, modified to allow rapid on-off keying, transmit their identifiers and/or locations encoded in human-imperceptible optical pulses. A camera-equipped smartphone, using just a single image frame capture, can detect the presence of the luminaires in the image, decode their transmitted identifiers and/or locations, and determine the smartphone's location and orientation relative to the luminaires. Continuous image capture and processing enables continuous position updates. The key insights underlying this work are (i) the driver circuits of emerging LED lighting systems can be easily modified to transmit data through on-off keying; (ii) the rolling shutter effect of CMOS imagers can be leveraged to receive many bits of data encoded in the optical transmissions with just a single frame capture; (iii) a camera is intrinsically an angle-of-arrival sensor, so the projection of multiple nearby light sources with known positions onto a camera's image plane can be framed as an instance of a sufficiently-constrained angle-of-arrival localization problem; and (iv) this problem can be solved with optimization techniques. We explore the feasibility of the design through an analytical model, demonstrate the viability of the design through a prototype system, discuss the challenges to a practical deployment including usability and scalability, and demonstrate decimeter-level accuracy in both carefully controlled and more realistic human mobility scenarios.
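Insights (iii) and (iv) above can be illustrated with a small sketch. It is not the authors' method: the abstract describes recovering both location and orientation with optimization techniques, whereas this simplified version assumes the camera orientation is already known (pointing straight up, axes aligned with the world frame) and an ideal pinhole camera with focal length `f` in pixels. Under those assumptions each luminaire gives two equations that are linear in the camera position, so ordinary least squares suffices. The names `locate`, `project`, and `solve3` are hypothetical.

```python
def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


def project(light, cam, f):
    """Pinhole projection of a light at (X, Y, Z) seen from cam at (x, y, z).

    Assumption: the camera looks straight up and its axes match the world axes.
    """
    X, Y, Z = light
    x, y, z = cam
    return (f * (X - x) / (Z - z), f * (Y - y) / (Z - z))


def locate(lights, pixels, f):
    """Estimate camera position from known light positions and their pixels.

    From u = f (X - x) / (Z - z) we get  f*x - u*z = f*X - u*Z,
    and likewise                         f*y - v*z = f*Y - v*Z,
    which are linear in the unknown position (x, y, z).
    """
    rows, rhs = [], []
    for (X, Y, Z), (u, v) in zip(lights, pixels):
        rows.append([f, 0.0, -u])
        rhs.append(f * X - u * Z)
        rows.append([0.0, f, -v])
        rhs.append(f * Y - v * Z)
    # Least squares via the normal equations: (A^T A) p = A^T b
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * t for r, t in zip(rows, rhs)) for i in range(3)]
    return solve3(ata, atb)
```

With at least two luminaires at distinct image positions the system is well determined; handling unknown orientation would turn this into a nonlinear problem, which is where the optimization techniques named in the abstract would come in.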
TL;DR: A novel scheme for data reception in a mobile phone using visible light communications (VLC) is proposed, exploiting the rolling shutter effect of CMOS sensors, and a data rate much higher than the camera frame rate is achieved.
Abstract: In this paper, a novel scheme for data reception in a mobile phone using visible light communications (VLC) is proposed. The camera of the smartphone is used as a receiver in order to capture the continuous changes in state (on-off) of the light, which are invisible to the human eye. The information is captured in the camera in the form of light and dark bands which are then decoded by the smartphone and the received message is displayed. By exploiting the rolling shutter effect of CMOS sensors, a data rate much higher than the camera frame rate is achieved.
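The band-decoding idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the scheme from the paper: it assumes a grayscale frame given as a list of pixel rows, a known and constant number of sensor rows per transmitted bit (`rows_per_bit`), bands aligned with the top of the frame, and both on and off states present in the frame. The function names are hypothetical.

```python
def row_means(image):
    """Collapse each sensor row into a single average brightness value."""
    return [sum(row) / len(row) for row in image]


def decode_bands(image, rows_per_bit):
    """Decode on-off keyed light/dark bands from one rolling-shutter frame.

    Because rows are exposed one after another, each group of rows samples
    the light at a different time, so a single frame carries several bits.
    """
    means = row_means(image)
    # Midpoint threshold; assumes the frame contains both bright and dark bands.
    threshold = (max(means) + min(means)) / 2
    bits = []
    for start in range(0, len(means) - rows_per_bit + 1, rows_per_bit):
        chunk = means[start:start + rows_per_bit]
        avg = sum(chunk) / len(chunk)
        bits.append(1 if avg > threshold else 0)
    return bits


def synth_frame(bits, rows_per_bit, width=8, on=200, off=30):
    """Build a synthetic banded frame for testing (illustrative values only)."""
    frame = []
    for b in bits:
        for _ in range(rows_per_bit):
            frame.append([on if b else off] * width)
    return frame
```

For example, `decode_bands(synth_frame([1, 0, 1, 1], 4), 4)` recovers four bits from a single 16-row frame, which is the sense in which the data rate can exceed the frame rate.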
TL;DR: Stereo Direct Sparse Odometry (Stereo DSO) integrates constraints from static stereo into the bundle adjustment pipeline of temporal multi-view stereo to improve tracking accuracy and robustness.
Abstract: We propose Stereo Direct Sparse Odometry (Stereo DSO) as a novel method for highly accurate real-time visual odometry estimation of large-scale environments from stereo cameras. It jointly optimizes for all the model parameters within the active window, including the intrinsic/extrinsic camera parameters of all keyframes and the depth values of all selected pixels. In particular, we propose a novel approach to integrate constraints from static stereo into the bundle adjustment pipeline of temporal multi-view stereo. Real-time optimization is realized by sampling pixels uniformly from image regions with sufficient intensity gradient. Fixed-baseline stereo resolves scale drift. It also reduces the sensitivity to large optical flow and to the rolling shutter effect, which are known shortcomings of direct image alignment methods. Quantitative evaluation demonstrates that the proposed Stereo DSO outperforms existing state-of-the-art visual odometry methods both in terms of tracking accuracy and robustness. Moreover, our method delivers a more precise metric 3D reconstruction than previous dense/semi-dense direct approaches while providing a higher reconstruction density than feature-based methods.
TL;DR: A novel video stabilization method is presented that models camera motion with a bundle of (multiple) camera paths, based on a mesh-based, spatially-variant motion representation and an adaptive, space-time path optimization; it introduces the 'as-similar-as-possible' idea to make motion estimation more robust.
Abstract: We present a novel video stabilization method which models camera motion with a bundle of (multiple) camera paths. The proposed model is based on a mesh-based, spatially-variant motion representation and an adaptive, space-time path optimization. Our motion representation allows us to fundamentally handle parallax and rolling shutter effects while it does not require long feature trajectories or sparse 3D reconstruction. We introduce the 'as-similar-as-possible' idea to make motion estimation more robust. Our space-time path smoothing adaptively adjusts smoothness strength by considering discontinuities, cropping size and geometrical distortion in a unified optimization framework. The evaluation on a large variety of consumer videos demonstrates the merits of our method.