Top 281 papers published in the topic of Block-matching algorithm in 1999

Showing papers on "Block-matching algorithm" published in 1999

Efficient video indexing scheme for content-based retrieval


Hyun Sung Chang, Sanghoon Sull¹, Sang Uk Lee¹•Institutions (1)

01 Dec 1999-IEEE Transactions on Circuits and Systems for Video Technology

TL;DR: In this paper, the key frame extraction problem is considered from a set-theoretic point of view, and systematic algorithms are derived to find a compact set of key frames that can represent a video segment for a given degree of fidelity.


Abstract: Extracting a small number of key frames that can abstract the content of video is very important for efficient browsing and retrieval in video databases. In this paper, the key frame extraction problem is considered from a set-theoretic point of view, and systematic algorithms are derived to find a compact set of key frames that can represent a video segment for a given degree of fidelity. The proposed extraction algorithms can be hierarchically applied to obtain a tree-structured key frame hierarchy that is a multilevel abstract of the video. The key frame hierarchy enables an efficient content-based retrieval by using the depth-first search scheme with pruning. Intensive experiments on a variety of video sequences are presented to demonstrate the improved performance of the proposed algorithms over the existing approaches.


282 citations

Proceedings Article•10.1109/ICIP.1999.822862•

Automatic caption localization in compressed video


Yu Zhong¹, Hong-Jiang Zhang², Anil K. Jain³•Institutions (3)

Carnegie Mellon University¹, Hewlett-Packard², Michigan State University³

24 Oct 1999

TL;DR: This method first locates candidate text regions directly in the DCT compressed domain, and then reconstructs the candidate regions for further refinement in the spatial domain, so that only a small amount of decoding is required.


Abstract: We present a method to automatically locate captions in MPEG video. Caption text regions are segmented from the background using their distinguishing texture characteristics. This method first locates candidate text regions directly in the DCT compressed domain, and then reconstructs the candidate regions for further refinement in the spatial domain. Therefore, only a small amount of decoding is required. The proposed algorithm achieves about 4.0% false reject rate and less than 5.7% false positive rate on a variety of MPEG compressed video containing more than 42,000 frames.


276 citations

Journal Article•10.1109/30.793546•

Fast digital image stabilizer based on Gray-coded bit-plane matching


Sung-Jea Ko¹, Sung-Hee Lee¹, Seung-Won Jeon¹, Eui-Sung Kang¹•Institutions (1)

Korea University¹

1 Aug 1999

TL;DR: Experimental results indicate that the proposed digital image stabilizer is a computationally efficient alternative to existing DIS systems.


Abstract: A fast digital image stabilizer based on the Gray-coded bit-plane matching is proposed which is robust to irregular conditions such as moving objects and intentional panning. The proposed digital image stabilization (DIS) system performs motion estimation using the Gray-coded bit-plane of video sequences, greatly reducing the computational load. This motion estimation method can be realized using only binary Boolean functions which have significantly reduced computational complexity, while the accuracy of motion estimation is maintained. In order to further improve the computational efficiency, the Gray-coded bit-plane matching with the three-step search (3SS) is proposed. Experimental results indicate that the proposed digital image stabilizer is a computationally efficient alternative to existing DIS systems.


244 citations
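
As a rough illustration of the Gray-coded bit-plane matching described in the abstract above, the following Python sketch extracts one bit plane of the Gray-coded frame and scores candidate displacements by counting XOR mismatches, so the matching itself uses only Boolean operations. The function names, the choice of bit plane 4 (assuming 8-bit pixels), the block size, and the search range are assumptions made for this example, not values from the paper. The sketch uses an exhaustive search for clarity rather than the three-step search the authors propose.

```python
def gray_bit_plane(frame, bit):
    """Return one bit plane of the Gray-coded version of an 8-bit frame."""
    # The Gray code of p is p XOR (p >> 1); keep only the requested bit.
    return [[((p ^ (p >> 1)) >> bit) & 1 for p in row] for row in frame]


def mismatch_count(cur_plane, ref_plane, top, left, dy, dx, size):
    """Count differing bits between a block and its displaced counterpart."""
    count = 0
    for y in range(size):
        for x in range(size):
            count += cur_plane[top + y][left + x] ^ ref_plane[top + y + dy][left + x + dx]
    return count


def estimate_motion(cur_frame, ref_frame, top, left, size=4, search=2, bit=4):
    """Exhaustively search for the displacement with the fewest bit mismatches.

    Returns (dy, dx, cost), or None if no candidate fits inside the frame.
    """
    cur_plane = gray_bit_plane(cur_frame, bit)
    ref_plane = gray_bit_plane(ref_frame, bit)
    height, width = len(ref_plane), len(ref_plane[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= top + dy and top + dy + size <= height
                      and 0 <= left + dx and left + dx + size <= width)
            if not inside:
                continue  # skip candidates that leave the reference frame
            cost = mismatch_count(cur_plane, ref_plane, top, left, dy, dx, size)
            if best is None or cost < best[2]:
                best = (dy, dx, cost)
    return best
```

A stabilizer would presumably run such a search on a few sub-images per frame and combine the local vectors into one global correction; that combining step is not shown here.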

Journal Article•10.1109/76.785730•

Adaptive motion-vector resampling for compressed video downscaling


Bo Shen¹, Ishwar K. Sethi², Bhaskaran Vasudev•Institutions (2)

Hewlett-Packard¹, Wayne State University²

01 Sep 1999-IEEE Transactions on Circuits and Systems for Video Technology

TL;DR: This work proposes an alternative compressed domain-based approach that computes motion vectors for the downscaled (N/2×N/2) video sequence directly from the original motion vectors of the N×N video sequence, and finds that the scheme produces better results when the original motion vectors are weighted adaptively.


Abstract: Digital video is becoming widely available in compressed form, such as a motion JPEG or MPEG coded bitstream. In applications such as video browsing or picture-in-picture, or in transcoding for a lower bit rate, there is a need to downscale the video prior to its transmission. In such instances, the conventional approach to generating a downscaled video bitstream at the video server would be to first decompress the video, perform the downscaling operation in the pixel domain, and then recompress it as, say, an MPEG bitstream for efficient delivery. This process is computationally expensive due to the motion-estimation process needed during the recompression phase. We propose an alternative compressed domain-based approach that computes motion vectors for the downscaled (N/2×N/2) video sequence directly from the original motion vectors for the N×N video sequence. We further discover that the scheme produces better results by weighting the original motion vectors adaptively. The proposed approach can lead to significant computational savings compared to the conventional spatial (pixel) domain approach. The proposed approach is useful for video servers that provide quality of service in real time for heterogeneous clients.


198 citations
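
The idea of deriving one motion vector for the downscaled sequence from several original vectors can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' method: it assumes that four adjacent blocks in the original frame map to one block in the half-resolution frame, and the function name and the weights are hypothetical. The abstract says only that the original vectors are weighted adaptively; how the weights are chosen is not specified there, so the caller supplies them.

```python
def downscale_motion_vector(vectors, weights=None):
    """Combine the motion vectors of adjacent blocks into one vector for
    the co-located block in the half-resolution frame.

    vectors: list of (vx, vy) tuples from the original-resolution blocks.
    weights: optional per-vector weights (hypothetical; uniform if omitted).
    """
    if weights is None:
        weights = [1.0] * len(vectors)
    total = sum(weights)
    if total == 0:
        return (0.0, 0.0)
    avg_x = sum(w * v[0] for w, v in zip(weights, vectors)) / total
    avg_y = sum(w * v[1] for w, v in zip(weights, vectors)) / total
    # Displacements shrink with the frame dimensions, so halve the result.
    return (avg_x / 2.0, avg_y / 2.0)
```

With uniform weights this reduces to a plain average of the four vectors followed by halving; non-uniform weights let some blocks dominate the result.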

Journal Article•10.1109/76.809155•

Video segmentation for content-based coding


T. Meier, King Ngi Ngan¹•Institutions (1)

University of Western Australia¹

01 Dec 1999-IEEE Transactions on Circuits and Systems for Video Technology

TL;DR: The main assumption underlying the approach is the existence of a dominant global motion that can be assigned to the background; areas that do not follow this background motion indicate the presence of independently moving physical objects.


Abstract: To provide multimedia applications with new functionalities, the new video coding standard MPEG-4 relies on a content-based representation. This requires a prior decomposition of sequences into semantically meaningful, physical objects. We formulate this problem as one of separating foreground objects from the background based on motion information. For the object of interest, a 2D binary model is derived and tracked throughout the sequence. The model points consist of edge pixels detected by the Canny operator. To accommodate rotation and changes in shape of the tracked object, the model is updated every frame. These binary models then guide the actual video object plane (VOP) extraction. Thanks to our new boundary postprocessor and the excellent edge localization properties of the Canny operator, the resulting VOP contours are very accurate. Both the model initialization and update stages exploit motion information. The main assumption underlying our approach is the existence of a dominant global motion that can be assigned to the background. Areas that do not follow this background motion indicate the presence of independently moving physical objects. Two alternative methods to identify such objects are presented. The first one employs a morphological motion filter with a new filter criterion, which measures the deviation of the locally estimated optical flow from the corresponding global motion. The second method computes a change detection mask by taking the difference between consecutive frames. The first version is more suitable for sequences with little motion, whereas the second version is better at dealing with faster moving or changing objects. Experimental results demonstrate the performance of our algorithm.


187 citations

Journal Article•10.1007/S005300050139•

Query by video clip


Anil K. Jain¹, A. Vailaya¹, Xiong Wei²•Institutions (2)

Michigan State University¹, Hong Kong University of Science and Technology²

01 Sep 1999-Multimedia Systems

TL;DR: Two schemes are proposed: retrieval based on key frames follows the traditional approach of identifying shots, computing key frames from a video, and then extracting image features around the key frames, and retrieval using sub-sampled frames is based on matching color and texture features of the sub-sampled frames.


Abstract: Typical digital video search is based on queries involving a single shot. We generalize this problem by allowing queries that involve a video clip (say, a 10-s video segment). We propose two schemes: (i) retrieval based on key frames follows the traditional approach of identifying shots, computing key frames from a video, and then extracting image features around the key frames. For each key frame in the query, a similarity value (using color, texture, and motion) is obtained with respect to the key frames in the database video. Consecutive key frames in the database video that are highly similar to the query key frames are then used to generate the set of retrieved video clips. (ii) In retrieval using sub-sampled frames, we uniformly subsample the query clip as well as the database video. Retrieval is based on matching color and texture features of the subsampled frames. Initial experiments on two video databases (basketball video with approximately 16,000 frames and a CNN news video with approximately 20,000 frames) show promising results. Additional experiments using segments from one basketball video as query and a different basketball video as the database show the effectiveness of feature representation and matching schemes.


185 citations

Patent•

System and method for fine granular scalable video with selective quality enhancement


Yingwei Chen¹, Hayder Radha¹, Mihaela van der Schaar¹•Institutions (1)

Philips¹

6 Jul 1999

TL;DR: An adaptive quantization controller is disclosed for use in a video encoder comprising a base layer circuit that receives an input stream of video frames and generates compressed base layer video frames suitable for transmission to a streaming video receiver.


Abstract: There is disclosed an adaptive quantization controller for use in a video encoder comprising a base layer circuit for receiving an input stream of video frames and generating compressed base layer video frames suitable for transmission to a streaming video receiver and an enhancement layer circuit for receiving the input stream of video frames and a decoded version of the compressed base layer video frames and generating enhancement layer video data associated with, and allocated to, corresponding ones of the compressed base layer video frames. The adaptive quantization controller receives at least one quantization parameter from the base layer circuit and, in response thereto, determines a corresponding shifting factor for shifting a bit plane associated with the enhancement layer video data. The adaptive quantization controller also modifies a data field in the enhancement layer video data to cause the video streaming receiver to assign a higher decoding priority to the shifted bit plane.


182 citations

Journal Article•10.1109/76.809158•

Long-term global motion estimation and its application for sprite coding, content description, and segmentation


Aljosa Smolic¹, Thomas Sikora, Jens-Rainer Ohm•Institutions (1)

Heinrich Hertz Institute¹

01 Dec 1999-IEEE Transactions on Circuits and Systems for Video Technology

TL;DR: The presented results indicate that the proposed technique is a very accurate and robust approach for long-term global motion estimation, which can be used for applications such as MPEG-4 sprite coding or MPEG-7 motion description, and the efficiency of global motion estimation can be significantly increased if a higher order motion model is applied.


Abstract: We present a new technique for long-term global motion estimation of image objects. The estimated motion parameters describe the continuous and time-consistent motion over the whole sequence relative to a fixed reference coordinate system. The proposed method is suitable for the estimation of affine motion parameters as well as for higher order motion models like the parabolic model, combining the advantages of feature matching and optical flow techniques. A hierarchical strategy is applied for the estimation, first translation, affine motion, and finally higher order motion parameters, which is robust and computationally efficient. A closed-loop prediction scheme is applied to avoid the problem of error accumulation in long-term motion estimation. The presented results indicate that the proposed technique is a very accurate and robust approach for long-term global motion estimation, which can be used for applications such as MPEG-4 sprite coding or MPEG-7 motion description. We also show that the efficiency of global motion estimation can be significantly increased if a higher order motion model is applied, and we present a new sprite coding scheme for on-line applications. We further demonstrate that the proposed estimator serves as a powerful tool for segmentation of video sequences.


168 citations

Journal Article•10.1109/6046.784468•

Performance evaluation of smoothing algorithms for transmitting prerecorded variable-bit-rate video


Wu-chi Feng¹, Jennifer Rexford²•Institutions (2)

Ohio State University¹, AT&T²

01 Sep 1999-IEEE Transactions on Multimedia

TL;DR: This paper compares the transmission schedules generated by the various smoothing algorithms, based on a collection of metrics that relate directly to the server, network, and client resources necessary for the transmission, transport, and playback of prerecorded video.


Abstract: The transfer of prerecorded, compressed variable-bit-rate video requires multimedia services to support large fluctuations in bandwidth requirements on multiple time scales. Bandwidth smoothing techniques can reduce the burstiness of a variable-bit-rate stream by transmitting data at a series of fixed rates, simplifying the allocation of resources in video servers and the communication network. This paper compares the transmission schedules generated by the various smoothing algorithms, based on a collection of metrics that relate directly to the server, network, and client resources necessary for the transmission, transport, and playback of prerecorded video. Using MPEG-1 and MJPEG video data and a range of client buffer sizes, we investigate the interplay between the performance metrics and the smoothing algorithms. The results highlight the unique strengths and weaknesses of each bandwidth smoothing algorithm, as well as the characteristics of a diverse set of video clips.


139 citations

Journal Article•10.1006/CVIU.1999.0761•

A Stochastic Framework for Optimal Key Frame Extraction from MPEG Video Databases


Yannis Avrithis¹, Anastasios Doulamis¹, Nikolaos Doulamis¹, Stefanos Kollias¹•Institutions (1)

National Technical University of Athens¹

01 Jul 1999-Computer Vision and Image Understanding

TL;DR: A video content representation framework is proposed for extracting limited, but meaningful, information of video data, directly from the MPEG compressed domain, based on a multiresolution implementation of the recursive shortest spanning tree (RSST) algorithm.


98 citations

Proceedings Article•10.1117/12.373565•

Identifying sports videos using replay, text, and camera motion features


Vikrant Kobla¹, Daniel DeMenthon¹, David Doermann¹•Institutions (1)

University of Maryland, College Park¹

23 Dec 1999-Storage and Retrieval for Image and Video Databases

TL;DR: The extraction of features that enable identification of sports videos directly from the compressed domain of MPEG video is discussed, including detecting the presence of action replays, determining the amount of scene text in video, and calculating various statistics on camera and/or object motion.


Abstract: Automated classification of digital video is emerging as an important piece of the puzzle in the design of content management systems for digital libraries. The ability to classify videos into various classes such as sports, news, movies, or documentaries increases the efficiency of indexing, browsing, and retrieval of video in large databases. In this paper, we discuss the extraction of features that enable identification of sports videos directly from the compressed domain of MPEG video. These features include detecting the presence of action replays, determining the amount of scene text in video, and calculating various statistics on camera and/or object motion. The features are derived from the macroblock, motion, and bit-rate information that is readily accessible from MPEG video with very minimal decoding, leading to substantial gains in processing speeds. Full decoding of selected frames is required only for text analysis. A decision tree classifier built using these features is able to identify sports clips with an accuracy of about 93 percent.


Patent•

Method and apparatus for the detection of motion in video


James G. Hanko¹, Duane Northcutt¹, Gerard A. Wall¹, Lawrence L. Butcher¹•Institutions (1)

Sun Microsystems¹

30 Jun 1999

TL;DR: If the pixel difference for a pixel exceeds an applicable pixel difference threshold, the pixel is considered to be "different"; if the number of "different" pixels for a frame exceeds an applicable frame difference threshold, motion is considered to have occurred, and a motion detection signal is emitted.


Abstract: The present invention comprises a method and apparatus for detection of motion in video in which frames from an incoming video stream are digitized. The pixels of each incoming digitized frame are compared to the corresponding pixels of a reference frame, and differences between incoming pixels and reference pixels are determined. If the pixel difference for a pixel exceeds an applicable pixel difference threshold, the pixel is considered to be 'different'. If the number of 'different' pixels for a frame exceeds an applicable frame difference threshold, motion is considered to have occurred, and a motion detection signal is emitted. In one or more other embodiments, the applicable frame difference threshold is adjusted depending upon the current average motion being exhibited by the most recent frames, thereby taking into account 'ambient' motion and minimizing the effects of phase lag. In one or more embodiments, different pixel difference thresholds may be assigned to different pixels or groups of pixels, thereby making certain regions of a camera's field of view more or less sensitive to motion. In one or more embodiments of the invention, a new reference frame is selected when the first frame that exhibits no motion occurs after one or more frames that exhibit motion.
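
The two-threshold test described in this abstract can be sketched in a few lines of Python. This is a minimal sketch only: frames are assumed to be lists of rows of grayscale values, the function names are hypothetical, and a single pixel threshold is used for the whole frame. The adaptive frame threshold, the per-region pixel thresholds, and the reference frame update mentioned in the abstract are not modeled.

```python
def count_different_pixels(frame, reference, pixel_threshold):
    """Count pixels whose absolute difference from the reference frame
    exceeds the pixel difference threshold."""
    return sum(
        1
        for frame_row, ref_row in zip(frame, reference)
        for pixel, ref_pixel in zip(frame_row, ref_row)
        if abs(pixel - ref_pixel) > pixel_threshold
    )


def motion_detected(frame, reference, pixel_threshold, frame_threshold):
    """Report motion when the number of 'different' pixels exceeds the
    frame difference threshold."""
    different = count_different_pixels(frame, reference, pixel_threshold)
    return different > frame_threshold
```

For example, with a reference of all 10s, a frame `[[10, 50], [60, 12]]`, and a pixel threshold of 20, two pixels count as "different", so motion is reported for a frame threshold of 1 but not for a frame threshold of 2.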


Proceedings Article•10.1109/MMSP.1999.793811•

A stochastic framework for optimal key frame extraction from MPEG video databases


Nikolaos Doulamis¹, Anastasios Doulamis, Yannis Avrithis², Stefanos Kollias•Institutions (2)

National and Kapodistrian University of Athens¹, National Technical University of Athens²

1 Jan 1999

TL;DR: This approach is based on minimization of a cross-correlation criterion among video frames of a given shot so as to locate a set of minimally correlated feature vectors.


Abstract: A framework for video content representation is proposed in this paper for extracting limited, but meaningful, information of video data directly from the MPEG compressed domain. First, the traditional frame-based representation is transformed to a feature-based one. Then, all features are gathered together using a fuzzy formulation and extraction of several key frames is performed for each shot in a content-based rate sampling framework. In particular, our approach is based on minimization of a cross-correlation criterion among video frames of a given shot so as to locate a set of minimally correlated feature vectors. Experimental results indicating the good performance of the proposed scheme are also presented.


Patent•

Method of selecting key-frames from a video sequence


Itzhak Wilf, Ovadya Menadeva, Hayit Greenspan

24 Mar 1999

TL;DR: A method is presented for selecting key-frames from a video sequence by comparing each frame in the video sequence with its preceding and subsequent key-frames for redundancy, where the comparison involves region and motion analysis.


Abstract: A method of selecting key-frames (230) from a video sequence (210, 215) by comparing each frame in the video sequence with respect to its preceding and subsequent key-frames for redundancy where the comparison involves region and motion analysis. The video sequence is optionally pre-processed to detect graphic overlay. The key-frame set is optionally post-processed (250) to optimize the resulting set for face or other object recognition.


Patent•

HDTV up converter


Cong Toai Kieu, Chon Tam Le Dinh, Daniel Poirier

1 Apr 1999

TL;DR: An apparatus is proposed for converting a standard video signal having 59.94 fields per second into an HDTV video signal with 60.00 fields per second by adding a number of video fields into each sequence of 1000 video fields.


Abstract: An electronic apparatus for converting a standard video signal having 59.94 fields per second into an HDTV video signal having 60.00 fields per second, by adding a number of video fields into each sequence of 1000 video fields. The apparatus detects the best moment for adding the new video field, so that the human eye does not perceive an abrupt change in the video image, by detecting the best motion conditions, which occur either when the image motion is high or very low. For adding the new video field, the apparatus uses an interpolation technique for creating two interpolated video fields which are inserted in place of one existing video field which is deleted. The apparatus also comprises a de-interlacer module for deinterlacing the 60 Hz video image, by using an advanced interpolation technique for calculating the missing video lines. The proposed technique involves directional interpolations of the pixels of the missing lines in various directions and selection of the best interpolation direction for the creation of each pixel of the missing video lines. The corresponding de-interlacer apparatus comprises a novel edge direction detector which performs the mentioned interpolations in all interpolating directions and then selects the best direction for performing the interpolation for each interpolated pixel, based on the quality of the performed interpolations.
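
The abstract does not state how many fields are added per 1000, but the figure can be estimated from the two field rates. The short Python sketch below does this arithmetic exactly with fractions; the function name is hypothetical, and the calculation assumes the source rate is exactly 59.94 fields per second as quoted (the nominal NTSC rate of 60000/1001 would give a slightly different fraction).

```python
from fractions import Fraction


def fields_to_add(source_rate, target_rate, group_size):
    """Return how many fields must be added to each group of `group_size`
    source fields so that the output reaches `target_rate` fields per second."""
    source = Fraction(source_rate)
    target = Fraction(target_rate)
    # Each source field must become target/source output fields on average.
    return group_size * (target / source - 1)


extra = fields_to_add(Fraction("59.94"), 60, 1000)
print(extra, float(extra))
```

The result is 1000/999, or roughly one extra field per 1000 source fields, which is consistent with the abstract's description of replacing one existing field with two interpolated fields (a net gain of one).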


Proceedings Article•10.1109/ICIP.1999.817204•

Fast camera motion analysis in MPEG domain


R.R. Wang¹, Thomas S. Huang¹•Institutions (1)

University of Illinois at Urbana–Champaign¹

24 Oct 1999

TL;DR: This paper presents a recursive outlier-rejecting least square algorithm for parametric camera estimation in the MPEG-1 and MPEG-2 domain that has a very low time complexity; results show that it works much faster than the real-time playback rate and consumes few system resources.


Abstract: Camera motion estimation is crucial in video analysis and in object tracking query systems when the motion needs to be neutralized before object analysis. Given today's ever-growing amount of video data provided in compressed formats, namely MPEG-1 and MPEG-2, it increasingly makes sense to perform camera estimation in the compressed domain. Much work has gone into the uncompressed domain, but the time to decompress and analyze is simply too great for population of large video databases. This paper presents a recursive outlier-rejecting least square algorithm for parametric camera estimation in the MPEG-1 and MPEG-2 domain. The algorithm has a very low time complexity; results show that it works much faster than the real-time playback rate and consumes few system resources. Experiments on synthesized video clips and real-world video clips show that the algorithm is effective. Experiments are also done on a large set of real-world video clips to analyze the performance, and a query system is built in the process.
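
To make the idea of outlier-rejecting least squares on motion vectors concrete, here is a minimal Python sketch. It is not the authors' algorithm: the three-parameter zoom-and-pan model (u = z*x + px, v = z*y + py), the rejection rule based on a multiple of the RMS residual, the residual floor, and all names are assumptions chosen for illustration. The input is assumed to be a non-empty list of block positions with their motion vectors, as could be read from an MPEG stream without full decoding.

```python
import math


def fit_zoom_pan(samples):
    """Least-squares fit of u = z*x + px, v = z*y + py.

    Each sample is (x, y, u, v): a block position and its motion vector.
    """
    n = len(samples)
    mx = sum(s[0] for s in samples) / n
    my = sum(s[1] for s in samples) / n
    mu = sum(s[2] for s in samples) / n
    mv = sum(s[3] for s in samples) / n
    num = sum((x - mx) * (u - mu) + (y - my) * (v - mv) for x, y, u, v in samples)
    den = sum((x - mx) ** 2 + (y - my) ** 2 for x, y, u, v in samples)
    z = num / den if den else 0.0
    return z, mu - z * mx, mv - z * my


def robust_fit(samples, k=2.0, min_residual=0.5, max_iter=10):
    """Refit repeatedly, dropping vectors whose residual exceeds
    max(k * RMS residual, min_residual), until nothing more is dropped."""
    kept = list(samples)
    params = fit_zoom_pan(kept)
    for _ in range(max_iter):
        z, px, py = params
        residuals = [math.hypot(u - z * x - px, v - z * y - py)
                     for x, y, u, v in kept]
        rms = math.sqrt(sum(r * r for r in residuals) / len(residuals))
        limit = max(k * rms, min_residual)
        survivors = [s for s, r in zip(kept, residuals) if r <= limit]
        # Stop when no vector is rejected or too few would remain.
        if len(survivors) == len(kept) or len(survivors) < 3:
            break
        kept = survivors
        params = fit_zoom_pan(kept)
    return params, kept
```

The residual floor keeps the loop from discarding vectors over floating-point noise once the remaining vectors fit the model almost exactly. Vectors rejected as outliers would typically belong to independently moving objects or unreliable blocks.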


Patent•

Method and apparatus for motion estimation for high performance transcoding


Jeongnam Youn¹, Ming-Ting Sun¹, Chia-Wen Lin¹•Institutions (1)

Industrial Technology Research Institute¹

26 Mar 1999

TL;DR: In this article, methods and systems for generating motion vectors for re-encoding video signals are disclosed, where the motion vector is determined by the sum of a base motion vector and a delta motion vector.


Abstract: Methods and systems for generating motion vectors for re-encoding video signals are disclosed. The motion vector is determined by the sum of a base motion vector and a delta motion vector. In the case of no frame-skipping, the base motion vector is the incoming motion vector. In the case of frame skipping, the base motion vector is the sum of the motion vectors of the incoming signal since the last re-encoded frame and the current frame. The delta motion vector is optimized by a minimum Sum of the Absolute Difference by searching over a smaller area than if searching for a new motion vector without a delta motion vector. These methods and systems may be used to improve re-encoding digital video signals.
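
The base-plus-delta scheme in this abstract can be sketched in Python as follows. This is an illustrative sketch under assumptions, not the patented implementation: frames are lists of rows of pixel values, vectors are (dy, dx) tuples, and the function names, block size, and the ±1 delta range are hypothetical.

```python
def sad(cur, ref, top, left, dy, dx, size):
    """Sum of absolute differences between a block and its displaced match."""
    return sum(abs(cur[top + y][left + x] - ref[top + y + dy][left + x + dx])
               for y in range(size) for x in range(size))


def base_motion_vector(incoming_vectors):
    """Sum the incoming vectors accumulated since the last re-encoded frame.

    With no frame skipping the list holds a single vector, so the base
    vector is simply the incoming vector.
    """
    return (sum(v[0] for v in incoming_vectors),
            sum(v[1] for v in incoming_vectors))


def refine_motion_vector(cur, ref, top, left, base, size=4, delta_range=1):
    """Search a small window of delta vectors around the base vector.

    Returns (dy, dx, cost) for the minimum-SAD candidate, or None if no
    candidate fits inside the reference frame.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for ddy in range(-delta_range, delta_range + 1):
        for ddx in range(-delta_range, delta_range + 1):
            dy, dx = base[0] + ddy, base[1] + ddx
            if not (0 <= top + dy and top + dy + size <= height
                    and 0 <= left + dx and left + dx + size <= width):
                continue
            cost = sad(cur, ref, top, left, dy, dx, size)
            if best is None or cost < best[2]:
                best = (dy, dx, cost)
    return best
```

Because the delta search covers only a few candidates around the base vector, it evaluates far fewer positions than a fresh full search would, which is the saving the abstract describes.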


Journal Article•10.1109/83.806627•

An efficient computation-constrained block-based motion estimation algorithm for low bit rate video coding


M. Gallant¹, G. Cote, Faouzi Kossentini•Institutions (1)

University of British Columbia¹

01 Dec 1999-IEEE Transactions on Image Processing

TL;DR: An efficient computation constrained block-based motion vector estimation algorithm for low bit rate video coding that offers good tradeoffs between motion estimation distortion and number of computations is presented.


Abstract: We present an efficient computation constrained block-based motion vector estimation algorithm for low bit rate video coding that yields good tradeoffs between motion estimation distortion and number of computations. A reliable predictor determines the search origin, localizing the search process. An efficient search pattern exploits structural constraints within the motion field. A flexible cost measure used to terminate the search allows simultaneous control of the motion estimation distortion and the computational cost. Experimental results demonstrate the viability of the proposed algorithm in low bit rate video coding applications. The resulting low bit rate video encoder yields essentially the same levels of rate-distortion performance and subjective quality achieved by the UBC H.263+ video coding reference software. However, the proposed motion estimation algorithm provides substantially higher encoding speed as well as graceful computational degradation capabilities.


Patent•

Hierarchical motion estimation process and system using block-matching and integral projection


Ching-Fang Chang¹, Naofumi Yanagihara¹•Institutions (1)

Sony Broadcast & Professional Research Laboratories¹

2 Jun 1999

TL;DR: Methods and systems for obtaining a motion vector between two frames of video image data are disclosed; they may be used to estimate a motion vector for each macroblock of a current frame with respect to a reference frame in a multi-stage operation.


Abstract: Methods and systems for obtaining a motion vector between two frames of video image data are disclosed. Specifically, methods and systems of the present invention may be used to estimate a motion vector for each macroblock of a current frame with respect to a reference frame in a multi-stage operation. In a first stage, an application implementing the process of the present invention coarsely searches a first search area of the reference frame to obtain a candidate supermacroblock that best approximates a supermacroblock in the current frame (312). In a second stage, the supermacroblock is divided into a plurality of macroblocks (314). Each of the macroblocks is used to construct search areas which are then searched to obtain a candidate macroblock that best approximates a macroblock in the current frame (322). Additional stages may be used to further fine-tune the approximation to the macroblock. The methods and systems of the present invention may be used in optimizing digital video encoders, decoders, and video format converters.
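
The two-stage structure described in this abstract can be sketched in Python as follows. This is a simplified illustration, not the patented process: it assumes a supermacroblock made of 2×2 macroblocks, a coarse search on a grid with a step of 2 pixels, and a ±1 refinement per macroblock, and it scores candidates with a plain sum of absolute differences instead of the integral projections named in the title. All names and parameter values are hypothetical.

```python
def sad(cur, ref, top, left, dy, dx, size):
    """Sum of absolute differences between a block and its displaced match."""
    return sum(abs(cur[top + y][left + x] - ref[top + y + dy][left + x + dx])
               for y in range(size) for x in range(size))


def best_vector(cur, ref, top, left, size, candidates):
    """Return the in-bounds candidate (dy, dx) with the lowest SAD, or None."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy, dx in candidates:
        if not (0 <= top + dy and top + dy + size <= height
                and 0 <= left + dx and left + dx + size <= width):
            continue
        cost = sad(cur, ref, top, left, dy, dx, size)
        if best is None or cost < best[0]:
            best = (cost, dy, dx)
    return None if best is None else (best[1], best[2])


def two_stage_search(cur, ref, top, left, mb_size=4, coarse_range=4, coarse_step=2):
    """Stage 1: coarse search for a 2x2 group of macroblocks.
    Stage 2: refine each macroblock by +/-1 around the stage-1 vector."""
    offsets = range(-coarse_range, coarse_range + 1, coarse_step)
    coarse = best_vector(cur, ref, top, left, 2 * mb_size,
                         [(dy, dx) for dy in offsets for dx in offsets])
    if coarse is None:
        return None
    vectors = {}
    for row in (0, 1):
        for col in (0, 1):
            mb_top, mb_left = top + row * mb_size, left + col * mb_size
            fine = [(coarse[0] + ddy, coarse[1] + ddx)
                    for ddy in (-1, 0, 1) for ddx in (-1, 0, 1)]
            vectors[(row, col)] = best_vector(cur, ref, mb_top, mb_left, mb_size, fine)
    return vectors
```

The function returns one vector per macroblock, keyed by its (row, column) position inside the supermacroblock. Further refinement stages, as the abstract mentions, could repeat the second step with a finer candidate set.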


Journal Article•10.1109/6046.748173•

Content-based MPEG video traffic modeling


Ali Dawood¹, Mohammed Ghanbari¹•Institutions (1)

University of Essex¹

01 Mar 1999-IEEE Transactions on Multimedia

TL;DR: A video model to generate VBR MPEG video traffic based on the scene content description that may be used to generate traffic of any type of video scenes ranging from a low complexity video conferencing to a highly active sport program.


Abstract: In this paper, we propose a video model to generate VBR MPEG video traffic based on the scene content description. Long sessions of nonhomogeneous video clips are decomposed into homogeneous video shots. The shots are then classified into different classes in terms of their texture and motion complexity. Each shot class was uniquely described with an autoregressive model. Transitions between the shots and their durations have been analyzed. Unlike many classical video source models, this model may be used to generate traffic of any type of video scenes ranging from a low complexity video conferencing to a highly active sport program. The performance of the model is evaluated by measuring the mean cell delay when the generated video traffic is fed to an ATM multiplex buffer.


Proceedings Article•10.1109/ICIP.1999.819593•

Field-to-frame transcoding with spatial and temporal downsampling


Susie Wee¹, J.G. Apostolopoulos, Nick Feamster•Institutions (1)

Hewlett-Packard¹

24 Oct 1999

TL;DR: The proposed transcoder achieves improved performance by exploiting the details of the MPEG-2 and H.263 compression standards when performing interlaced to progressive (or field to frame) conversion with spatial downsampling and frame-rate reduction.


Abstract: We present an algorithm for transcoding high-rate compressed bitstreams containing field-coded interlaced video to lower-rate compressed bitstreams containing frame-coded progressive video. We focus on MPEG-2 to H.263 transcoding; however, these results can be extended to other lower-rate video compression standards including MPEG-4 simple profile and MPEG-1. A conventional approach to the transcoding problem involves decoding the input bitstream, spatially and temporally downsampling the decoded frames, and re-encoding the result. The proposed transcoder achieves improved performance by exploiting the details of the MPEG-2 and H.263 compression standards when performing interlaced to progressive (or field to frame) conversion with spatial downsampling and frame-rate reduction. The transcoder reduces the MPEG-2 decoding requirements by temporally downsampling the data at the bitstream level and reduces the H.263 encoding requirements by largely bypassing H.263 motion estimation by reusing the motion vectors and coding modes given in the input bitstream. In software implementations, the proposed approach achieved a 5× speedup over the conventional approach with only a 0.3 and 0.5 dB loss in PSNR for the Carousel and Bus sequences.

Error-resilient video compression via multiple state streams

John G. Apostolopoulos¹•Institutions (1)

Hewlett-Packard¹

1 Jan 1999

TL;DR: This work proposes to combat the problem of incorrect state and error propagation at the decoder by coding the video into multiple independently decodable streams, each with its own prediction process and state, such that if one stream is lost the other streams can still be used to produce usable video.

Abstract: Video compression enables a number of applications by reducing the bit rate needed to represent a video sequence; however, the compressed video is much more susceptible to errors, e.g. bit errors or packet loss. Conventional video compression standards employ an architecture which we refer to as single-state systems, since they have a prediction loop with a single state (e.g. the previous decoded frame) which, if lost or corrupted, can lead to the loss or severe degradation of all subsequent frames until the state is reinitialized (the prediction is refreshed). We propose to combat this problem of incorrect state and error propagation at the decoder by coding the video into multiple independently decodable streams, each with its own prediction process and state, such that if one stream is lost the other streams can still be used to produce usable video. The correctly received streams provide improved error concealment and, more importantly, enable faster state recovery for the lost stream. This approach is conceptually similar to multiple description coding, e.g. [1]; however, it differs in the representation used for each description as well as in its use of state recovery.
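
The difference between a single-state system and multiple independently decodable streams can be illustrated with a small sketch. It assumes, purely for illustration, that frames are assigned to `k` streams round-robin, that each frame is predicted from the previous frame of its own stream, and that no intra refresh occurs; the names `reference_of` and `affected_frames` are hypothetical and do not come from the paper.

```python
def reference_of(i, k):
    """Index of the frame that frame i is predicted from, assuming k
    round-robin streams (hypothetical layout). The first frame of each
    stream has no reference and is treated as intra-coded."""
    return i - k if i >= k else None


def affected_frames(num_frames, k, lost):
    """Frames that cannot be decoded correctly when frame `lost` is lost,
    assuming no intra refresh and no state recovery."""
    affected = {lost}
    for i in range(lost + 1, num_frames):
        # A frame is affected if its reference frame is affected.
        if reference_of(i, k) in affected:
            affected.add(i)
    return sorted(affected)


# Single state (k = 1): the error propagates to every later frame.
single = affected_frames(8, 1, 3)
# Two streams (k = 2): only the stream containing frame 3 is affected.
dual = affected_frames(8, 2, 3)
```

With two streams, the unaffected frames interleave with the damaged ones, which is what makes concealment and state recovery from the correctly received stream plausible.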

Journal Article•10.1109/76.754777•

Scene-context-dependent reference-frame placement for MPEG video coding

A.Y. Lan¹, A.G. Nguyen, Jenq-Neng Hwang•Institutions (1)

University of Washington¹

01 Apr 1999-IEEE Transactions on Circuits and Systems for Video Technology

TL;DR: The proposed MA-based adaptive reference frame-placement scheme outperforms the standard fixed-reference frame-placement and adaptive schemes based on histogram of difference and can achieve from 2 to 13.9% savings in bits while maintaining similar quality.

Abstract: The MPEG video-compression standard effectively exploits spatial, temporal, and coding redundancies in the algorithm. In its generic form, however, only a minimal amount of scene adaptation is performed. Video can be further compressed by taking advantage of scenes where the temporal statistics allow larger interreference-frame distances. This paper proposes the use of motion analysis (MA) to adapt to scene content. The actual picture type [intracoded (I), predicted (P), or bidirectionally coded (B)] decision is made by examining the accumulation of motion measurements since the last reference frame (either I or P) was labeled. The proposed MA-based adaptive reference frame-placement scheme outperforms the standard fixed-reference frame-placement and adaptive schemes based on histogram of difference. When compared with the standard fixed scheme, depending on the video contents, this proposed algorithm can achieve from 2 to 13.9% savings in bits while maintaining similar quality.
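
The accumulation-based picture type decision described above might be sketched as follows. This is a minimal illustration and not the authors' algorithm: the scalar per-frame motion measure, the `threshold`, and the `max_gap` limit are all assumed for the example, and a real encoder would also have to handle B frames left at the end of the sequence without a following reference.

```python
def place_reference_frames(motion, threshold, max_gap):
    """Label each frame 'I', 'P' or 'B'.

    motion[i] is a hypothetical scalar motion measurement between frame
    i-1 and frame i. A new reference (P) frame is placed once the motion
    accumulated since the last reference reaches `threshold`, or after
    `max_gap` frames at the latest.
    """
    types = []
    accumulated = 0.0
    gap = 0  # frames since the last reference frame
    for i, m in enumerate(motion):
        if i == 0:
            types.append("I")  # the first frame is always intra-coded
            continue
        accumulated += m
        gap += 1
        if accumulated >= threshold or gap >= max_gap:
            types.append("P")
            accumulated = 0.0
            gap = 0
        else:
            types.append("B")
    return types


labels = place_reference_frames(
    [0.0, 1.0, 1.0, 1.0, 5.0, 0.5, 0.5, 0.5, 0.5], threshold=3.0, max_gap=4
)
```

Low-motion stretches yield longer runs of B frames, which is where the inter-reference-frame distance grows.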

Proceedings Article•10.1109/DCC.1999.755690•

Three-dimensional wavelet coding of video with global motion compensation

A. Wang¹, Zixiang Xiong², Philip A. Chou¹, Sanjeev Mehrotra¹•Institutions (2)

Microsoft¹, University of Hawaii²

29 Mar 1999

TL;DR: This work introduces global motion compensation for 3D subband video coders, and finds 0.5 to 2 dB gain on sequences with dominant background motion.

Abstract: Three-dimensional (2D+T) wavelet coding of video using SPIHT has been shown to outperform standard predictive video coders on complex high-motion sequences, and is competitive with standard predictive video coders on simple low-motion sequences. However, on a number of typical moderate-motion sequences characterized by largely rigid motions, 3D SPIHT performs several dB worse than motion-compensated predictive coders, because it does not take advantage of the real physical motion underlying the scene. We introduce global motion compensation for 3D subband video coders, and find 0.5 to 2 dB gain on sequences with dominant background motion. Our approach is a hybrid of video coding based on sprites, or mosaics, and subband coding.

Proceedings Article•10.1117/12.337413•

Compressed-domain reverse play of MPEG video streams

Susie J. Wee¹, Bhaskaran Vasudev•Institutions (1)

Hewlett-Packard¹

22 Jan 1999

TL;DR: The proposed compressed-domain transcoding methods achieve an order of magnitude reduction in computational complexity over the baseline spatial-domain approach, and the resulting image quality is within 0.6 dB of the baseline spatial-domain approach for a difficult, highly detailed computer-generated video sequence.

Abstract: We present several compressed-domain methods for reverse-play transcoding of MPEG video streams. A reverse-play transcoder takes any original MPEG IPB bitstream as input and creates an output MPEG IPB bitstream which, when decoded by a generic MPEG decoder, displays the original video frames in reverse order. A baseline spatial-domain method requires decoding the MPEG bitstream, storing and reordering the decoded video frames, and re-encoding the reordered video. The proposed compressed-domain transcoding methods achieve an order of magnitude reduction in computational complexity over the baseline spatial-domain approach. Much of the savings are achieved by using the forward motion vector fields available in the forward-play MPEG bitstream to efficiently generate the reverse motion vector fields used in the reverse-play MPEG bitstream. Furthermore, the storage requirements of the compressed-domain methods are reduced and the resulting image quality is within 0.6 dB of the baseline spatial-domain approach for a difficult highly detailed computer-generated video sequence. For more typical video sequences, the resulting image quality is even closer to the baseline spatial-domain approach. Keywords: Reverse-play, MPEG, Compressed-domain processing, Video compression, Video processing, Transcoding.

Patent•

Video encoder and video encoding method

Takeshi Chujoh¹, Toshiaki Watanabe¹•Institutions (1)

Toshiba¹

29 Jan 1999

TL;DR: In this paper, a video encoder comprises an encoding section for selectively performing, on an input video signal, an intra-coding mode for intraframe coding, an inter-coding mode for interframe coding, and a non-coded mode in which no coding is performed and the previous frame is used for display.

Abstract: A video encoder comprises an encoding section for selectively performing, on an input video signal, an intra-coding mode for intraframe coding, an inter-coding mode for interframe coding, and a non-coded mode in which no coding is performed and the previous frame is used for display; a mode selector section for adaptively selecting among the coding modes for each predetermined region in the input video signal; and a refresh section for detecting a motion region from within a frame and setting up an intraframe-coded region for refresh in a portion of a refresh range including the motion region. The refresh section determines whether each intraframe-coded macroblock belongs to a motion region (having motion) or a still region (with no motion), determines the refresh range in the next frame on the basis of the result of the determination, and instructs the mode select circuit to use the intra mode when a macroblock to be coded belongs to the refresh range.

Patent•

Adaptive motion estimator

Hong Lye Oh¹, Yau Wai Lucas Hui¹•Institutions (1)

STMicroelectronics¹

13 May 1999

TL;DR: In this article, a method and apparatus of encoding digital video according to the ISO/IEC MPEG standards (ISO/IEC 11172-2 MPEG-1 and ISO/IEC 13818-2 MPEG-2) using an adaptive motion estimator is described.

Abstract: A method and apparatus of encoding digital video according to the ISO/IEC MPEG standards (ISO/IEC 11172-2 MPEG-1 and ISO/IEC 13818-2 MPEG-2) using an adaptive motion estimator. A plurality of global motion vectors are derived from the motion vectors of a previous picture in a sequence, and the global motion vectors are analysed to determine motion characteristics. The video encoding is arranged to enable switching among different types of local motion estimators based on the motion characteristics of the moving pictures sequence. This gives rise to a motion estimation algorithm that can adaptively change its search range, search area and block matching scheme to suit different types of moving sequences.

Patent•

A method and system for real time feature based motion analysis for key frame selection from a video

Atac sok Gozde Bozdagi¹, Robert Kamil Bryll¹•Institutions (1)

Xerox¹

14 Dec 1999

TL;DR: In this article, a method and system for real-time conversion of a dynamic video to a set of static image frames includes segmenting the video into a plurality of frames; significant parts of the frames are selected to comprise interest points.

Abstract: A method and system for real-time conversion of a dynamic video to a set of static image frames includes segmenting the video into a plurality of frames. Significant parts of the frames are selected to comprise interest points. An operator estimates a motion trajectory of the interest points for real-time computing of a global motion. Upon detection of global motion, key frames are selected from the set of static frames to represent the dynamic video. Interest points are identified as areas of high gradient, and their number is further reduced by imposing a grid on the image frame and limiting the interest points to one point per grid cell.
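
The grid-based thinning of interest points can be sketched in a few lines. The abstract only states that there is one point per grid cell; keeping the point with the highest gradient in each cell is an assumption made for this example, as are the function name `limit_to_grid` and the `(x, y, gradient)` tuple layout.

```python
def limit_to_grid(points, cell):
    """Keep at most one interest point per grid cell.

    points: list of (x, y, gradient) tuples with non-negative integer
            pixel coordinates (hypothetical layout).
    cell:   side length of a square grid cell in pixels.
    Within a cell, the point with the highest gradient is kept
    (an assumed tie-breaking rule; the first seen wins on ties).
    """
    best = {}
    for x, y, g in points:
        key = (x // cell, y // cell)  # grid cell containing the point
        if key not in best or g > best[key][2]:
            best[key] = (x, y, g)
    return sorted(best.values())


candidates = [(1, 1, 5), (2, 3, 9), (10, 1, 4), (9, 2, 4)]
kept = limit_to_grid(candidates, cell=8)
```

Bounding the number of tracked points this way caps the per-frame cost of the trajectory estimation, which is consistent with the real-time goal stated in the abstract.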

Proceedings Article•10.1109/ICIP.1999.821568•

Efficient motion estimation using spatial and temporal motion vector prediction

I.R. Ismaeil¹, A. Docef, Faouzi Kossentini, Rabab K. Ward•Institutions (1)

University of British Columbia¹

24 Oct 1999

TL;DR: Experimental results show that spatio-temporal prediction reduces the number of computations performed by the motion search algorithm by 30% for MPEG-2 encoding and by 40% for H.263 encoding.

Abstract: This paper presents a motion estimation technique for the coding of moving video sequences that is based on spatial and temporal prediction. The motion vector of a moving object is tracked using spatial and temporal prediction and used as a starting point for the motion estimation search algorithm. The predicted motion vector is selected from several candidate motion vectors according to the block matching criterion. Experimental results show that spatio-temporal prediction reduces the number of computations performed by the motion search algorithm by 30% for MPEG-2 encoding and by 40% for H.263 encoding.
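
Selecting a predicted motion vector from several candidates by a block matching criterion might look like the sketch below. It assumes the sum of absolute differences (SAD) as the criterion and frames stored as lists of rows; the candidate list would hold vectors from spatially neighbouring blocks and from the co-located block in the previous frame. The function names are hypothetical and the subsequent refinement search is not shown.

```python
def sad(cur, ref, bx, by, mv, n):
    """Sum of absolute differences between the n x n block at (bx, by)
    in `cur` and the block displaced by mv = (dx, dy) in `ref`.
    Returns None if the displaced block falls outside `ref`."""
    dx, dy = mv
    h, w = len(ref), len(ref[0])
    if bx + dx < 0 or by + dy < 0 or bx + dx + n > w or by + dy + n > h:
        return None
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def select_predictor(cur, ref, bx, by, candidates, n):
    """Return (best_mv, best_cost) among the zero vector and `candidates`.
    The winner would serve as the starting point of the search."""
    best_mv, best_cost = (0, 0), sad(cur, ref, bx, by, (0, 0), n)
    for mv in candidates:
        cost = sad(cur, ref, bx, by, mv, n)
        if cost is not None and cost < best_cost:
            best_mv, best_cost = mv, cost
    return best_mv, best_cost


# Toy frames: `cur` is `ref` shifted one pixel to the left,
# so the true motion vector is (1, 0).
ref = [[x * x + 3 * y for x in range(8)] for y in range(8)]
cur = [[ref[y][min(x + 1, 7)] for x in range(8)] for y in range(8)]
start = select_predictor(cur, ref, 2, 2, [(0, 1), (1, 0), (-1, 0)], 2)
```

A good starting point lets the search terminate after examining few positions, which is one way to read the reported reduction in computations.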

Patent•

Stereoscopic video image generating method

Toru Sugiyama, 徹杉山

4 Mar 1999

TL;DR: In this article, a relative motion vector is calculated by subtracting the representative motion vector from the motion vector of the two-dimensional video image and the depth information obtained by this estimation is used to generate a stereoscopic video signal.

Abstract: PROBLEM TO BE SOLVED: To properly obtain depth information of an object image with a simple means in the case of converting a two-dimensional video signal having a plurality of unknown parameters such as a photographing condition, a motion of each object image in a picture and a deformed object into a three-dimensional video signal. SOLUTION: A two-dimensional video signal is separated into a signal denoting a background area image and a signal denoting other area image. A representative motion vector of the background area image is calculated from a motion vector of the two-dimensional video image and a motion vector of the background area image, and a relative motion vector is calculated by subtracting the representative motion vector from the motion vector of the two-dimensional video image. Depth information of the video image of the two-dimensional video signal is estimated by using the relative motion vector and the depth information obtained by this estimation is used to generate a stereoscopic video signal.
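
The relative motion vector computation can be written out as a short sketch. The abstract does not say how the representative background vector is derived, so the component-wise mean of the background vectors is used here as an assumption; the function name `relative_motion` and the boolean background mask are likewise hypothetical, and the depth estimation step itself is not shown.

```python
def relative_motion(vectors, is_background):
    """Subtract a representative background motion vector from every
    motion vector of the image.

    vectors:       list of (dx, dy) motion vectors, one per block.
    is_background: list of booleans marking blocks in the background area.
    The representative vector is the component-wise mean of the
    background vectors (an assumption for this example).
    """
    bg = [v for v, b in zip(vectors, is_background) if b]
    if not bg:
        raise ValueError("no background blocks to derive a representative vector")
    rep = (sum(v[0] for v in bg) / len(bg), sum(v[1] for v in bg) / len(bg))
    relative = [(v[0] - rep[0], v[1] - rep[1]) for v in vectors]
    return rep, relative


rep, rel = relative_motion(
    [(2, 0), (2, 0), (5, 1), (2, 0)], [True, True, False, True]
)
```

Background blocks end up with near-zero relative motion, so the remaining vectors describe object motion with the camera motion removed.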
