TL;DR: Algorithms for an advanced robotic surgery system are proposed that compensate for the motion of the beating heart; this requires measuring heart motion, which can be achieved by tracking natural landmarks.
Abstract: Minimally invasive beating-heart surgery offers substantial benefits for the patient, compared to conventional open surgery. Nevertheless, the motion of the heart poses increased requirements to the surgeon. To support the surgeon, algorithms for an advanced robotic surgery system are proposed, which offer motion compensation of the beating heart. This implies the measurement of heart motion, which can be achieved by tracking natural landmarks. In most cases, the investigated affine tracking scheme can be reduced to an efficient block matching algorithm allowing for realtime tracking of multiple landmarks. Fourier analysis of the motion parameters shows two dominant peaks, which correspond to the heart and respiration rates of the patient. The robustness in case of disturbance or occlusion can be improved by specially developed prediction schemes. Local prediction is well suited for the detection of single tracking outliers. A global prediction scheme takes several landmarks into account simultaneously and is able to bridge longer disturbances. As the heart motion is strongly correlated with the patient's electrocardiogram and respiration pressure signal, this information is included in a novel robust multisensor prediction scheme. Prediction results are compared to those of an artificial neural network and of a linear prediction approach, which shows the superior performance of the proposed algorithms.
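The block matching step mentioned in this abstract can be illustrated with a small example. This is a minimal sketch, assuming a sum-of-absolute-differences (SAD) cost and an exhaustive search window; the function names, the SAD criterion, and the toy frames are illustrative assumptions and not the authors' implementation.

```python
def sad(frame, template, top, left):
    """Sum of absolute differences between template and the frame patch at (top, left)."""
    return sum(
        abs(frame[top + r][left + c] - template[r][c])
        for r in range(len(template))
        for c in range(len(template[0]))
    )


def block_match(frame, template, center, radius):
    """Exhaustively search a (2*radius+1)^2 window around center for the best match.

    Returns ((top, left), cost) of the lowest-SAD position inside the frame.
    """
    h, w = len(template), len(template[0])
    best_pos, best_cost = None, float("inf")
    for top in range(center[0] - radius, center[0] + radius + 1):
        for left in range(center[1] - radius, center[1] + radius + 1):
            # Skip candidates that would fall outside the frame.
            if top < 0 or left < 0 or top + h > len(frame) or left + w > len(frame[0]):
                continue
            cost = sad(frame, template, top, left)
            if cost < best_cost:
                best_pos, best_cost = (top, left), cost
    return best_pos, best_cost


# Toy example: a bright 2x2 landmark moves from (2, 2) to (3, 4).
prev = [[0] * 8 for _ in range(8)]
curr = [[0] * 8 for _ in range(8)]
for r in range(2):
    for c in range(2):
        prev[2 + r][2 + c] = 200
        curr[3 + r][4 + c] = 200

template = [row[2:4] for row in prev[2:4]]
position, cost = block_match(curr, template, center=(2, 2), radius=3)
print(position, cost)
```

Tracking several landmarks would amount to calling `block_match` once per template; the cost of each call grows with the square of `radius`, which is why a small search window matters for real-time use.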
TL;DR: The proposed method minimizes the fitting error between the input motion vectors and the motion vectors generated from the estimated motion model using the Newton-Raphson method with outlier rejections.
Abstract: Global motion estimation is a powerful tool widely used in video processing and compression as well as in computer vision areas. We propose a new approach for estimating global motions from coarsely sampled motion vector fields. The proposed method minimizes the fitting error between the input motion vectors and the motion vectors generated from the estimated motion model using the Newton-Raphson method with outlier rejections. Applications of the proposed method in video coding include fast global motion estimation for MPEG-4 Advanced Simple Profile coding, MPEG-2 to MPEG-4 ASP transcoding, and error concealments. Simulation results and analyses are provided for the proposed method and the applications, which show the effectiveness of the method in terms of accuracy, robustness, and speed.
TL;DR: In this article, the authors describe methods and integrated systems for camera motion analysis and moving object analysis and methods of extracting semantics mainly from camera motion parameters in videos and video segments without shot changes.
Abstract: Methods and integrated systems for camera motion analysis and moving object analysis and methods of extracting semantics mainly from camera motion parameters in videos and video segments without shot changes are described. Typical examples of such videos are a home video taken by a digital camera and a segment, or clip, of a professional video or film. The extracted semantics can be directly used in a number of video/image understanding and management applications, such as annotation, browsing, editing, frame enhancement, key-frame extraction, panorama generation, printing, retrieval, summarization. Automatic methods of detecting and tracking moving objects that do not rely on a priori knowledge of the objects are also described. The methods can be executed in real time.
TL;DR: In this paper, the authors propose a robust video stabilization method that produces full-frame stabilized videos with good visual quality by filling in missing image parts through local alignment of image data from neighboring frames.
Abstract: Video stabilization is an important video enhancement technology which aims at removing annoying shaky motion from videos. We propose a practical and robust approach of video stabilization that produces full-frame stabilized videos with good visual quality. While most previous methods end up with producing low resolution stabilized videos, our completion method can produce full-frame videos by naturally filling in missing image parts by locally aligning image data of neighboring frames. To achieve this, motion inpainting is proposed to enforce spatial and temporal consistency of the completion in both static and dynamic image areas. In addition, image quality in the stabilized video is enhanced with a new practical deblurring algorithm. Instead of estimating point spread functions, our method transfers and interpolates sharper image pixels of neighbouring frames to increase the sharpness of the frame. The proposed video completion and deblurring methods enabled us to develop a complete video stabilizer which can naturally keep the original image quality in the stabilized videos. The effectiveness of our method is confirmed by extensive experiments over a wide variety of videos.
TL;DR: A system for automatically extracting the region of interest (ROI) and controlling virtual cameras based on panoramic video is presented; it targets applications such as classroom lectures and video conferencing.
Abstract: We present a system for automatically extracting the region of interest (ROI) and controlling virtual cameras based on panoramic video. It targets applications such as classroom lectures and video conferencing. For capturing panoramic video, we use the FlyCam system that produces high resolution, wide-angle video by stitching video images from multiple stationary cameras. To generate conventional video, a region of interest can be cropped from the panoramic video. We propose methods for ROI detection, tracking, and virtual camera control that work in both the uncompressed and compressed domains. The ROI is located from motion and color information in the uncompressed domain and macroblock information in the compressed domain, and tracked using a Kalman filter. This results in virtual camera control that simulates human-controlled video recording. The system has no physical camera motion and the virtual camera parameters are readily available for video indexing.
TL;DR: In this article, the video encoding method includes determining one of inter predictive coding and intra predictive coding mode as a coding mode for each block in an input video frame, generating a predicted frame for the input video frame based on predicted blocks obtained according to the determined coding mode, and encoding the input video frame based on the predicted frame.
Abstract: Video coding and decoding methods and video encoder and decoder are provided. The video encoding method includes determining one of inter predictive coding and intra predictive coding mode as a coding mode for each block in an input video frame, generating a predicted frame for the input video frame based on predicted blocks obtained according to the determined coding mode, and encoding the input video frame based on the predicted frame. When the intra predictive coding mode is determined as the coding mode, an intra basis block composed of representative values of a block is generated for a block and the intra basis block is interpolated to generate an intra predicted block for the block.
TL;DR: An early termination algorithm is proposed that predicts the best motion vector by examining only one search point, so that some motion searches can be stopped early and a large number of search points can be skipped.
Abstract: The H.264 video coding standard provides considerably higher coding efficiency than previous standards do, whereas its complexity is significantly increased at the same time. In an H.264 encoder, the most time-consuming component is variable block-size motion estimation. To reduce the complexity of motion estimation, an early termination algorithm is proposed in this paper. It predicts the best motion vector by examining only one search point. With the proposed method, some of the motion searches can be stopped early, and then a large number of search points can be skipped. The proposed method can work with any fast motion estimation algorithm. Experiments are carried out with a fast motion estimation algorithm that has been adopted by H.264. Results show that significant complexity reduction is achieved while the degradation in video quality is negligible.
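The early termination idea can be sketched as follows. This is a minimal illustration, assuming a SAD cost, a fixed `threshold`, and a small full search as the fallback; the abstract does not state the actual termination criterion or search pattern, so all names and parameters here are hypothetical.

```python
def sad(cur, ref, block, mv, size):
    """SAD between the current block at `block` and the reference block displaced by `mv`."""
    (by, bx), (dy, dx) = block, mv
    return sum(
        abs(cur[by + r][bx + c] - ref[by + dy + r][bx + dx + c])
        for r in range(size)
        for c in range(size)
    )


def motion_search(cur, ref, block, predicted_mv, size=4, radius=2, threshold=16):
    """Return (mv, cost, points_examined) with early termination.

    The predicted vector is evaluated first; if its cost is below `threshold`
    (a hypothetical tuning parameter), the remaining search points are skipped.
    """
    cost = sad(cur, ref, block, predicted_mv, size)
    if cost < threshold:
        return predicted_mv, cost, 1  # early termination: one point examined

    best_mv, best_cost, examined = predicted_mv, cost, 1
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if (dy, dx) == predicted_mv:
                continue  # already evaluated above
            cand = sad(cur, ref, block, (dy, dx), size)
            examined += 1
            if cand < best_cost:
                best_mv, best_cost = (dy, dx), cand
    return best_mv, best_cost, examined


ref = [[r * r + 3 * c * c for c in range(12)] for r in range(12)]
# Current frame: the reference content shifted one pixel to the left,
# so the matching reference block lies at displacement (0, 1).
cur = [[ref[r][c + 1] if c + 1 < 12 else 0 for c in range(12)] for r in range(12)]

fast = motion_search(cur, ref, block=(4, 4), predicted_mv=(0, 1))
slow = motion_search(cur, ref, block=(4, 4), predicted_mv=(0, 0))
print(fast)
print(slow)
```

With a good prediction the search stops after one point; with a poor one it falls back to all 25 points of the window, which is the saving the abstract describes.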
TL;DR: A novel method is proposed to generate plausible video sequences after removing relatively large objects from the original videos; it applies motion layer segmentation and generates a set of synthesized layers from which the new video is rendered.
Abstract: This paper proposes a novel method to generate plausible video sequences after removing relatively large objects from the original videos. In order to maintain temporal coherence among the frames, a motion layer segmentation method is applied. Then, a set of synthesized layers is generated by applying motion compensation and a region completion algorithm. Finally, a new video, in which the selected object is removed, is plausibly rendered given the synthesized layers and the motion parameters. A number of example videos are shown in the results to demonstrate the effectiveness of our method.
TL;DR: A novel system-level error tolerance approach specifically targeted for multimedia compression algorithms, focusing on the motion estimation process performed by most video encoders, is proposed.
Abstract: We propose a novel system-level error tolerance approach specifically targeted for multimedia compression algorithms. In particular, we focus on the motion estimation process performed by most video encoders. While current manufacturing process classifies fabricated systems into two classes, namely, perfect and imperfect, our proposed scheme employs categories which are based on acceptable/unacceptable performance degradation. By enabling the use of systems that would otherwise have been discarded, we seek to increase the overall yield rate in the system fabrication process. To achieve this, we propose testing algorithms that aim at determining if faults in a given chip produce acceptable performance degradation, and we propose a technique which can cancel the effect of those among the acceptable faults that can be compensated.
TL;DR: In this paper, an Encoder Assisted Frame Rate Up Conversion (EA-FRUC) system that utilizes video coding and pre-processing operations at the video encoder to exploit the FRUC processing that will occur in the decoder in order to improve compression efficiency and reconstructed video quality is disclosed.
Abstract: An Encoder Assisted Frame Rate Up Conversion (EA-FRUC) system that utilizes video coding and pre-processing operations at the video encoder to exploit the FRUC processing that will occur in the decoder in order to improve compression efficiency and reconstructed video quality is disclosed. One operation of the EA-FRUC system involves determining whether to encode a frame in a sequence of frames of a video content by determining a spatial activity in a frame of the sequence of frames; determining a temporal activity in the frame; determining a spatio-temporal activity in the frame based on the determined spatial activity and the determined temporal activity; determining a level of a redundancy in the source frame based on at least one of the determined spatial activity, the determined temporal activity, and the determined spatio-temporal activity; and, encoding the non-redundant information in the frame if the determined redundancy is within predetermined thresholds.
TL;DR: An encoder is described comprising an input for a video signal to be encoded into pictures of at least a first and a second coded video sequence, a hypothetical decoder, an encoded picture buffer, a decoded picture buffer, and a definer for a parameter indicating the temporal difference between the last picture of the first sequence and the first picture of the second sequence in output/display order.
Abstract: An encoder comprising an input for inputting video signal to be encoded to form an encoded video signal comprising pictures of at least a first coded video sequence and a second coded video sequence, a hypothetical decoder for hypothetically decoding encoded video signal, an encoded picture buffer, and a decoded picture buffer, and a definer for defining a parameter indicative of the temporal difference between the last picture of the first coded video sequence and the first picture of the second coded video sequence in output/display order.
TL;DR: A first representation of a video stream is received that expresses the video frames at a relatively high pixel resolution; a second representation at a relatively low pixel resolution is provided to a video playing device, together with additional information that represents at least a portion of a detected region of interest at a resolution higher than the relatively low pixel resolution.
Abstract: A first representation of a video stream is received that includes video frames, the representation expressing the video frames at a relatively high pixel resolution. At least one of the video frames is detected to include a region of interest. A second representation of the video stream that expresses the video frames at a relatively low pixel resolution is provided to a video playing device. Included with the second representation is additional information that represents at least a portion of the region of interest at a resolution level that is higher than the relatively low pixel resolution.
TL;DR: A new motion compensation design is presented to overcome the large calculation time of the complicated motion vector prediction (MVP) algorithm and high motion resolution in H.264/AVC.
Abstract: Motion compensation is always the main bottleneck in real-time or high quality video applications; thus, fast and efficient motion compensation is necessary. A new motion compensation design is presented to overcome the large calculation time of the complicated motion vector prediction (MVP) algorithm and high motion resolution in H.264/AVC. By applying 4×4 block based parallelism and context switch buffers in our design, it can efficiently reduce memory access and increase data reuse probability such that real-time decoding can be achieved with 1080HD (1920×1088) at 100 MHz and search range [-64, +63.75].
TL;DR: The algorithm proposed in this paper embeds a watermark in each video object by imposing a particular relationship between some predefined pairs of quantized discrete cosine transform coefficients in the luminance blocks of pseudo-randomly selected macroblocks (MBs).
Abstract: The recent finalization of MPEG-4 will make this standard very attractive for a large range of applications such as video editing, Internet video distribution, wireless video communications. Some of these applications are likely to get great benefit from watermarking technology, since it can enable a number of innovative services, such as conditional access policies, data annotation, data labeling, content authentication, to be implemented at a low price. One of the key points of the MPEG-4 standard is the possibility to access and manipulate objects within a video sequence. Thus object watermarking has to be achieved in such a way that, while a video object is transferred from a sequence to another, it is still possible to correctly access the data embedded within the object itself. The algorithm proposed in this paper embeds a watermark in each video object by imposing a particular relationship between some predefined pairs of quantized discrete cosine transform (DCT) coefficients in the luminance blocks of pseudo-randomly selected macroblocks (MBs). Watermarks are equally embedded into intra and inter MBs. Experimental results are presented validating the effectiveness of the proposed approach.
TL;DR: In this article, the authors present a method for generating a video sequence from a plurality of video segments, identifying an inability to output at least one video segment in the video sequence in substantially real-time, and adjusting an output level associated with the video segment.
Abstract: Systems and methods for previewing edited video. In general, in one implementation, a method includes generating a video sequence from a plurality of video segments, identifying an inability to output at least one video segment in the video sequence in substantially real time; and adjusting an output level associated with the at least one video segment to enable the at least one video segment to be output in substantially real time. The output level may include a video quality or a frame rate.
TL;DR: A de-interlacing algorithm using an adaptive 4-field global/local motion compensated approach is presented; its peak signal-to-noise ratio is reported to be 2 to 3 dB higher than that of previous studies, and it attains the best subjective quality.
Abstract: A de-interlacing algorithm using an adaptive 4-field global/local motion compensated approach is presented. It consists of block-based directional edge interpolation, same-parity 4-field motion detection, and global/local motion estimation and compensation. The edges are sharper when the directional edge interpolation is adopted. The same-parity 4-field motion detection and the 4-field local motion estimation detect the static areas and fast motion using four reference fields, and the global motion estimation detects the camera panning and zooming motions. The global and local motion compensation recover the interlaced videos to progressive ones. Experimental results show that the peak signal-to-noise ratio of our proposed algorithm is 2 to 3 dB higher than that of previous studies and that it attains the best subjective quality.
TL;DR: This paper investigates the use of temporal oversampling to improve the accuracy of optical flow estimation (OFE) and demonstrates significant improvements in OFE accuracy both on synthetically generated video sequences and on a real video sequence captured using an experimental high-speed imaging system.
Abstract: Recent advances in imaging sensor technology make high frame-rate video capture practical. As demonstrated in previous work, this capability can be used to enhance the performance of many image and video processing applications. The idea is to use the high frame-rate capability to temporally oversample the scene and, thus, to obtain more accurate information about scene motion and illumination. This information is then used to improve the performance of image and standard frame-rate video applications. This paper investigates the use of temporal oversampling to improve the accuracy of optical flow estimation (OFE). A method for obtaining high accuracy optical flow estimates at a conventional standard frame rate, e.g., 30 frames/s, by first capturing and processing a high frame-rate version of the video is presented. The method uses the Lucas-Kanade algorithm to obtain optical flow estimates at a high frame rate, which are then accumulated and refined to estimate the optical flow at the desired standard frame rate. The method demonstrates significant improvements in OFE accuracy both on synthetically generated video sequences and on a real video sequence captured using an experimental high-speed imaging system. It is then shown that a key benefit of using temporal oversampling to estimate optical flow is the reduction in motion aliasing. Using sinusoidal input sequences, the reduction in motion aliasing is identified and the desired minimum sampling rate as a function of the velocity and spatial bandwidth of the scene is determined. Using both synthetic and real video sequences, it is shown that temporal oversampling improves OFE accuracy by reducing motion aliasing not only for areas with large displacements but also for areas with small displacements and high spatial frequencies. The use of other OFE algorithms with temporally oversampled video is then discussed. In particular, the Haussecker algorithm is extended to work with high frame-rate sequences. 
This extension demonstrates yet another important benefit of temporal oversampling, which is improving OFE accuracy when brightness varies with time.
TL;DR: In this paper, a method and apparatus for supporting scalability for motion vectors in scalable video coding are presented; the apparatus includes a motion estimation module that searches for a variable block size and a motion vector minimizing a cost function for each layer according to a predetermined pixel accuracy, and a sampling module that upsamples an original frame when the pixel accuracy is less than a pixel size.
Abstract: A method and apparatus for supporting scalability for motion vectors in scalable video coding are provided. The motion estimation apparatus (120) includes a motion estimation module (121) searching for a variable block size and a motion vector that minimize a cost function for each layer according to predetermined pixel accuracy, a sampling module upsampling an original frame when the pixel accuracy is less than a pixel size, and before searching for a motion vector in a layer having a lower resolution than the original frame downsampling the original frame into the low resolution, a motion residual module calculating a residual between motion vectors found in the respective layers, and a rearrangement module rearranging the residuals between the found motion vectors and the found variable block size information using significance obtained from a searched lower layer. Accordingly, true motion scalability can be achieved to improve adaptability to changing network circumstances.
TL;DR: A content-adaptive video preview system lets a user go through a video faster than existing video skimming techniques by interactively adapting the speed of browsing and the abstraction level of presentation.
Abstract: A content-adaptive video preview system (100) allows a user to go through a video faster than with existing video skimming techniques. Thereby, a user can interactively adapt (S1) the speed of browsing and/or the abstraction level of presentation. According to one embodiment of the invention, this adaptation procedure (S1) is realized by the following steps: First, differences between precalculated spatial color histograms associated with chronologically subsequent pairs of the video frames of which said video file is composed are calculated (S1 a). Then, these differences and/or a cumulative difference value representing the sum of these differences are compared (S1 b) to a predefined redundancy threshold (S(t)). In case differences in the color histograms of particular video frames (302 a-c) and/or said cumulative difference value exceed this redundancy threshold (S(t)), these video frames are selected (S1 c) for the preview. Intermediate video frames (304 a-d) are removed and/or inserted (S1 d) between each pair of selected chronologically subsequent video frames depending on the selected abstraction level of presentation. Thereby, said redundancy threshold value (S(t)) can be adapted (S1 b′) for changing the speed of browsing and/or the abstraction level of presentation.
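The threshold-based frame selection in this abstract can be sketched in a few lines. This is a simplified illustration under stated assumptions: a coarse intensity histogram stands in for the spatial color histograms, the L1 distance is used as the difference, and the accumulator resets after each selected frame; none of these details are confirmed by the abstract, and all names are hypothetical.

```python
def histogram(frame, bins=4, max_value=256):
    """Coarse intensity histogram of a frame given as a flat list of pixel values."""
    counts = [0] * bins
    for value in frame:
        counts[value * bins // max_value] += 1
    return counts


def select_preview_frames(frames, threshold):
    """Pick frame indices whose accumulated histogram change exceeds `threshold`.

    The first frame is always kept. The accumulator resets after each selection,
    so a lower threshold keeps more frames (slower, more detailed preview).
    """
    selected = [0]
    accumulated = 0
    previous = histogram(frames[0])
    for index in range(1, len(frames)):
        current = histogram(frames[index])
        # L1 distance between histograms of consecutive frames.
        accumulated += sum(abs(a - b) for a, b in zip(previous, current))
        if accumulated > threshold:
            selected.append(index)
            accumulated = 0
        previous = current
    return selected


# Toy video: 8-pixel grayscale frames with a hard cut at frame 3 and a slow drift after.
frames = [
    [10] * 8,
    [10] * 8,
    [10] * 8,
    [200] * 8,                 # scene change
    [200] * 7 + [100],
    [200] * 6 + [100] * 2,
    [200] * 5 + [100] * 3,
]
coarse = select_preview_frames(frames, threshold=4)
fine = select_preview_frames(frames, threshold=1)
print(coarse)
print(fine)
```

Raising `threshold` plays the role of the adaptable redundancy threshold: fewer frames survive, so the preview runs faster and at a higher abstraction level.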
TL;DR: In this paper, a method for creating an interpolated video frame using a current video frame, and a plurality of previous video frames is presented, which includes creating a set of extrapolated motion vectors from at least one reference video frame in the plurality of preceding video frames.
Abstract: A method for creating an interpolated video frame using a current video frame, and a plurality of previous video frames. The method includes creating a set of extrapolated motion vectors from at least one reference video frame in the plurality of previous video frames; performing an adaptive motion estimation using the extrapolated motion vectors and a content type of each extrapolated motion vector; deciding on a motion compensated interpolation mode; and, creating a set of motion compensated motion vectors based on the motion compensated interpolation mode decision. An apparatus for performing the method is also disclosed.
TL;DR: In this paper, the temporal and/or spatial characteristics of a macroblock are analyzed in order to reduce the number of modes for which motion estimation and rate distortion efficiency calculations are to be performed.
Abstract: The temporal and/or spatial characteristics of a macroblock are analyzed in order to reduce the number of modes for which motion estimation and rate distortion efficiency calculations are to be performed. In one embodiment, macroblock mean and variance characteristics are analyzed to merge sub-blocks together within the macroblock. These merged sub-blocks may be used to identify both inter and intra modes for the macroblock.
TL;DR: A video decoder, a video decoding method, a video encoder and a video encoding method are described; the decoder includes a motion vector resolution reducer (999) and a motion compensator (960) for decoding a video bitstream for an image block.
Abstract: A video decoder, a video decoding method, a video encoder and a video encoding method are disclosed. A video decoder for decoding a video bitstream for an image block includes a motion vector resolution reducer (999) and a motion compensator (960). The motion vector resolution reducer is for receiving decoded high resolution motion vectors included in the video bitstream and for reducing an accuracy of the high resolution motion vectors to correspond to a low resolution. The motion compensator, in signal communication with the motion vector resolution reducer, is for forming a motion compensated high resolution prediction using the reduced accuracy motion vectors. The video encoder for encoding scalable video comprises a motion compensator (1190) for forming a motion compensated full resolution prediction and combining (1105) the motion compensated full resolution prediction from an image block to form a prediction residual. The prediction residual is downsampled (1112) to form a low resolution downsampled prediction residual and then coded (1115).
TL;DR: In this paper, a method and apparatus for image stabilization takes an input image sequence including a plurality of frames, estimates frame-level motion vectors for each frame, and adaptively integrates them to produce, for each frame, a motion vector to be used for image stabilization.
Abstract: A method and apparatus for image stabilization takes an input image sequence including a plurality of frames (23), estimates frame-level motion vectors for each frame, and adaptively integrates the motion vectors to produce, for each frame, a motion vector to be used for image stabilization (7). A copy of the reference image of a frame is displaced by the corresponding adaptively integrated motion vector (21). In one embodiment, the perimeter of the image sensor is padded with a margin to be used for image compensation. In another embodiment, vertical and horizontal components are treated independently (7). In still another embodiment, the motion estimation circuitry associated with an MPEG-4 encoder is used to calculate macroblock level vectors (11), and a histogram is used to compute a corresponding frame-level vector for that frame (7).
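The histogram and integration steps can be illustrated for a single axis, in line with the embodiment that treats vertical and horizontal components independently. This is a sketch under assumptions: the frame-level value is taken as the histogram peak, and a leaky integrator with a fixed `damping` factor stands in for the adaptive integration, whose actual rule the abstract does not give. All names are hypothetical.

```python
from collections import Counter


def frame_level_component(macroblock_components):
    """Frame-level motion for one axis: the most frequent macroblock value (histogram peak)."""
    histogram = Counter(macroblock_components)
    # Prefer the highest count; break ties toward the smaller magnitude.
    return min(histogram, key=lambda v: (-histogram[v], abs(v)))


def integrate(frame_vectors, damping=0.9):
    """Leaky integration of per-frame motion into an accumulated correction per frame.

    `damping` < 1 is an assumed stand-in for the adaptive integration: it lets the
    accumulated offset decay so that intentional pans are not cancelled forever.
    """
    accumulated, out = 0.0, []
    for v in frame_vectors:
        accumulated = damping * accumulated + v
        out.append(accumulated)
    return out


# Horizontal macroblock vector components for three frames (toy data).
macroblocks = [
    [2, 2, 2, -5, 2, 0],   # mostly +2, with one outlier from a moving object
    [-1, -1, 3, -1],
    [0, 0, 0, 4],
]
per_frame = [frame_level_component(mb) for mb in macroblocks]
correction = integrate(per_frame, damping=0.5)
print(per_frame)
print(correction)
```

Using the histogram peak instead of the mean keeps a few macroblocks on a moving foreground object from biasing the camera motion estimate.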
TL;DR: In this paper, a method and an apparatus for elevating compression efficiency of a motion vector by effectively predicting the motion vector of an enhanced layer by means of the motion vectors of a base layer in a video coding method employing a multi-layer structure are disclosed.
Abstract: A method and an apparatus for elevating compression efficiency of a motion vector by effectively predicting a motion vector of an enhanced layer by means of a motion vector of a base layer in a video coding method employing a multi-layer structure are disclosed. A motion vector compression apparatus includes: a down-sampling module for down-sampling an original frame to have a size of a frame in each layer; a motion vector search module for obtaining a motion vector in which an error or a cost function is minimized with respect to the down-sampled frame; a reference vector generation module for generating a reference motion vector in a predetermined enhanced layer by means of a block of a lower layer corresponding to a predetermined block in the predetermined enhanced layer, and motion vectors in blocks around the block; and a motion difference module for calculating a difference between the obtained motion vector and the reference motion vector.
TL;DR: In this paper, a method and apparatus of improving the compression efficiency of a motion vector by efficiently predicting the motion vector in an enhancement layer from the motion vectors in a base layer in a video coding method using a multi-layer are provided.
Abstract: A method and apparatus of improving the compression efficiency of a motion vector by efficiently predicting a motion vector in an enhancement layer from a motion vector in a base layer in a video coding method using a multi-layer are provided. The method includes obtaining a motion vector in a base layer frame having a first frame rate from an input frame, obtaining a motion vector in a first enhancement layer frame having a second frame rate from the input frame, the second frame rate being greater than the first frame rate, generating a predicted motion vector by referring to a motion vector for at least one frame among base layer frames present immediately before and after the same temporal position as the first enhancement layer frame if there is no base layer frame at the same temporal position as the first enhancement layer frame, and coding a difference between the motion vector in the first enhancement layer frame and the generated predicted motion vector, and the obtained motion vector in the base layer.
TL;DR: The improved algorithm avoids the data dependences while keeping the same coding performance as JM9.0, and the proposed architecture meets the real-time requirement for 720×576 pictures at 30 fps with a search range of 65×65.
Abstract: In the advanced video coding standard (AVC), motion estimation adopts many new features such as variable block size searching, multiple reference frames, and motion vector prediction to enhance the coding performance. However, the high data dependence and high computation requirement of these new features make the hardware implementation very complex, especially for real-time applications. Therefore, based on the reference software JM9.0, this paper first improves the motion estimation algorithm from a hardware-oriented viewpoint and then proposes a systolic architecture for the improved algorithm. It adopts 2-D systolic arrays, fully supports AVC's variable block size matching, and can produce 41 motion vectors for one macroblock. Experimental results show that the improved algorithm avoids the data dependences while keeping the same coding performance as JM9.0, and that the proposed architecture meets the real-time requirement for 720×576 pictures at 30 fps with a search range of 65×65.
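The figure of 41 motion vectors per macroblock follows from the seven H.264/AVC partition shapes: 1 (16×16) + 2 (16×8) + 2 (8×16) + 4 (8×8) + 8 (8×4) + 8 (4×8) + 16 (4×4) = 41. The sketch below shows one common way to obtain a cost for all 41 blocks at a given displacement, by computing the sixteen 4×4 SADs once and summing them into larger blocks. Whether the proposed systolic architecture reuses SADs in this way is an assumption; the abstract does not say, and the function names are hypothetical.

```python
# H.264/AVC partition shapes as (height, width) in pixels.
PARTITIONS = [(16, 16), (16, 8), (8, 16), (8, 8), (8, 4), (4, 8), (4, 4)]


def sads_4x4(cur, ref):
    """4x4 grid of SADs, one per 4x4 sub-block of a 16x16 macroblock."""
    return [
        [
            sum(
                abs(cur[4 * i + r][4 * j + c] - ref[4 * i + r][4 * j + c])
                for r in range(4)
                for c in range(4)
            )
            for j in range(4)
        ]
        for i in range(4)
    ]


def all_partition_sads(grid):
    """Sum the 4x4 SADs into SADs for every block of every partition shape.

    Returns {(height, width): [sad, ...]} for one candidate displacement.
    """
    result = {}
    for h, w in PARTITIONS:
        bh, bw = h // 4, w // 4  # block size measured in 4x4 units
        result[(h, w)] = [
            sum(grid[i + di][j + dj] for di in range(bh) for dj in range(bw))
            for i in range(0, 4, bh)
            for j in range(0, 4, bw)
        ]
    return result


# Toy macroblock and co-located reference block.
cur = [[(3 * r + 5 * c) % 17 for c in range(16)] for r in range(16)]
ref = [[(3 * r + 5 * c + 1) % 17 for c in range(16)] for r in range(16)]
sads = all_partition_sads(sads_4x4(cur, ref))
total_blocks = sum(len(values) for values in sads.values())
print({shape: len(values) for shape, values in sads.items()})
print(total_blocks)
```

Running this over every displacement in the search range and keeping the minimum per block would yield one motion vector for each of the 41 blocks.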
TL;DR: In this paper, a method for processing a plurality of motion vectors is disclosed, which includes determining a number of different block sizes in the video frame; and, performing a variable block size motion vector process if the number of varying block sizes is greater than one.
Abstract: A method for processing a plurality of motion vectors is disclosed. The method includes determining a number of different block sizes in the video frame; and, performing a variable block size motion vector process if the number of different block sizes in the video frame is greater than one, the variable block size motion vector process comprising constructing a pyramid of motion vectors from the plurality of motion vectors, the pyramid having at least a first layer and a second layer of motion vectors, each of the first and second layers having a set of motion vectors based on a particular block size. An apparatus for performing the inventive method is also disclosed.
TL;DR: A new algorithm to find video clips with different temporal durations and some spatial variations by adopting a longest common sub-sequence (LCS) matching technique for measuring the temporal similarity between video clips is proposed.
Abstract: In this paper, we propose a new algorithm to find video clips with different temporal durations and some spatial variations. We adopt a longest common sub-sequence (LCS) matching technique for measuring the temporal similarity between video clips. Based on this measure, we propose three techniques to improve the retrieval effectiveness. First, we use a few coefficients in the low-frequency region of the DCT block as the basis to represent spatial features. Second, we heuristically determine a suitable quantization step size for visual features to better tolerate spatial variations of similar video clips and propose a paired quantizer method. Third, we incorporate the compactness and/or continuity of matched common sub-sequences in the LCS measure to better reflect the temporal characteristics of video. The proposed algorithm shows an improvement of 63.5% in MAP (mean average precision) compared to an existing algorithm. The results show that our approach is effective for news video retrieval.
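As a rough illustration of LCS-based temporal similarity, the sketch below quantizes per-frame feature vectors with a uniform step and computes the LCS length over the resulting symbol sequences. The uniform quantizer, the normalization by the shorter clip, and the names `quantize` and `lcs_similarity` are assumptions. The paired quantizer and the compactness/continuity terms mentioned in the abstract are not modelled here.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def quantize(features, step):
    """Map each frame's feature vector to a tuple of quantization bins."""
    return [tuple(int(v // step) for v in frame) for frame in features]


def lcs_similarity(clip_a, clip_b, step):
    """LCS length of the quantized clips, normalized by the shorter clip.

    The normalization is an illustrative choice, not taken from the paper.
    """
    qa, qb = quantize(clip_a, step), quantize(clip_b, step)
    if not qa or not qb:
        return 0.0
    return lcs_length(qa, qb) / min(len(qa), len(qb))
```

A coarser `step` makes frames with small spatial differences fall into the same bins, which is the tolerance effect the abstract attributes to the choice of quantization step size.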
TL;DR: A new motion complexity measure is defined to express the complexity of motion contents in a video frame, and a new H.264 rate control scheme with the MC measure and perceptual bit allocation is proposed for medical video compression.
Abstract: This paper aims at applying H.264 in medical video compression applications and improving the H.264 rate control algorithm with better perceptual quality. First, H.264 is briefly reviewed and introduced to the area of medical video compression. Second, a new motion complexity (MC) measure is defined to express the complexity of motion contents in a video frame, and a new H.264 rate control scheme with the MC measure and perceptual bit allocation is proposed for medical video compression. Third, two sets of experiments are conducted: the comparison between MPEG-4 and H.264, and the comparison between JVT-H014, which is the rate control algorithm adopted by H.264, and our proposed rate control scheme. The first set of experiments shows that compared with MPEG-4, H.264 can achieve a significant average peak signal-to-noise ratio (PSNR) gain of up to 4.35 dB for the test medical video sequences, and thus is much more effective when applied in medical video compression. The second set of experiments shows that compared with H014, the proposed rate control scheme can achieve better perceptual video quality, with an average PSNR gain of up to 0.19 dB for the test medical video sequences.
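For reference, the PSNR figures quoted above are conventionally computed from the mean squared error between the original and the decoded frame. The sketch below assumes 8-bit samples (peak value 255) and frames given as lists of rows. The function name `psnr` is illustrative, and the paper's own MC measure and rate control scheme are not reproduced here.

```python
import math


def psnr(original, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    squared_error = 0
    count = 0
    for row_o, row_d in zip(original, decoded):
        for o, d in zip(row_o, row_d):
            squared_error += (o - d) ** 2
            count += 1
    if squared_error == 0:
        # Identical frames: PSNR is unbounded.
        return math.inf
    mse = squared_error / count
    return 10 * math.log10(peak ** 2 / mse)
```

Because the scale is logarithmic, a 4.35 dB gain corresponds to a much larger reduction in mean squared error than a 0.19 dB gain.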
TL;DR: Experimental results reveal that the A-TDB successfully adopts the search patterns to remove the temporal redundancy of sequences with slow, moderate and rapid motion content.
Abstract: Content with rapid, moderate, and slow motion is frequently mixed together in real video sequences. Until now, no fast block-matching algorithm (FBMA), including the well-known three-step search (TSS), the block-based gradient descent search (BBGDS), and the diamond search (DS), can efficiently remove the temporal redundancy of sequences with wide range motion content. This paper proposes an adaptive FBMA, called A-TDB, to solve this problem. Based on the characteristics of a proposed predicted profit list, the A-TDB can adaptively switch search patterns among the TSS, DS, and BBGDS, according to the motion content. Experimental results reveal that the A-TDB successfully adopts the search patterns to remove the temporal redundancy of sequences with slow, moderate and rapid motion content.
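To make one of the named search patterns concrete, here is a minimal sketch of a three-step search (TSS) for a single block, using the sum of absolute differences as the matching cost. Frames are plain lists of rows, and the block size, initial step, and function names are assumptions. The adaptive switching among TSS, DS, and BBGDS and the predicted profit list of A-TDB are not shown.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block at (bx, by) in cur and the block
    displaced by (dx, dy) in ref."""
    return sum(
        abs(cur[by + r][bx + c] - ref[by + dy + r][bx + dx + c])
        for r in range(n)
        for c in range(n)
    )


def three_step_search(cur, ref, bx, by, n=8, step=4):
    """Return ((dx, dy), cost) found by a three-step search."""
    height, width = len(ref), len(ref[0])
    best = (0, 0)
    best_cost = sad(cur, ref, bx, by, 0, 0, n)
    while step >= 1:
        cx, cy = best  # centre of this step's 3x3 candidate pattern
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                mx, my = cx + dx, cy + dy
                # Skip candidates whose block leaves the reference frame.
                if not (0 <= bx + mx <= width - n and 0 <= by + my <= height - n):
                    continue
                cost = sad(cur, ref, bx, by, mx, my, n)
                if cost < best_cost:
                    best, best_cost = (mx, my), cost
        step //= 2  # halve the step: 4, 2, 1
    return best, best_cost
```

With an initial step of 4 the search covers displacements of up to ±7 pixels while evaluating at most 25 distinct candidates, which suits large motion. Like any such fast search, it can settle on a local minimum rather than the true best match.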