About: Adaptive predictive coding is a research topic. Over its lifetime, 116 publications have appeared within this topic, receiving 7863 citations.
TL;DR: This paper gives an exposition of linear prediction in the analysis of discrete signals, in which the signal is modeled as a linear combination of its past values and present and past values of a hypothetical input to a system whose output is the given signal.
Abstract: This paper gives an exposition of linear prediction in the analysis of discrete signals. The signal is modeled as a linear combination of its past values and present and past values of a hypothetical input to a system whose output is the given signal. In the frequency domain, this is equivalent to modeling the signal spectrum by a pole-zero spectrum. The major part of the paper is devoted to all-pole models. The model parameters are obtained by a least squares analysis in the time domain. Two methods result, depending on whether the signal is assumed to be stationary or nonstationary. The same results are then derived in the frequency domain. The resulting spectral matching formulation allows for the modeling of selected portions of a spectrum, for arbitrary spectral shaping in the frequency domain, and for the modeling of continuous as well as discrete spectra. This also leads to a discussion of the advantages and disadvantages of the least squares error criterion. A spectral interpretation is given to the normalized minimum prediction error. Applications of the normalized error are given, including the determination of an "optimal" number of poles. The use of linear prediction in data compression is reviewed. For purposes of transmission, particular attention is given to the quantization and encoding of the reflection (or partial correlation) coefficients. Finally, a brief introduction to pole-zero modeling is given.
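The least squares all-pole analysis described above can be sketched for the stationary case, where the normal equations depend only on the signal autocorrelation and can be solved with the Levinson-Durbin recursion. The following is a minimal illustration, not the paper's own code: the function names, the biased autocorrelation estimate, and the sign convention for the reflection coefficients are assumptions made for this sketch.

```python
def autocorrelation(signal, max_lag):
    """Biased autocorrelation estimates R(0)..R(max_lag) (assumed estimator)."""
    n = len(signal)
    return [sum(signal[t] * signal[t + lag] for t in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the all-pole normal equations by the Levinson-Durbin recursion.

    Returns (a, k, normalized_error), where the prediction of sample n is
    sum(a[i] * s[n - 1 - i] for i in range(order)), k holds the reflection
    (partial correlation) coefficients under this sketch's sign convention,
    and normalized_error is the minimum prediction error divided by R(0).
    """
    a = []          # predictor coefficients for the current order
    k = []          # reflection coefficients, one per order step
    error = r[0]    # zeroth-order prediction error energy
    for m in range(order):
        # Correlation left unexplained by the order-m predictor.
        acc = r[m + 1] - sum(a[i] * r[m - i] for i in range(m))
        km = acc / error
        # Order update: combine the old coefficients with their reversal.
        a = [a[i] - km * a[m - 1 - i] for i in range(m)] + [km]
        k.append(km)
        error *= 1.0 - km * km
    return a, k, error / r[0]
```

For the autocorrelation sequence `[1.0, 0.5, 0.25]` and order 2, the recursion returns predictor coefficients `[0.5, 0.0]` and a normalized error of 0.75. The normalized error is the quantity the abstract proposes for choosing the number of poles: raising the order stops paying off once it no longer decreases appreciably.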
TL;DR: Application of this method for efficient transmission and storage of speech signals, as well as procedures for determining other speech characteristics, such as formant frequencies and bandwidths, the spectral envelope, and the autocorrelation function, are discussed.
Abstract: A method of representing the speech signal by time‐varying parameters relating to the shape of the vocal tract and the glottal‐excitation function is described. The speech signal is first analyzed and then synthesized by representing it as the output of a discrete linear time‐varying filter, which is excited by a suitable combination of a quasiperiodic pulse train and white noise. The output of the linear filter at any sampling instant is a linear combination of the past output samples and the input. The optimum linear combination is obtained by minimizing the mean‐squared error between the actual values of the speech samples and their predicted values based on a fixed number of preceding samples. A 10th‐order linear predictor was found to represent the speech signal band‐limited to 5 kHz with sufficient accuracy. The 10 coefficients of the predictor are shown to determine both the frequencies and bandwidths of the formants. Two parameters relating to the glottal‐excitation function and the pitch period are determined from the prediction error signal. Speech samples synthesized by this method will be demonstrated.
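The link between predictor coefficients and formants can be illustrated on a single resonance. Each complex pole pair of the all-pole filter sits at radius r and angle θ in the z-plane, which corresponds to a centre frequency θ·fs/(2π) and a bandwidth of −(fs/π)·ln r. The sketch below handles only a second-order predictor so that the poles follow from the quadratic formula; a 10th-order predictor would need a numerical root finder for its degree-10 polynomial. The function names and the 10 kHz sampling rate (the Nyquist rate for a 5 kHz band) are assumptions of this example.

```python
import cmath
import math


def resonance_to_predictor(freq_hz, bandwidth_hz, fs_hz):
    """Second-order predictor (a1, a2) for one resonance.

    The predictor is s[n] ~ a1 * s[n-1] + a2 * s[n-2].
    """
    radius = math.exp(-math.pi * bandwidth_hz / fs_hz)
    angle = 2.0 * math.pi * freq_hz / fs_hz
    return 2.0 * radius * math.cos(angle), -radius * radius


def predictor_to_resonance(a1, a2, fs_hz):
    """Recover (frequency, bandwidth) in Hz from a second-order predictor.

    Returns None when the poles are real, i.e. there is no resonance.
    """
    discriminant = a1 * a1 + 4.0 * a2
    if discriminant >= 0.0:
        return None
    # Poles are the roots of z^2 - a1*z - a2 = 0; take the upper-half-plane one.
    pole = (a1 + cmath.sqrt(discriminant)) / 2.0
    freq_hz = cmath.phase(pole) * fs_hz / (2.0 * math.pi)
    bandwidth_hz = -math.log(abs(pole)) * fs_hz / math.pi
    return freq_hz, bandwidth_hz


# Hypothetical example values: a 500 Hz resonance, 60 Hz wide, at fs = 10 kHz.
a1, a2 = resonance_to_predictor(500.0, 60.0, 10000.0)
formant = predictor_to_resonance(a1, a2, 10000.0)
```

Applied to every complex pole pair of a higher-order predictor, the same two formulas give one frequency and one bandwidth per candidate formant.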
TL;DR: Improved speech quality is obtained by efficient removal of formant and pitch-related redundant structure of speech before quantizing, and by effective masking of the quantizer noise by the speech signal.
Abstract: Predictive coding methods attempt to minimize the rms error in the coded signal. However, the human ear does not perceive signal distortion on the basis of rms error, regardless of its spectral shape relative to the signal spectrum. In designing a coder for speech signals, it is necessary to consider the spectrum of the quantization noise and its relation to the speech spectrum. The theory of auditory masking suggests that noise in the formant regions would be partially or totally masked by the speech signal. Thus, a large part of the perceived noise in a coder comes from frequency regions where the signal level is low. In this paper, methods for reducing the subjective distortion in predictive coders for speech signals are described and evaluated. Improved speech quality is obtained: 1) by efficient removal of formant and pitch-related redundant structure of speech before quantizing, and 2) by effective masking of the quantizer noise by the speech signal.
TL;DR: Preliminary studies suggest that the binary difference signal and the predictor parameters together can be transmitted at approximately 10 kilobits/second which is several times less than the bit rate required for log-PCM encoding with comparable speech quality.
Abstract: We describe in this paper a method for efficient encoding of speech signals, based on predictive coding. In this coding method, both the transmitter and the receiver estimate the signal's current value by linear prediction on the previously transmitted signal. The difference between this estimate and the true value of the signal is quantized, coded and transmitted to the receiver. At the receiver, the decoded difference signal is added to the predicted signal to reproduce the input speech signal. Because of the nonstationary nature of the speech signals, an adaptive linear predictor is used, which is readjusted periodically to minimize the mean-square error between the predicted and the true value of the signals. The predictive coding system was simulated on a digital computer. The predictor parameters, comprising one delay and nine other coefficients related to the signal spectrum, were readjusted every 5 milliseconds. The speech signal was sampled at a rate of 6.67 kHz, and the difference signal was quantized by a two-level quantizer with variable step size. Subjective comparisons with speech from a logarithmic PCM encoder (log-PCM) indicate that the quality of the synthesized speech signal from the predictive coding system is approximately equal to that of log-PCM speech encoded at 6 bits/sample. Preliminary studies suggest that the binary difference signal and the predictor parameters together can be transmitted at approximately 10 kilobits/second which is several times less than the bit rate required for log-PCM encoding with comparable speech quality.
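The coding loop described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the simulated system: it uses a fixed first-order predictor (`coeff`) in place of the adaptive predictor with one delay and nine coefficients, and the step-size rule in `step_update` is a hypothetical choice, since the abstract does not specify how the variable step size is controlled. What it does show is the key structural point: the encoder predicts from its own reconstructed output, so the decoder can rebuild exactly the same signal from the bits alone.

```python
import math


def step_update(step, bit, prev_bit, grow=1.5, shrink=0.66, floor=1e-4):
    """Hypothetical rule: grow the step on repeated signs, shrink otherwise."""
    factor = grow if bit == prev_bit else shrink
    return max(step * factor, floor)


def encode(samples, coeff=0.9, step=0.05):
    """Two-level predictive encoder; returns (bits, local reconstruction)."""
    bits, recon = [], []
    prev_recon, prev_bit = 0.0, 1
    for x in samples:
        predicted = coeff * prev_recon            # predict from past reconstruction
        bit = 1 if x - predicted >= 0.0 else 0    # sign of the difference signal
        step = step_update(step, bit, prev_bit)
        prev_recon = predicted + (step if bit else -step)
        prev_bit = bit
        bits.append(bit)
        recon.append(prev_recon)
    return bits, recon


def decode(bits, coeff=0.9, step=0.05):
    """Decoder: repeats the encoder's prediction and step updates from bits."""
    recon = []
    prev_recon, prev_bit = 0.0, 1
    for bit in bits:
        predicted = coeff * prev_recon
        step = step_update(step, bit, prev_bit)
        prev_recon = predicted + (step if bit else -step)
        prev_bit = bit
        recon.append(prev_recon)
    return recon


# Hypothetical test input: a 200 Hz tone sampled at 6.67 kHz.
samples = [math.sin(2.0 * math.pi * 200.0 * n / 6670.0) for n in range(400)]
bits, encoder_recon = encode(samples)
decoder_recon = decode(bits)
```

At one bit per sample, the 6.67 kHz sampling rate puts the difference signal at 6.67 kb/s, against roughly 40 kb/s for 6-bit log-PCM at the same rate. The stated total of about 10 kb/s therefore leaves on the order of 3.3 kb/s for the predictor parameters.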
TL;DR: This paper focuses on the design and analysis of adaptive predictors for differential encoders; differential encoding structures employing adaptive quantization and adaptive prediction are among the most promising approaches to achieving high-quality, highly intelligible speech at 6 to 16 kb/s.
Abstract: The design of speech coders that produce high-quality, highly intelligible speech at 6 to 16 kb/s while retaining robustness to background and transmission impairments is an area of current research interest. Differential encoding structures employing adaptive quantization and adaptive prediction constitute one of the most promising approaches to achieving these design objectives. This paper focuses on the design and analysis of adaptive predictors for differential encoders. Several differential encoding systems, including adaptive predictive coding, differential pulse-code modulation, noise feedback coding, direct feedback coding, and prediction error coding, are described and related. Adaptive quantizers are briefly discussed, and quantitative and qualitative indicators of speech coder performance are defined. The channel model, the speech model, and the research problem statements used in the design of differential encoders and adaptive predictors are presented. The nomenclature and theory of forward and backward adaptive prediction are developed, and several new backward adaptive algorithms based on various assumptions are presented. A detailed survey of theoretical and simulation results on adaptive prediction for speech differential encoders is given, and the effects of background and transmission impairments on these systems are discussed. Finally, the impact of adaptive predictors on rate distortion theory motivated coders is indicated. Numerous areas for future research are highlighted.
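The distinction between forward and backward adaptation can be made concrete with a small sketch. In backward adaptation the predictor is updated only from past samples that both encoder and decoder already hold, so no coefficients need to be transmitted. The example below uses a normalized LMS (stochastic gradient) update as a generic stand-in; it is not claimed to be one of the algorithms presented in the paper. The quantizer is omitted, so the reconstructed signal is assumed equal to the input, and all names and parameter values are illustrative.

```python
import math


def nlms_predict(signal, order=2, mu=0.5, eps=1e-8):
    """Backward-adaptive linear prediction with a normalized LMS update.

    Each sample is predicted from past samples only; the weights are then
    corrected using the prediction error just observed.
    Returns (final weights, list of prediction errors).
    """
    weights = [0.0] * order
    history = [0.0] * order          # most recent past sample first
    errors = []
    for sample in signal:
        predicted = sum(w * h for w, h in zip(weights, history))
        error = sample - predicted
        energy = sum(h * h for h in history) + eps   # normalization term
        weights = [w + mu * error * h / energy
                   for w, h in zip(weights, history)]
        history = [sample] + history[:-1]
        errors.append(error)
    return weights, errors


# Hypothetical input: a sinusoid with a period of 8 samples, which a
# second-order predictor can track exactly.
tone = [math.sin(2.0 * math.pi * n / 8.0) for n in range(500)]
weights, errors = nlms_predict(tone)
early_energy = sum(e * e for e in errors[:50])
late_energy = sum(e * e for e in errors[-50:])
```

On this input the weights approach the exact second-order predictor of a sinusoid, `[2*cos(2*pi/8), -1]`, and the prediction error energy late in the run is far below its early value. A forward-adaptive scheme would instead compute coefficients from a buffered block of input, for example with an autocorrelation analysis, and send them as side information.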