TL;DR: A 4-b/sample transform coder is designed using a psychoacoustically derived noise-making threshold that is based on the short-term spectrum of the signal, and tested in a formal subjective test involving a wide selection of monophonic audio inputs.
Abstract: A 4-b/sample transform coder is designed using a psychoacoustically derived noise-making threshold that is based on the short-term spectrum of the signal. The coder has been tested in a formal subjective test involving a wide selection of monophonic audio inputs. The signals used in the test were of 15-kHz bandwidth, sampled at 32 kHz. The bit rate of the resulting coder was 128 kb/s. The subjective test shows that the coded signal could not be distinguished from the original at that bit rate. Subsequent informal work suggests that a bit rate of 96 kb/s may maintain transparency for the set of inputs used in the test. >
TL;DR: A novel technique for embedding digital "watermarks" into digital audio signals by filtering a PN-sequence with a filter that approximates the frequency masking characteristics of the human auditory system (HAS).
Abstract: In this paper, we present a novel technique for embedding digital "watermarks" into digital audio signals. Watermarking is a technique used to label digital media by hiding copyright or other information into the underlying data. The watermark must be imperceptible and should be robust to attacks and other types of distortion. In addition, the watermark also should be undetectable by all users except the author of the piece. In our method, the watermark is generated by filtering a PN-sequence with a filter that approximates the frequency masking characteristics of the human auditory system (HAS). It is then weighted in the time domain to account for temporal masking. We discuss the detection of the watermark and assess the robustness of our watermarking approach to attacks and various signal manipulations.
TL;DR: New results of masking and loudness reduction of noise are reported and the design principles of speech coding systems exploiting auditory masking are described.
Abstract: In any speech coding system that adds noise to the speech signal, the primary goal should not be to reduce the noise power as much as possible, but to make the noise inaudible or to minimize its subjective loudness. ’’Hiding’’ the noise under the signal spectrum is feasible because of human auditory masking: sounds whose spectrum falls near the masking threshold of another sound are either completely masked by the other sound or reduced in loudness. In speech coding applications, the ’’other sound’’ is, of course, the speech signal itself. In this paper we report new results of masking and loudness reduction of noise and describe the design principles of speech coding systems exploiting auditory masking.
TL;DR: In this article, the quantizing of the sample values in the sub-bands, e.g. 24 subbands, is controlled to the extent that the quantising noise levels of the individual sub-band signals are at approximately the same level difference from the masking threshold of the human auditory system resulting from the individual subsets.
Abstract: In the transmission of audio signals, the audio signal is digitally represented by use of quadrature mirror filtering in the form a plurality of spectral sub-band signals. The quantizing of the sample values in the sub-bands, e.g. 24 sub-bands, is controlled to the extent that the quantizing noise levels of the individual sub-band signals are at approximately the same level difference from the masking threshold of the human auditory system resulting from the individual sub-band signals. The differences of the quantizing noise levels of the sub-band signals with respect to the resulting masking threshold are set by the difference between the total information flow required for coding and the total information flow available for coding. The available total information flow is set and may then fluctuate as a function of the signal.
TL;DR: In this paper, the coder/decoder (codec) system of the present invention includes a coder and a decoder, which is capable of performing resolution switching, spectral weighting, digital encoding, and parametric modeling.
Abstract: The coder/decoder (codec) system of the present invention includes a coder and a decoder. The coder includes a multi-resolution transform processor, such as a modulated lapped transform (MLT) transform processor, a weighting processor, a uniform quantizer, a masking threshold spectrum processor, an entropy encoder, and a communication device, such as a multiplexor (MUX) for multiplexing (combining) signals received from the above components for transmission over a single medium. The decoder comprises inverse components of the encoder, such as an inverse multi-resolution transform processor, an inverse weighting processor, an inverse uniform quantizer, an inverse masking threshold spectrum processor, an inverse entropy encoder, and an inverse MUX. With these components, the present invention is capable of performing resolution switching, spectral weighting, digital encoding, and parametric modeling.