TL;DR: This paper reviews methodologies that achieve perceptually transparent coding of FM- and CD-quality audio signals, including algorithms that manipulate transform components, subband signal decompositions, sinusoidal signal components, and linear prediction parameters, as well as hybrid algorithms that make use of more than one signal model.
Abstract: During the last decade, CD-quality digital audio has essentially replaced analog audio. Emerging digital audio applications for network, wireless, and multimedia computing systems face a series of constraints such as reduced channel bandwidth, limited storage capacity, and low cost. These new applications have created a demand for high-quality digital audio delivery at low bit rates. In response to this need, considerable research has been devoted to the development of algorithms for perceptually transparent coding of high-fidelity (CD-quality) digital audio. As a result, many algorithms have been proposed, and several have now become international and/or commercial product standards. This paper reviews algorithms for perceptually transparent coding of CD-quality digital audio, including both research and standardization activities. This paper is organized as follows. First, psychoacoustic principles are described, with the MPEG psychoacoustic signal analysis model 1 discussed in some detail. Next, filter bank design issues and algorithms are addressed, with a particular emphasis placed on the modified discrete cosine transform, a perfect reconstruction cosine-modulated filter bank that has become of central importance in perceptual audio coding. Then, we review methodologies that achieve perceptually transparent coding of FM- and CD-quality audio signals, including algorithms that manipulate transform components, subband signal decompositions, sinusoidal signal components, and linear prediction parameters, as well as hybrid algorithms that make use of more than one signal model. These discussions concentrate on architectures and applications of those techniques that utilize psychoacoustic models to exploit efficiently masking characteristics of the human receiver. 
Several algorithms that have become international and/or commercial standards receive in-depth treatment, including the ISO/IEC MPEG family (-1, -2, -4), the Lucent Technologies PAC/EPAC/MPAC, the Dolby AC-2/AC-3, and the Sony ATRAC/SDDS algorithms. Then, we describe subjective evaluation methodologies in some detail, including the ITU-R BS.1116 recommendation on subjective measurements of small impairments. This paper concludes with a discussion of future research directions.
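The abstract above describes the modified discrete cosine transform as a perfect reconstruction cosine-modulated filter bank. The following is a minimal sketch of that property, not the paper's own implementation: it assumes a sine window (one common choice that satisfies the Princen-Bradley condition), evaluates the transform directly in O(N^2) rather than with a fast algorithm, and uses hypothetical function names (`sine_window`, `mdct`, `imdct`, `analysis_synthesis`) chosen only for illustration.

```python
import math


def sine_window(N):
    # Assumed window: w[n]^2 + w[n + N]^2 == 1 (Princen-Bradley condition).
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]


def mdct(frame, window):
    # Maps a windowed frame of 2N samples to N coefficients.
    N = len(frame) // 2
    return [
        sum(
            window[n] * frame[n]
            * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
            for n in range(2 * N)
        )
        for k in range(N)
    ]


def imdct(coeffs, window):
    # Maps N coefficients back to 2N windowed, time-aliased samples.
    N = len(coeffs)
    return [
        (2.0 / N) * window[n]
        * sum(
            coeffs[k] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
            for k in range(N)
        )
        for n in range(2 * N)
    ]


def analysis_synthesis(signal, N):
    # Frames overlap by 50%; len(signal) is assumed to be a multiple of N.
    window = sine_window(N)
    padded = [0.0] * N + list(signal) + [0.0] * N
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * N + 1, N):
        block = imdct(mdct(padded[start:start + 2 * N], window), window)
        for n, value in enumerate(block):
            out[start + n] += value  # overlap-add cancels the time aliasing
    return out[N:N + len(signal)]


signal = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(32)]
rebuilt = analysis_synthesis(signal, N=8)
max_error = max(abs(a - b) for a, b in zip(signal, rebuilt))
print(max_error)
```

Each frame on its own is not invertible, since 2N samples map to only N coefficients; the reconstruction error falls to floating-point rounding level only after adjacent frames are overlap-added. In a codec the coefficients would be quantized between `mdct` and `imdct`, which this sketch omits.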
TL;DR: A remotely controllable computing and multimedia entertainment system includes a personal computer (24) with an entertainment circuit (12) made up of a radio frequency circuit (48), a television circuit (46), and an audio multimedia circuit (18).
Abstract: A remotely controllable computing and multimedia entertainment system includes a personal computer (24) having an entertainment circuit (12) made up of a radio frequency circuit (48), a television circuit (46), and an audio multimedia circuit (18). A remote control circuit (50) provides programmable control of the entertainment circuit (12) to select among computer function operation, television and radio operation, and audio operation. An analog mixing circuit (70) within the audio multimedia circuit (18) provides mixing for a plurality of analog audio signals. A telephone circuit (44) integrates data, fax, and voice telephone signals in the entertainment circuit (12). A volume control circuit (318) within the audio multimedia circuit (18) provides varying volume, bass, and tone levels for each audio signal received by the analog mixing circuit (70). The analog audio signals received by analog mixing circuit (70) may include monaural and stereo audio signals.
TL;DR: This book presents whole classes of algorithms and techniques for digital signal processing of sound that the speech community has not been interested in pursuing or using.
Abstract: With the advent of "multimedia", digital signal processing (DSP) of sound has emerged from the shadow of bandwidth-limited speech processing to become a research field of its own. To date, most research in DSP applied to sound has been concentrated on speech, which is bandwidth-limited to about 4 kilohertz. Speech processing is also limited by the low fidelity typically expected in the telephone network. Today, the main applications of audio DSP are high-quality audio coding and the digital generation and manipulation of music signals. They share common research topics including perceptual measurement techniques and analysis/synthesis methods. Additional important topics are hearing aids using signal processing technology and hardware architectures for digital signal processing of audio. In all these areas the last decade has seen a significant amount of application-oriented research. The frequency range of wideband audio has an upper limit of 20 kilohertz, and the resulting difference in frequency range and signal-to-noise ratio (SNR) due to sample size must be taken into account when designing DSP algorithms. There are whole classes of algorithms that the speech community is not interested in pursuing or using. These algorithms and techniques are revealed in this book. This book is suitable for advanced-level courses and serves as a valuable reference for researchers in the field. Interested and informed engineers will also find the book useful in their work.
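The difference in frequency range and sample-size SNR noted in the abstract above can be put into rough numbers. This is a sketch under stated assumptions, not material from the book: it uses the Nyquist criterion for the minimum sampling rate and the textbook SNR of an ideal uniform quantizer driven by a full-scale sinusoid (about 6.02 dB per bit plus 1.76 dB). The 8-bit and 16-bit sample sizes and the function names are illustrative choices; only the 4 kHz and 20 kHz bandwidths come from the abstract.

```python
import math


def min_sampling_rate_hz(bandwidth_hz):
    # Nyquist criterion: sample at no less than twice the highest frequency.
    return 2.0 * bandwidth_hz


def ideal_quantization_snr_db(bits):
    # Ideal uniform quantizer, full-scale sinusoidal input (assumed model):
    # SNR = 20*log10(2**bits) + 10*log10(1.5), roughly 6.02*bits + 1.76 dB.
    return 20.0 * math.log10(2 ** bits) + 10.0 * math.log10(1.5)


# Bandwidths taken from the abstract: about 4 kHz speech, 20 kHz wideband audio.
for label, bandwidth_hz in [("speech", 4000.0), ("wideband audio", 20000.0)]:
    print(label, min_sampling_rate_hz(bandwidth_hz), "Hz")

# Sample sizes are illustrative assumptions, not values stated in the source.
for bits in (8, 16):
    print(bits, "bits:", round(ideal_quantization_snr_db(bits), 2), "dB")
```

Under this model, each additional bit of sample size adds about 6 dB of SNR, and the wider band requires a sampling rate five times higher, so an algorithm tuned for narrowband speech sees both more samples per second and a much lower noise floor when applied to wideband audio.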