Top 311 papers published in the topic of Speech processing in 1993

Showing papers on "Speech processing" published in 1993

Book•

Fundamentals of speech recognition

[...]

1 Jan 1993

TL;DR: This book covers the fundamentals of speech recognition, from the speech signal and signal processing methods through pattern comparison techniques, hidden Markov models, and connected word models to large vocabulary continuous speech recognition and task-oriented applications.

Abstract: 1. Fundamentals of Speech Recognition. 2. The Speech Signal: Production, Perception, and Acoustic-Phonetic Characterization. 3. Signal Processing and Analysis Methods for Speech Recognition. 4. Pattern Comparison Techniques. 5. Speech Recognition System Design and Implementation Issues. 6. Theory and Implementation of Hidden Markov Models. 7. Speech Recognition Based on Connected Word Models. 8. Large Vocabulary Continuous Speech Recognition. 9. Task-Oriented Applications of Automatic Speech Recognition.

9,412 citations

Book•

Discrete-Time Processing of Speech Signals

[...]

J. R. Deller, John G. Proakis, John H. L. Hansen

1 Mar 1993

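TL;DR: This book covers signal processing background, speech production and modeling, analysis techniques (short-term processing, linear prediction, cepstral analysis), coding, enhancement and quality assessment, and recognition methods including dynamic time warping, the hidden Markov model, language modeling, and the artificial neural network.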

Abstract: Preface to the IEEE Edition. Preface. Acronyms and Abbreviations. SIGNAL PROCESSING BACKGROUND. Propaedeutic. SPEECH PRODUCTION AND MODELLING. Fundamentals of Speech Science. Modeling Speech Production. ANALYSIS TECHNIQUES. Short-Term Processing of Speech. Linear Prediction Analysis. Cepstral Analysis. CODING, ENHANCEMENT AND QUALITY ASSESSMENT. Speech Coding and Synthesis. Speech Enhancement. Speech Quality Assessment. RECOGNITION. The Speech Recognition Problem. Dynamic Time Warping. The Hidden Markov Model. Language Modeling. The Artificial Neural Network. Index.

3,150 citations

Journal Article•10.1016/0167-6393(93)90095-3•

Assessment for automatic speech recognition II: NOISEX-92: a database and an experiment to study the effect of additive noise on speech recognition systems

[...]

Andrew Varga, Herman J. M. Steeneken

01 Jul 1993-Speech Communication

TL;DR: NOISEX-92 specifies a carefully controlled experiment on artificially noisy speech data, examining performance for a limited digit recognition task but with a relatively wide range of noises and signal-to-noise ratios.

2,248 citations

Journal Article•10.1109/5.237532•

Signal modeling techniques in speech recognition

[...]

Joseph Picone¹•Institutions (1)

Texas Instruments¹

1 Sep 1993

TL;DR: A tutorial on signal processing in state-of-the-art speech recognition systems is presented, reviewing those techniques most commonly used, and three important trends that have developed in the last five years in speech recognition are examined.

Abstract: A tutorial on signal processing in state-of-the-art speech recognition systems is presented, reviewing those techniques most commonly used. The four basic operations of signal modeling, i.e. spectral shaping, spectral analysis, parametric transformation, and statistical modeling, are discussed. Three important trends that have developed in the last five years in speech recognition are examined. First, heterogeneous parameter sets that mix absolute spectral information with dynamic, or time-derivative, spectral information, have become common. Second, similarity transform techniques, often used to normalize and decorrelate parameters in some computationally inexpensive way, have become popular. Third, the signal parameter estimation problem has merged with the speech recognition process so that more sophisticated statistical models of the signal's spectrum can be estimated in a closed-loop manner. The signal processing components of these algorithms are reviewed.

864 citations

Journal Article•10.1016/0926-6410(93)90026-2•

Event-related brain potentials during natural speech processing: Effects of semantic, morphological and syntactic violations.

[...]

Angela D. Friederici¹, Erdmut Pfeifer¹, Anja Hahne¹•Institutions (1)

Max Planck Society¹

01 Oct 1993-Cognitive Brain Research

TL;DR: The present data demonstrate that linguistic errors of different categories evoke different ERP patterns, and indicate that, with connected speech as input, different aspects of language comprehension processes can be described not only with respect to their temporal structure but eventually also with respect to the possible brain systems subserving these processes.

828 citations

Book•

Connectionist speech recognition

[...]

Hervé Bourlard

31 Oct 1993

TL;DR: This book covers connectionist (artificial neural network) approaches to speech recognition.

Abstract: Keywords: speech. Reference: EPFL-CONF-82487. Record created on 2006-03-10, modified on 2017-05-10.

591 citations

Proceedings Article•10.1109/ICASSP.1993.319366•

An overlap-add technique based on waveform similarity (WSOLA) for high quality time-scale modification of speech

[...]

Werner Verhelst, M. Roelands

27 Apr 1993

TL;DR: The resulting WSOLA (waveform-similarity-based synchronized overlap-add) algorithm produces high-quality speech output, is algorithmically and computationally efficient and robust, and allows for online processing with arbitrary time-scaling factors.

Abstract: A concept of waveform similarity for tackling the problem of time-scale modification of speech is proposed. It is worked out in the context of short-time Fourier transform representations. The resulting WSOLA (waveform-similarity-based synchronized overlap-add) algorithm produces high-quality speech output, is algorithmically and computationally efficient and robust, and allows for online processing with arbitrary time-scaling factors that may be specified in a time-varying fashion and can be chosen over a wide continuous range of values.

501 citations

Journal Article•10.1044/JSHR.3606.1276•

Temporal Factors and Speech Recognition Performance in Young and Elderly Listeners

[...]

Sandra Gordon-Salant¹, Peter J. Fitzgibbons²•Institutions (2)

University of Maryland, College Park¹, University of Washington²

01 Dec 1993-Journal of Speech Language and Hearing Research

TL;DR: The overall conclusion is that age-related factors other than peripheral hearing loss contribute to diminished speech recognition performance of elderly listeners.

Abstract: This study investigated factors that contribute to deficits of elderly listeners in recognizing speech that is degraded by temporal waveform distortion. Young and elderly listeners with normal hear...

498 citations

Book•

Biomedical Digital Signal Processing

[...]

Willis J. Tompkins

1 Jan 1993

457 citations

Journal Article•10.1044/JSHR.3602.254•

A cepstrum-based technique for determining a harmonics-to-noise ratio in speech signals.

[...]

Guus de Krom¹•Institutions (1)

Utrecht University¹

01 Apr 1993-Journal of Speech Language and Hearing Research

TL;DR: A new method to calculate a spectral harmonics-to-noise ratio (HNR) in speech signals is presented; it involves discriminating between harmonic and noise energy in the magnitude spectrum.

Abstract: A new method to calculate a spectral harmonics-to-noise ratio (HNR) in speech signals is presented. The method involves discrimination between harmonic and noise energy in the magnitude spectrum by...

331 citations

Proceedings Article•10.1109/ICASSP.1993.319236•

Recognition of speech in additive and convolutional noise based on RASTA spectral processing

[...]

Hynek Hermansky, Nelson Morgan¹, Hans-Günter Hirsch¹•Institutions (1)

International Computer Science Institute¹

27 Apr 1993

TL;DR: Experiments with a recognizer trained on clean speech and test data degraded by both convolutional and additive noise show that doing RASTA processing in the new domain yields results comparable with those obtained by training the recognizer on known noise.

Abstract: RASTA (relative spectral) processing is studied in a spectral domain which is linear-like for small spectral values and logarithmic-like for large spectral values. Experiments with a recognizer trained on clean speech and test data degraded by both convolutional and additive noise show that doing RASTA processing in the new domain yields results comparable with those obtained by training the recognizer on known noise.
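
One function with the behavior described in this abstract, linear-like for small values and logarithmic-like for large ones, can be written as follows. The symbols $x$ (a non-negative spectral value) and $J$ (a positive scaling constant) are introduced here for illustration only; the abstract does not state the exact mapping, so this is a sketch of the idea rather than the authors' definition.

```latex
y = \ln(1 + Jx), \qquad
y \approx Jx \quad (Jx \ll 1), \qquad
y \approx \ln J + \ln x \quad (Jx \gg 1)
```

Under this assumption, the size of $J$ relative to the spectral values determines where the mapping changes from its linear-like to its logarithmic-like regime.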

Journal Article•10.1016/0004-3702(93)90020-C•

Pitch accent in context: predicting intonational prominence from text

[...]

Julia Hirschberg¹•Institutions (1)

Bell Labs¹

01 Oct 1993-Artificial Intelligence

TL;DR: A series of recent experiments on corpora of recorded (read) speech and spontaneous (elicited) speech suggest that it is indeed possible to model human accent strategies with fair success for unrestricted text—with only the tools for automatic text analysis currently available.

Patent•10.1121/1.423162•

Speech recognition interface system suitable for window systems and speech mail systems

[...]

Hideki Hashimoto¹, Yoshifumi Nagata¹, Shigenobu Seto¹, Yoichi Takebayashi¹, Hideaki Shinchi¹, Koji Yamaguchi¹•Institutions (1)

Toshiba¹

28 Dec 1993-Journal of the Acoustical Society of America

TL;DR: A speech recognition interface system capable of handling a plurality of application programs simultaneously, and realizing convenient speech input and output modes which are suitable for the applications in the window systems and the speech mail systems, is presented in this article.

Abstract: A speech recognition interface system capable of handling a plurality of application programs simultaneously, and realizing convenient speech input and output modes which are suitable for the applications in the window systems and the speech mail systems. The system includes a speech recognition unit for carrying out a speech recognition processing for a speech input made by a user to obtain a recognition result; a program management table for managing program management data indicating a speech recognition interface function required by each application program; and a message processing unit for exchanging messages with the plurality of application programs in order to specify an appropriate recognition vocabulary to be used in the speech recognition processing of the speech input to the speech recognition unit, and to transmit the recognition result for the speech input obtained by the speech recognition unit by using the appropriate recognition vocabulary to appropriate ones of the plurality of application programs, according to the program management data managed by the program management table.

Patent•10.1121/1.419616•

Reconstruction of wideband speech from narrowband speech using codebooks

[...]

Masanobu Abe¹, Yuki Yoshida¹•Institutions (1)

Nippon Telegraph and Telephone¹

29 Sep 1993-Journal of the Acoustical Society of America

TL;DR: A wideband speech signal (8 kHz, for example) of high quality is reconstructed from a narrowband speech signal (300 Hz to 3.4 kHz), which is LPC-analyzed to obtain spectrum information parameters.

Abstract: A wideband speech signal (8 kHz, for example) of high quality is reconstructed from a narrowband speech signal (300 Hz to 3.4 kHz). The input narrowband speech signal is LPC-analyzed to obtain spectrum information parameters, and the parameters are vector-quantized using a narrowband speech signal codebook. For each code number of the narrowband speech signal codebook, the wideband speech waveform corresponding to the codevector concerned is extracted by one pitch for voiced speech and by one frame for unvoiced speech and prestored in a representative waveform codebook. Representative waveform segments corresponding to the respective output codevector numbers of the quantizer are extracted from the representative waveform codebook. Voiced speech is synthesized by pitch-synchronous overlapping of the extracted representative waveform segments and unvoiced speech is synthesized by randomly using waveforms of one frame length. By this, a wideband speech signal is produced. Then, frequency components below 300 Hz and above 3.4 kHz are extracted from the wideband speech signal and are added to an up-sampled version of the input narrowband speech signal to thereby reconstruct the wideband speech signal.

Journal Article•10.1016/0167-6393(93)90093-Z•

Cepstral parameter compensation for HMM recognition in noise

[...]

Mark J. F. Gales¹, Steve Young¹•Institutions (1)

University of Cambridge¹

01 Jul 1993-Speech Communication

TL;DR: The PMC technique is based on parallel model combination in which the parameters of corresponding pairs of speech and noise states are combined to yield a set of compensated parameters, which improves on earlier cepstral mean compensation methods in that it also adapts the variances and as a result can deal with much lower SNRs.

Patent•

Message recognition employing integrated speech and handwriting information

[...]

Jerome R. Bellegarda¹, Dimitri Kanevsky¹•Institutions (1)

IBM¹

7 Jun 1993

TL;DR: The authors present a method of, and apparatus for, operating an automatic message recognition system, in which the following steps are executed: a user's speech is converted to a first signal; a user's handwriting is converted to a second signal; and the first signal and the second signal are processed to decode a consistent message, conveyed separately by the first signal and by the second signal.

Abstract: A method of, and apparatus for, operating an automatic message recognition system. In accordance with the method the following steps are executed: a user's speech is converted to a first signal; a user's handwriting is converted to a second signal; and the first signal and the second signal are processed to decode a consistent message, conveyed separately by the first signal and by the second signal, or conveyed jointly by the first signal and the second signal. The step of processing includes the steps of converting the first signal into a plurality of first multi-dimensional vectors and converting the second signal into a plurality of second multi-dimensional vectors. For a system employing a combined use of speech and handwriting the step of processing includes a further step of combining individual ones of the plurality of first multi-dimensional vectors and individual ones of the plurality of second multi-dimensional vectors to form a plurality of third multi-dimensional vectors. The multi-dimensional vectors are employed to train a single set of word models, for joint use of speech and handwriting, or two sets of word models, for sequentially employed or merged speech and handwriting.

Proceedings Article•10.1145/169059.169150•

VoiceNotes: a speech interface for a hand-held voice notetaker

[...]

Lisa J. Stifelman¹, Barry Arons², Chris Schmandt², Eric A. Hulteen¹•Institutions (2)

Apple Inc.¹, Massachusetts Institute of Technology²

1 May 1993

TL;DR: VoiceNotes explores the problem of capturing and retrieving spontaneous ideas, the use of speech as data, and the use of speech input and output in the user interface for a hand-held computer without a visual display.

Abstract: VoiceNotes is an application for a voice-controlled hand-held computer that allows the creation, management, and retrieval of user-authored voice notes—small segments of digitized speech containing thoughts, ideas, reminders, or things to do. Iterative design and user testing helped to refine the initial user interface design. VoiceNotes explores the problem of capturing and retrieving spontaneous ideas, the use of speech as data, and the use of speech input and output in the user interface for a hand-held computer without a visual display. In addition, VoiceNotes serves as a step toward new uses of voice technology and interfaces for future portable devices.

Journal Article•10.1109/35.256873•

Speech synthesis in telecommunications

[...]

Stephen E. Levinson¹, J.P. Olive¹, J.S. Tschirgi²•Institutions (2)

Bell Labs¹, Alcatel-Lucent²

01 Nov 1993-IEEE Communications Magazine

TL;DR: A text-to-speech synthesis system that synthesizes speech from unrestricted text is discussed, and the text analysis system, which includes text preprocessing, phrasing and intonation, and letter-to-phoneme conversion, is described.

Abstract: A text-to-speech synthesis system that synthesizes speech from unrestricted text is discussed. The text analysis system, which includes text preprocessing, phrasing and intonation, and letter-to-phoneme conversion, is described. The analyzed text is represented by phonetic characters, stress values, minor- and major-phrase markers, and intonational descriptors. The synthesizer uses this information to compute a speech signal in several stages. The duration of the different speech events is computed, and the intonational descriptors are converted to a fundamental frequency contour. Loudness control is also generated. After these prosodic parameters have been computed, the synthesis parameters that describe the different sounds or phonemes are generated. These parameters are converted to speech by a waveform synthesizer.

Journal Article•10.1109/89.242484•

Encoding speech using prototype waveforms

[...]

Willem Bastiaan Kleijn¹•Institutions (1)

Bell Labs¹

01 Oct 1993-IEEE Transactions on Speech and Audio Processing

TL;DR: The coding method is easily combined with existing LP-based speech coders, such as CELP, for unvoiced signals and excellent voiced speech quality is obtained at rates between 3.0 and 4.0 kb/s.

Abstract: Voiced speech is interpreted as a concatenation of slowly evolving pitch-cycle waveforms. This signal can be reconstructed by interpolation from a downsampled sequence of pitch-cycle waveforms with a rate of one prototype waveform per 20-30 ms interval. The prototype waveform is described by a set of linear-prediction (LP) filter coefficients describing the formant structure and a prototype excitation waveform, quantized with analysis-by-synthesis procedures. The speech signal is reconstructed by filtering an excitation signal consisting of the concatenation of (infinitesimal) sections of the instantaneous excitation waveforms. To obtain the correct level of periodicity, the short-term and the long-term correlations between the instantaneous excitation waveforms can be controlled explicitly. Thus, distortions such as noise, reverberation, and buzziness can be prevented. The coding method is easily combined with existing LP-based speech coders, such as CELP, for unvoiced signals. Excellent voiced speech quality is obtained at rates between 3.0 and 4.0 kb/s.

Book Chapter•10.1016/S0065-2407(08)60298-0•

Music and Speech Processing in the First Year of Life

[...]

Sandra E. Trehub¹, Laurel J. Trainor², Anna M. Unyk¹•Institutions (2)

University of Toronto¹, McMaster University²

01 Jan 1993-Advances in Child Development and Behavior

TL;DR: In drawing parallels between speech and music, the chapter focuses on two principal issues: the input provided by caregivers for their infants and the processing of such input by infant listeners.

Abstract: Publisher Summary This chapter focuses on potential similarities between speech and music from the perspective of infant listeners. The stimuli of concern are sound sequences rather than single sounds, despite the predominant research focusing on the latter class of stimuli. The exclusion of single sounds can be justified on a number of grounds. First, several comprehensive reviews of infants' ability to perceive single speech and non-speech sounds are available. Second, evidence indicates that global patterns of speech are more salient in the pre-linguistic period than are individual speech segments. In the non-speech domain, evidence also indicates that infants proceed from global processing of auditory patterns to local processing of pattern details. In drawing parallels between speech and music, the chapter focuses on two principal issues: the input provided by caregivers for their infants and the processing of such input by infant listeners. Much of the work to be reported, particularly in the musical domain, is relatively recent. As a result, the exposition is tentative rather than definitive, its purpose being to suggest new avenues for future research and thinking.

Book•

Signal processing of speech

[...]

Frank J. Owens

1 Jan 1993

TL;DR: This is an introduction and brief overview of the techniques and algorithms used in the design of speech systems and the roles of other disciplines such as electronics, computing science, linguistics and physiology are described.

Abstract: This is an introduction and brief overview of the techniques and algorithms used in the design of speech systems. The author focuses on signal processing, and briefly describes the roles of other disciplines such as electronics, computing science, linguistics and physiology.

Patent•10.1121/1.419826•

Three dimensional speech synthesis

[...]

Daniel Joseph Moore¹, Peter William Farrett¹•Institutions (1)

IBM¹

04 Jun 1993-Journal of the Acoustical Society of America

TL;DR: The analog signals to the right and left channels are altered according to position data stored with the text string so that the synthesized voice appears to originate at the apparent spatial position when the analog signals are sent to a speaker system.

Abstract: Method, product and system alters audio data for a synthesized voice so that when it is produced on a speaker system, it appears to emanate from a spatial position. First, the voice is synthesized into a speech waveform from a set of stored data representative of a text string using standard techniques. The speech waveform is converted into analog signals for a right and left channel. According to the invention, the analog signals to the right and left channels are altered according to position data stored with the text string so that the synthesized voice appears to originate at the apparent spatial position when the analog signals are sent to a speaker system.

Patent•10.1121/1.423018•

Enhancement of speech coding in background noise for low-rate speech coder

[...]

Yu-Jih Liu¹•Institutions (1)

Wilmington University¹

12 May 1993-Journal of the Acoustical Society of America

TL;DR: In this paper, a speech coding system employs measurements of robust features of speech frames whose distribution is not strongly affected by noise/levels to make voicing decisions for input speech occurring in a noisy environment.

Abstract: A speech coding system employs measurements of robust features of speech frames whose distribution is not strongly affected by noise/levels to make voicing decisions for input speech occurring in a noisy environment. Linear programming analysis of the robust features and respective weights is used to determine an optimum linear combination of these features. The input speech vectors are matched to a vocabulary of codewords in order to select the corresponding, optimally matching codeword. Adaptive vector quantization is used in which a vocabulary of words obtained in a quiet environment is updated based upon a noise estimate of a noisy environment in which the input speech occurs, and the "noisy" vocabulary is then searched for the best match with an input speech vector. The corresponding clean codeword index is then selected for transmission and for synthesis at the receiver end. The results are better spectral reproduction and significant intelligibility enhancement over prior coding approaches. Robust features found to allow robust voicing decisions include: low-band energy; zero-crossing counts adapted for noise level; AMDF ratio (speech periodicity) measure; low-pass filtered backward correlation; low-pass filtered forward correlation; inverse-filtered backward correlation; and inverse-filtered pitch prediction gain measure.

Journal Article•10.1016/0167-6393(93)90019-H•

Robust signal selection for linear prediction analysis of voiced speech

[...]

Changxue Ma, Y. Kamp, Leonardus Franciscus Willems

01 Mar 1993-Speech Communication

TL;DR: In this method, speech samples are selectively weighted based on how well they match the speech production model, and the estimates of the LPC coefficients obtained are more accurate and less sensitive to the values of the fundamental frequency than conventional LPC.

Patent•10.1121/1.419396•

Method and system for location-specific speech recognition

[...]

Paul S. Cohen¹, John M. Lucassen¹, Roger M. Miller¹, Elton B. Sherwin¹•Institutions (1)

IBM¹

30 Dec 1993-Journal of the Acoustical Society of America

TL;DR: A method and system for reducing perplexity in a speech recognition system based upon determined geographic location.

Abstract: A method and system for reducing perplexity in a speech recognition system based upon determined geographic location. In a mobile speech recognition system which processes input frames of speech against stored templates representing speech, a core library of speech templates is created and stored representing a basic vocabulary of speech. Multiple location-specific libraries of speech templates are also created and stored, each library containing speech templates representing a specialized vocabulary for a specific geographic location. The geographic location of the mobile speech recognition system is then periodically determined utilizing a cellular telephone system, a geopositioning satellite system or other similar systems and a particular one of the location-specific libraries of speech templates is identified for the current location of the system. Input frames of speech are then processed against the combination of the core library and the particular location-specific library to greatly enhance the accuracy and efficiency of speech recognition by the system. Each location-specific library preferably includes speech templates representative of location place names, proper names, and business establishments within a specific geographic location.

Journal Article•10.1109/78.258104•

Adapted local trigonometric transforms and speech processing

[...]

E. Wesfreid¹, Mladen Victor Wickerhauser²•Institutions (2)

CEREMADE¹, Washington University in St. Louis²

01 Dec 1993-IEEE Transactions on Signal Processing

TL;DR: This decomposition provides a method of parameter simplification which appears to be useful for detecting fundamental frequencies, and characterizing formants.

Abstract: Uses an algorithm based on the adapted-window Malvar transform to decompose digitized speech signals into a local time-frequency representation. The authors present some applications and experimental results for signal compression and automatic voiced-unvoiced segmentation. This decomposition provides a method of parameter simplification which appears to be useful for detecting fundamental frequencies and characterizing formants.

Proceedings Article•10.1109/SCFT.1993.762326•

Performance of noise excitation for unvoiced speech

[...]

Gernot Kubin¹, B.S. Atal, W.B. Kleijn•Institutions (1)

Bell Labs¹

13 Oct 1993

TL;DR: This paper addresses the question of what perceptual quality can be achieved for unvoiced speech by a linear model with white noise excitation, and demonstrates that this linear model results in unvoiced speech of high perceptual quality.

Abstract: Recent interest in nonlinear modeling of speech has brought up the need to re-assess the performance limitations of linear speech models. While nonlinearity is essential in the production mechanism of speech, it need not be reflected in a speech-signal model. This paper addresses the question what perceptual quality can be achieved for unvoiced speech by a linear model with white noise excitation. Formal MOS test results demonstrate that this linear model results in unvoiced speech of high perceptual quality.

Journal Article•10.1121/1.407223•

Psychophysical and speech perception studies: A case report on a binaural cochlear implant subject

[...]

R. J. M. van Hoesel¹, Yit C. Tong, R. D. Hollow, Graeme M. Clark•Institutions (1)

University of Melbourne¹

01 Dec 1993-Journal of the Acoustical Society of America

TL;DR: Further improvements in speech perception for cochlear implant patients in quiet and in noise should be possible with speech processing strategies using binaural implants; for this reason, a series of initial psychophysical and speech perception studies on the authors' first binaural cochlear implant patient is presented.

Abstract: Further improvements in speech perception for cochlear implant patients in quiet and in noise should be possible with speech processing strategies using binaural implants. For this reason, presented here is a series of initial psychophysical and speech perception studies on the authors' first binaural cochlear implant patient. For an approximate matching of the places of stimulation on the two sides, the patient usually reported a single percept when the two sides were simultaneously stimulated. Lateralization was strongly influenced by amplitude differences between the electrical stimuli on the two sides, but only weakly by interaural time delays. Speech testing, comparing monaural with binaural electrical stimulation, showed a binaural advantage particularly in noise.

Book•

Visual representations of speech signals

[...]

Martin Cooke¹, S.W. Beet¹, Malcolm Crawford¹•Institutions (1)

University of Sheffield¹

4 Jun 1993

TL;DR: Chapters cover advanced time-frequency representations for speech processing, auditory-based wavelet representation, distortion maps for speech analysis, phase representations of acoustic speech waveforms, speech analysis using higher order statistics, and group delay processing of speech signals.

Abstract: Advanced Time-Frequency Representations for Speech Processing. Auditory-Based Wavelet Representation. Distortion Maps for Speech Analysis. Phase Representations of Acoustic Speech Waveforms. Speech Analysis Using Higher Order Statistics. Group Delay Processing of Speech Signals. Contributors. The Sheffield Signals. Index.

Patent•10.1121/1.419665•

Speech processing system and method for enhancing a speech signal in a noisy environment

[...]

Sangil Park¹, Ed F. Martinez¹, Dae-Hee Youn¹•Institutions (1)

Motorola¹

30 Apr 1993-Journal of the Acoustical Society of America

TL;DR: In this paper, an adaptive filter such as a finite impulse response (FIR) filter receives a digital accelerometer input signal, adjusts filter coefficients according to an estimation error signal, and provides an enhanced speech signal as an output.

Abstract: A speech processing system (30) operates in a noisy environment (20) by performing adaptive prediction between inputs from two sensors positioned to transduce speech from a speaker, such as an accelerometer and a microphone. An adaptive filter (37) such as a finite impulse response (FIR) filter receives a digital accelerometer input signal, adjusts filter coefficients according to an estimation error signal, and provides an enhanced speech signal as an output. The estimation error signal is a difference between a digital microphone input signal and the enhanced speech signal. In one embodiment, the adaptive filter (37) selects a maximum one of a first predicted speech signal based on a relatively-large smoothing parameter and a second predicted speech signal based on a relatively-small smoothing parameter, with which to normalize a predicted signal power. The predicted signal power is then used to adapt the filter coefficients.
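
The adaptive prediction loop in this abstract can be sketched roughly as below. This is a normalized-LMS-style illustration and not the patented method: the function name `enhance`, every parameter name and default, and the reading of the "maximum of two smoothed estimates" as two running power estimates of the accelerometer input are all assumptions made here. The synthetic signals in the demo are made up as well.

```python
import random


def enhance(accel, mic, n_taps=4, mu=0.2, a_large=0.99, a_small=0.5, eps=1e-8):
    """Predict the microphone signal from the accelerometer signal.

    Hypothetical sketch: an adaptive FIR filter whose step size is
    normalized by the larger of two smoothed input-power estimates.
    Returns (enhanced_signal, final_filter_coefficients).
    """
    w = [0.0] * n_taps        # FIR filter coefficients
    buf = [0.0] * n_taps      # recent accelerometer samples, newest first
    p_slow = p_fast = 0.0     # power estimates (large / small smoothing)
    enhanced = []
    for x, d in zip(accel, mic):
        buf = [x] + buf[:-1]
        # Filter output: the "enhanced speech signal" of the abstract.
        y = sum(wk * bk for wk, bk in zip(w, buf))
        # Estimation error: microphone input minus enhanced signal.
        e = d - y
        # Two smoothed power estimates; keep the maximum (assumed reading).
        p_slow = a_large * p_slow + (1.0 - a_large) * x * x
        p_fast = a_small * p_fast + (1.0 - a_small) * x * x
        power = max(p_slow, p_fast)
        # Power-normalized coefficient update.
        step = mu / (n_taps * power + eps)
        w = [wk + step * e * bk for wk, bk in zip(w, buf)]
        enhanced.append(y)
    return enhanced, w


# Demo on synthetic data: the "microphone" is a fixed FIR filtering of the
# "accelerometer" signal, so the adapted coefficients should approach true_h.
random.seed(0)
accel = [random.gauss(0.0, 1.0) for _ in range(5000)]
true_h = [0.6, -0.3, 0.1, 0.05]
mic = [
    sum(h * accel[n - k] for k, h in enumerate(true_h) if n - k >= 0)
    for n in range(len(accel))
]
enhanced, w_est = enhance(accel, mic)
```

In this sketch the small smoothing parameter lets the power estimate react quickly to sudden increases in input level, while the large one keeps it from collapsing during brief quiet stretches; taking the maximum keeps the normalized step size conservative in both cases.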
