TL;DR: It is contended that the multisensory speech system is maximally tuned for SNRs between extremes, where the system relies on either the visual (speech-reading) or the auditory modality alone, forming a window of maximal integration at intermediate SNR levels.
Abstract: Viewing a speaker’s articulatory movements substantially improves a listener’s ability to understand spoken words, especially under noisy environmental conditions. It has been claimed that this gain is most pronounced when auditory input is weakest, an effect that has been related to a well-known principle of multisensory integration—‘‘inverse effectiveness.’’ In keeping with the predictions of this principle, the present study showed substantial gain in multisensory speech enhancement at even the lowest signal-tonoise ratios (SNRs) used (224 dB), but it was also evident that there was a ‘‘special zone’’ at a more intermediate SNR of 212 dB where multisensory integration was additionally enhanced beyond the predictions of this principle. As such, we show that inverse effectiveness does not strictly apply to the multisensory enhancements seen during audiovisual speech perception. Rather, the gain from viewing visual articulations is maximal at intermediate SNRs, well above the lowest auditory SNR where the recognition of whole words is significantly different from zero. We contend that the multisensory speech system is maximally tuned for SNRs between extremes, where the system relies on either the visual (speechreading) or the auditory modality alone, forming a window of maximal integration at intermediate SNR levels. At these intermediate levels, the extent of multisensory enhancement of speech recognition is considerable, amounting to more than a 3-fold performance improvement relative to an auditory-alone condition.
TL;DR: The data suggest that visual cues derived from the dynamic movements of the fact during speech production interact with time-aligned auditory cues to enhance sensitivity in auditory detection.
Abstract: Classic accounts of the benefits of speechreading to speech recognition treat auditory and visual channels as independent sources of information that are integrated fairly early in the speech perception process. The primary question addressed in this study was whether visible movements of the speech articulators could be used to improve the detection of speech in noise, thus demonstrating an influence of speechreading on the ability to detect, rather than recognize, speech. In the first experiment, ten normal-hearing subjects detected the presence of three known spoken sentences in noise under three conditions: auditory-only (A), auditory plus speechreading with a visually matched sentence (AV(M)) and auditory plus speechreading with a visually unmatched sentence (AV(UM). When the speechread sentence matched the target sentence, average detection thresholds improved by about 1.6 dB relative to the auditory condition. However, the amount of threshold reduction varied significantly for the three target sentences (from 0.8 to 2.2 dB). There was no difference in detection thresholds between the AV(UM) condition and the A condition. In a second experiment, the effects of visually matched orthographic stimuli on detection thresholds was examined for the same three target sentences in six subjects who participated in the earlier experiment. When the orthographic stimuli were presented just prior to each trial, average detection thresholds improved by about 0.5 dB relative to the A condition. However, unlike the AV(M) condition, the detection improvement due to orthography was not dependent on the target sentence. Analyses of correlations between area of mouth opening and acoustic envelopes derived from selected spectral regions of each sentence (corresponding to the wide-band speech, and first, second, and third formant regions) suggested that AV(M) threshold reduction may be determined by the degree of auditory-visual temporal coherence, especially between the area of lip opening and the envelope derived from mid- to high-frequency acoustic energy. Taken together, the data (for these sentences at least) suggest that visual cues derived from the dynamic movements of the fact during speech production interact with time-aligned auditory cues to enhance sensitivity in auditory detection. The amount of visual influence depends in part on the degree of correlation between acoustic envelopes and visible movement of the articulators.
TL;DR: Integration modeling results suggested that speechreading and AV integration training could be useful for some individuals, potentially providing as much as 26% improvement in AV consonant recognition.
Abstract: Factors leading to variability in auditory-visual (AV) speech recognition include the subject's ability to extract auditory (A) and visual (V) signal-related cues, the integration of A and V cues, and the use of phonological, syntactic, and semantic context. In this study, measures of A, V, and AV recognition of medial consonants in isolated nonsense syllables and of words in sentences were obtained in a group of 29 hearing-impaired subjects. The test materials were presented in a background of speech-shaped noise at 0-dB signal-to-noise ratio. Most subjects achieved substantial AV benefit for both sets of materials relative to A-alone recognition performance. However, there was considerable variability in AV speech recognition both in terms of the overall recognition score achieved and in the amount of audiovisual gain. To account for this variability, consonant confusions were analyzed in terms of phonetic features to determine the degree of redundancy between A and V sources of information. In addition, a measure of integration ability was derived for each subject using recently developed models of AV integration. The results indicated that (1) AV feature reception was determined primarily by visual place cues and auditory voicing + manner cues, (2) the ability to integrate A and V consonant cues varied significantly across subjects, with better integrators achieving more AV benefit, and (3) significant intra-modality correlations were found between consonant measures and sentence measures, with AV consonant scores accounting for approximately 54% of the variability observed for AV sentence recognition. Integration modeling results suggested that speechreading and AV integration training could be useful for some individuals, potentially providing as much as 26% improvement in AV consonant recognition.
TL;DR: It is shown that CI users present higher visuoauditory performance when compared with normally hearing subjects with similar auditory stimuli, and this better performance is not only due to greater speechreading performance, but, most importantly, also due to a greater capacity to integrate visual input with the distorted speech signal.
Abstract: The cochlear implant (CI) is a neuroprosthesis that allows profoundly deaf patients to recover speech intelligibility. This recovery goes through long-term adaptative processes to build coherent percepts from the coarse information delivered by the implant. Here we analyzed the longitudinal postimplantation evolution of word recognition in a large sample of CI users in unisensory (visual or auditory) and bisensory (visuoauditory) conditions. We found that, despite considerable recovery of auditory performance during the first year postimplantation, CI patients maintain a much higher level of word recognition in speechreading conditions compared with normally hearing subjects, even several years after implantation. Consequently, we show that CI users present higher visuoauditory performance when compared with normally hearing subjects with similar auditory stimuli. This better performance is not only due to greater speechreading performance, but, most importantly, also due to a greater capacity to integrate visual input with the distorted speech signal. Our results suggest that these behavioral changes in CI users might be mediated by a reorganization of the cortical network involved in speech recognition that favors a more specific involvement of visual areas. Furthermore, they provide crucial indications to guide the rehabilitation of CI patients by using visually oriented therapeutic strategies.
TL;DR: In this article, the authors present a series of strategies for people with hearing loss to cope with the effects of hearing loss, including counseling, psychosocial support, and assertiveness training.
Abstract: Chapter 1: Introduction PART 1: SPEECH RECOGNITION AND PERSONS WHO HAVE HEARING LOSS Chapter 2: Assessing Hearing Acuity and Speech Recognition Chapter 3: Listening Devices and Related Technology Chapter 4: Auditory Training Chapter 5: Speechreading Chapter 6: Speechreading Training PART II: CONVERSATION AND COMMUNICATION BEHAVIORS Chapter 7: Communication Strategies and Conversational Styles Chapter 8: Assessment of Conversational Fluency and Communication Difficulties Chapter 9: Communication Strategies Training Chapter 10: Counseling, Psychosocial Support, and Assertiveness Training PART III: AURAL REHABILITATION FOR ADULTS Chapter 11: Adults Who Have Hearing Loss Chapter 12: Aural Rehabilitation Plans for Adults Chapter 13: Aural Rehabilitation Plans for Older Adults PART IV: AURAL (RE)HABILITATION FOR CHILDREN Chapter 14: Infants and Toddlers Who Have Hearing Loss Chapter 15: School-Age Children Who Have Hearing Loss