TL;DR: Ensemble fMRI signals in early visual areas reliably predicted, on individual trials, which of eight stimulus orientations the subject was seeing; when subjects attended to one of two overlapping orthogonal gratings, ensemble activity was strongly biased toward the attended orientation.
Abstract: The potential for human neuroimaging to read out the detailed contents of a person's mental state has yet to be fully explored. We investigated whether the perception of edge orientation, a fundamental visual feature, can be decoded from human brain activity measured with functional magnetic resonance imaging (fMRI). Using statistical algorithms to classify brain states, we found that ensemble fMRI signals in early visual areas could reliably predict on individual trials which of eight stimulus orientations the subject was seeing. Moreover, when subjects had to attend to one of two overlapping orthogonal gratings, feature-based attention strongly biased ensemble activity toward the attended orientation. These results demonstrate that fMRI activity patterns in early visual areas, including primary visual cortex (V1), contain detailed orientation information that can reliably predict subjective perception. Our approach provides a framework for the readout of fine-tuned representations in the human brain and their subjective contents.
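Illustrative sketch (not from the paper): the abstract above describes classifying brain states from ensemble voxel activity on single trials. The Python example below shows the general idea on synthetic data, assuming each voxel carries a weak, random orientation bias. It uses a nearest-centroid classifier with cross-validation as a simple stand-in; it is not the authors' classifier, and all names, sizes, and noise levels are hypothetical.

```python
import numpy as np


def simulate_patterns(n_voxels=200, n_orient=8, n_trials=20, noise=1.0, seed=0):
    """Synthetic data: each voxel gets a weak random bias per orientation (assumption)."""
    rng = np.random.default_rng(seed)
    tuning = rng.normal(0.0, 0.3, size=(n_orient, n_voxels))  # hypothetical biases
    labels = np.repeat(np.arange(n_orient), n_trials)
    data = tuning[labels] + rng.normal(0.0, noise, size=(labels.size, n_voxels))
    return data, labels


def nearest_centroid_cv(data, labels, n_folds=5, seed=0):
    """Cross-validated accuracy of a nearest-centroid decoder on trial patterns."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(labels.size), n_folds)
    classes = np.unique(labels)
    correct = 0
    for test_idx in folds:
        train = np.ones(labels.size, dtype=bool)
        train[test_idx] = False
        # Mean training pattern per orientation
        centroids = np.stack(
            [data[train & (labels == c)].mean(axis=0) for c in classes]
        )
        # Squared Euclidean distance from each test trial to each centroid
        dist = ((data[test_idx, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        pred = classes[dist.argmin(axis=1)]
        correct += int((pred == labels[test_idx]).sum())
    return correct / labels.size


data, labels = simulate_patterns()
accuracy = nearest_centroid_cv(data, labels)
print(f"cross-validated accuracy: {accuracy:.2f} (chance = {1 / 8:.3f})")
```

Individual voxels are only weakly informative in this simulation; pooling many of them is what makes single-trial decoding possible, which is the point of the "ensemble" readout.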
TL;DR: A synthetic overview of the rich literature on MT is attempted with the goal of answering the question, What does MT do?
Abstract: The small visual area known as MT or V5 has played a major role in our understanding of the primate cerebral cortex. This area has been historically important in the concept of cortical processing streams and the idea that different visual areas constitute highly specialized representations of visual information. MT has also proven to be a fertile culture dish--full of direction- and disparity-selective neurons--exploited by many labs to study the neural circuits underlying computations of motion and depth and to examine the relationship between neural activity and perception. Here we attempt a synthetic overview of the rich literature on MT with the goal of answering the question, What does MT do?
TL;DR: It is suggested that the transition toward access to consciousness relates to the optional triggering of a late wave of activation that spreads through a distributed network of cortical association areas.
Abstract: In the phenomenon of attentional blink, identical visual stimuli are sometimes fully perceived and sometimes not detected at all. This phenomenon thus provides an optimal situation to study the fate of stimuli not consciously perceived and the differences between conscious and nonconscious processing. We correlated behavioral visibility ratings and recordings of event-related potentials to study the temporal dynamics of access to consciousness. Intact early potentials (P1 and N1) were evoked by unseen words, suggesting that these brain events are not the primary correlates of conscious perception. However, we observed a rapid divergence around 270 ms, after which several brain events were evoked solely by seen words. Thus, we suggest that the transition toward access to consciousness relates to the optional triggering of a late wave of activation that spreads through a distributed network of cortical association areas.
TL;DR: The observed auditory-visual data support the view that there exist abstract internal representations that constrain the analysis of subsequent speech inputs, and provide evidence for the existence of an "analysis-by-synthesis" mechanism in auditory-visual speech perception.
Abstract: Synchronous presentation of stimuli to the auditory and visual systems can modify the formation of a percept in either modality. For example, perception of auditory speech is improved when the speaker's facial articulatory movements are visible. Neural convergence onto multisensory sites exhibiting supra-additivity has been proposed as the principal mechanism for integration. Recent findings, however, have suggested that putative sensory-specific cortices are responsive to inputs presented through a different modality. Consequently, when and where audiovisual representations emerge remain unsettled. In combined psychophysical and electroencephalography experiments we show that visual speech speeds up the cortical processing of auditory signals early (within 100 ms of signal onset). The auditory–visual interaction is reflected as an articulator-specific temporal facilitation (as well as a nonspecific amplitude reduction). The latency facilitation systematically depends on the degree to which the visual signal predicts possible auditory targets. The observed auditory–visual data support the view that there exist abstract internal representations that constrain the analysis of subsequent speech inputs. This is evidence for the existence of an “analysis-by-synthesis” mechanism in auditory–visual speech perception.
TL;DR: The experiments reported in this article explore the automaticity of VSL in several ways, using both explicit familiarity and implicit response-time measures, and fuel the conclusion that VSL both is and is not automatic.
Abstract: The visual environment contains massive amounts of information involving the relations between objects in space and time, and recent studies of visual statistical learning (VSL) have suggested that this information can be automatically extracted by the visual system. The experiments reported in this article explore the automaticity of VSL in several ways, using both explicit familiarity and implicit response-time measures. The results demonstrate that (a) the input to VSL is gated by selective attention, (b) VSL is nevertheless an implicit process because it operates during a cover task and without awareness of the underlying statistical patterns, and (c) VSL constructs abstracted representations that are then invariant to changes in extraneous surface features. These results fuel the conclusion that VSL both is and is not automatic: It requires attention to select the relevant population of stimuli, but the resulting learning then occurs without intent or awareness.
TL;DR: The article summarizes the current state of research on change blindness and suggests future directions that promise to improve our understanding of scene perception and visual memory.
Abstract: People often fail to notice large changes to visual scenes, a phenomenon now known as change blindness. The extent of change blindness in visual perception suggests limits on our capacity to encode, retain, and compare visual information from one glance to the next; our awareness of our visual surroundings is far more sparse than most people intuitively believe. These failures of awareness and the erroneous intuitions that often accompany them have both theoretical and practical ramifications. This article briefly summarizes the current state of research on change blindness and suggests future directions that promise to improve our understanding of scene perception and visual memory.
TL;DR: Research is progressing with the goals of defining a single “standard model” for each stage of the visual pathway and testing the predictive power of these models on the responses to movies of natural scenes, which would be an invaluable guide for understanding the underlying biophysical and anatomical mechanisms and relating neural responses to visual perception.
Abstract: We can claim that we know what the visual system does once we can predict neural responses to arbitrary stimuli, including those seen in nature. In the early visual system, models based on one or more linear receptive fields hold promise to achieve this goal as long as the models include nonlinear mechanisms that control responsiveness, based on stimulus context and history, and take into account the nonlinearity of spike generation. These linear and nonlinear mechanisms might be the only essential determinants of the response, or alternatively, there may be additional fundamental determinants yet to be identified. Research is progressing with the goals of defining a single "standard model" for each stage of the visual pathway and testing the predictive power of these models on the responses to movies of natural scenes. These predictive models represent, at a given stage of the visual pathway, a compact description of visual computation. They would be an invaluable guide for understanding the underlying biophysical and anatomical mechanisms and relating neural responses to visual perception.
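Illustrative sketch (not from the paper): the abstract above describes models built from a linear receptive field plus nonlinear mechanisms that control responsiveness and generate spikes. The Python example below is a minimal version under stated assumptions: a single linear filter, divisive gain control by the stimulus RMS contrast, half-wave rectification, and Poisson spiking. The function names and the parameters `sigma`, `gain`, and `dt` are hypothetical, and stimulus history is not modeled.

```python
import numpy as np


def ln_rate(stimulus, rf, sigma=0.5, gain=20.0):
    """Firing rate (spikes/s) of a linear-nonlinear unit with contrast gain control."""
    stimulus = np.asarray(stimulus, dtype=float)
    drive = stimulus @ rf                               # linear receptive field
    contrast = np.sqrt(np.mean(stimulus ** 2, axis=-1))  # RMS contrast of the stimulus
    normalized = drive / (sigma + contrast)              # divisive gain control
    return gain * np.maximum(normalized, 0.0)            # rectification: rates are >= 0


def poisson_spikes(rate, dt=0.1, seed=0):
    """Spike counts in a window of dt seconds, drawn from a Poisson distribution."""
    rng = np.random.default_rng(seed)
    return rng.poisson(np.asarray(rate) * dt)


# A toy one-dimensional receptive field and a stimulus matched to it
rf = np.array([-0.5, 1.0, -0.5])
stim = np.array([-0.2, 0.4, -0.2])
low = ln_rate(stim, rf)
high = ln_rate(2 * stim, rf)  # doubled contrast gives less than doubled rate
print(low, high, poisson_spikes([low, high]))
```

Because of the divisive term, doubling the stimulus contrast less than doubles the predicted rate, which is the kind of context-dependent nonlinearity the abstract refers to.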
TL;DR: It is suggested that abnormalities in the superior temporal sulcus (STS) may provide a neural basis for the range of motion-processing deficits observed in ASD, including biological motion perception.
TL;DR: NTVA provides a mathematical framework to unify the 2 fields of research--formulas bridging cognition and neurophysiology.
Abstract: A neural theory of visual attention (NTVA) is presented. NTVA is a neural interpretation of C. Bundesen's (1990) theory of visual attention (TVA). In NTVA, visual processing capacity is distributed across stimuli by dynamic remapping of receptive fields of cortical cells such that more processing resources (cells) are devoted to behaviorally important objects than to less important ones. By use of the same basic equations used in TVA, NTVA accounts for a wide range of known attentional effects in human performance (reaction times and error rates) and a wide range of effects observed in firing rates of single cells in the primate visual system. NTVA provides a mathematical framework to unify the 2 fields of research--formulas bridging cognition and neurophysiology.
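Illustrative sketch (not from the paper): the abstract refers to the basic equations of TVA without stating them. As commonly written, they are the rate equation v(x, i) = eta(x, i) * beta_i * w_x / sum_z w_z and the weight equation w_x = sum_j eta(x, j) * pi_j. Here eta is the sensory evidence that object x belongs to category i, beta is a perceptual bias, and pi is a pertinence value. The Python example below evaluates these two equations for toy inputs; the function name and the numbers are hypothetical.

```python
import numpy as np


def tva_rates(eta, beta, pi):
    """Processing rates v[x, i] from the TVA rate and weight equations.

    eta:  (n_objects, n_categories) sensory evidence
    beta: (n_categories,) perceptual bias per category
    pi:   (n_categories,) pertinence per category
    """
    eta = np.asarray(eta, dtype=float)
    beta = np.asarray(beta, dtype=float)
    pi = np.asarray(pi, dtype=float)
    weights = eta @ pi                   # w_x = sum_j eta(x, j) * pi_j
    relative = weights / weights.sum()   # w_x / sum_z w_z
    return eta * beta[None, :] * relative[:, None]


# Two objects, two categories; object 0 mostly matches category 0
eta = [[1.0, 0.1], [0.1, 1.0]]
beta = [0.5, 0.5]
neutral = tva_rates(eta, beta, pi=[1.0, 1.0])   # both categories equally pertinent
attended = tva_rates(eta, beta, pi=[1.0, 0.0])  # only category 0 is pertinent
print(neutral[0, 0], attended[0, 0])
```

Raising the pertinence of one category shifts attentional weight toward objects that match it, so their processing rates rise at the expense of the other objects.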
TL;DR: The first demonstration of concurrent enhanced and decreased performance in autism on the same visuo-spatial static task is presented, wherein the only factor dichotomizing performance was the neural complexity required to discriminate grating orientation.
Abstract: Visuo-perceptual processing in autism is characterized by intact or enhanced performance on static spatial tasks and inferior performance on dynamic tasks, suggesting a deficit of dorsal visual stream processing in autism. However, previous findings by Bertone et al. indicate that neuro-integrative mechanisms used to detect complex motion, rather than motion perception per se, may be impaired in autism. We present here the first demonstration of concurrent enhanced and decreased performance in autism on the same visuo-spatial static task, wherein the only factor dichotomizing performance was the neural complexity required to discriminate grating orientation. The ability of persons with autism was found to be superior for identifying the orientation of simple, luminance-defined (or first-order) gratings but inferior for complex, texture-defined (or second-order) gratings. Using a flicker contrast sensitivity task, we demonstrated that this finding is probably not due to abnormal information processing at a sub-cortical level (magnocellular and parvocellular functioning). Together, these findings are interpreted as a clear indication of altered low-level perceptual information processing in autism, and confirm that the deficits and assets observed in autistic visual perception are contingent on the complexity of the neural network required to process a given type of visual stimulus. We suggest that atypical neural connectivity, resulting in enhanced lateral inhibition, may account for both enhanced and decreased low-level information processing in autism.
TL;DR: The authors found that the auditory modality displayed a quantitative learning advantage compared with vision and touch, and discovered qualitative learning biases among the senses: Primarily, audition afforded better learning for the final part of input sequences.
Abstract: The authors investigated the extent to which touch, vision, and audition mediate the processing of statistical regularities within sequential input. Few researchers have conducted rigorous comparisons across sensory modalities; in particular, the sense of touch has been virtually ignored. The current data reveal not only commonalities but also modality constraints affecting statistical learning across the senses. To be specific, the authors found that the auditory modality displayed a quantitative learning advantage compared with vision and touch. In addition, they discovered qualitative learning biases among the senses: Primarily, audition afforded better learning for the final part of input sequences. These findings are discussed in terms of whether statistical learning is likely to consist of a single, unitary mechanism or multiple, modality-constrained ones.
TL;DR: Results reveal a relative degree of autonomy of emotional processing in the human amygdala under some unattended conditions, but also important limitations to the notion of complete automaticity.
TL;DR: It is shown that the relative weighting of vision and proprioception depends both on the sensory modality of the target and on the information content of the visual feedback, and that these factors affect the two stages of planning independently.
Abstract: When planning target-directed reaching movements, human subjects combine visual and proprioceptive feedback to form two estimates of the arm's position: one to plan the reach direction, and another to convert that direction into a motor command. These position estimates are based on the same sensory signals but rely on different combinations of visual and proprioceptive input, suggesting that the brain weights sensory inputs differently depending on the computation being performed. Here we show that the relative weighting of vision and proprioception depends both on the sensory modality of the target and on the information content of the visual feedback, and that these factors affect the two stages of planning independently. The observed diversity of weightings demonstrates the flexibility of sensory integration and suggests a unifying principle by which the brain chooses sensory inputs so as to minimize errors arising from the transformation of sensory signals between coordinate frames.
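Illustrative sketch (not from the paper): the abstract suggests that the brain chooses sensory weights so as to minimize errors that arise when signals are transformed between coordinate frames. One simple way to express that idea, assumed here and not taken from the authors' model, is minimum-variance cue combination in which any signal that must be transformed picks up extra variance. The function name and the variance values are hypothetical.

```python
def cue_weights(var_vision, var_proprio, var_transform=0.0, transformed="none"):
    """Minimum-variance weights for vision and proprioception.

    `transformed` names the cue that must be converted to another coordinate
    frame ("vision", "proprio", or "none"); that cue gains `var_transform`.
    """
    if transformed == "vision":
        var_vision += var_transform
    elif transformed == "proprio":
        var_proprio += var_transform
    elif transformed != "none":
        raise ValueError("transformed must be 'vision', 'proprio', or 'none'")
    # Each cue is weighted by the other cue's share of the total variance
    w_vision = var_proprio / (var_vision + var_proprio)
    return w_vision, 1.0 - w_vision


# Equal sensory noise, but different computations require different transformations
print(cue_weights(1.0, 1.0))                                  # no transformation
print(cue_weights(1.0, 1.0, 1.0, transformed="vision"))       # vision is penalized
print(cue_weights(1.0, 1.0, 1.0, transformed="proprio"))      # proprioception is penalized
```

Under this assumption the same two sensory signals receive different weights depending on which one has to be transformed for the computation at hand, which is one way to read the diversity of weightings reported above.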
TL;DR: Participants viewed sagittal point-light displays of themselves, their friends, and strangers performing various actions; the results suggest that both motor and visual experience define visual sensitivity to human action.
Abstract: Human observers demonstrate impressive visual sensitivity to human movement. What defines this sensitivity? If motor experience influences the visual analysis of action, then observers should be most sensitive to their own movements. If view-dependent visual experience determines visual sensitivity to human movement, then observers should be most sensitive to the movements of their friends. To test these predictions, participants viewed sagittal displays of point-light depictions of themselves, their friends, and strangers performing various actions. In actor identification and discrimination tasks, sensitivity to one's own motion was highest. Visual sensitivity to friends', but not strangers', actions was above chance. Performance was action dependent. Control studies yielded chance performance with inverted and static displays, suggesting that form and low-motion cues did not define performance. These results suggest that both motor and visual experience define visual sensitivity to human action.
TL;DR: Functional activity in the visual cortex and amygdala was measured with fMRI while selected fearful and control participants viewed a range of neutral, emotionally arousing, and fear-relevant pictures; the results suggest an individually sensitive, positive linear relationship between the arousing quality of visual stimuli and activation in the amygdala and ventral visual cortex.
TL;DR: A novel Perception and Attention Deficit (PAD) model for RCVH is proposed, suggesting that a combination of impaired attentional binding and poor sensory activation of a correct proto-object, in conjunction with a relatively intact scene representation, allows the intrusion of a hallucinatory proto-object into a scene perception.
Abstract: As many as two million people in the United Kingdom repeatedly see people, animals, and objects that have no objective reality. Hallucinations on the border of sleep, dementing illnesses, delirium, eye disease, and schizophrenia account for 90% of these. The remainder have rarer disorders. We review existing models of recurrent complex visual hallucinations (RCVH) in the awake person, including cortical irritation, cortical hyperexcitability and cortical release, top-down activation, misperception, dream intrusion, and interactive models. We provide evidence that these can neither fully account for the phenomenology of RCVH, nor for variations in the frequency of RCVH in different disorders. We propose a novel Perception and Attention Deficit (PAD) model for RCVH. A combination of impaired attentional binding and poor sensory activation of a correct proto-object, in conjunction with a relatively intact scene representation, bias perception to allow the intrusion of a hallucinatory proto-object into a scene perception. Incorporation of this image into a context-specific hallucinatory scene representation accounts for repetitive hallucinations. We suggest that these impairments are underpinned by disturbances in a lateral frontal cortex-ventral visual stream system. We show how the frequency of RCVH in different diseases is related to the coexistence of attentional and visual perceptual impairments; how attentional and perceptual processes can account for their phenomenology; and that diseases and other states with high rates of RCVH have cholinergic dysfunction in both frontal cortex and the ventral visual stream. Several tests of the model are indicated, together with a number of treatment options that it generates.
TL;DR: Within superior temporal sulcus, a patchy organization of regions is activated in response to auditory, visual and multisensory stimuli, suggesting that it is an anatomical substrate for multisensory integration.
TL;DR: High-resolution functional magnetic resonance imaging is used in conjunction with a new binocular rivalry stimulus to show that signals recorded from the human lateral geniculate nucleus (LGN) exhibit eye-specific suppression during rivalry.
Abstract: When our eyes are presented with incompatible images, our conscious perception fluctuates spontaneously between each monocular view. The nature of the resulting 'binocular rivalry', and how the brain resolves it, is the subject of a long-standing debate that touches on fundamental aspects of human cognition such as attention and selection. Now a neural signature characteristic for binocular rivalry has been identified, at the very earliest stages of visual processing, in the human lateral geniculate nucleus (LGN). This region of the brain contains cells that respond only to stimulation of one or other eye, and the signals in the LGN closely reflect the perceptual dominance seen during binocular rivalry. When dissimilar images are presented to the two eyes, they compete for perceptual dominance so that each image is visible in turn for a few seconds while the other is suppressed. Such binocular rivalry is associated with relative suppression of local, eye-based representations [1,2,3,4] that can also be modulated by high-level influences such as perceptual grouping [3,5,6]. However, it is currently unclear how early in visual processing the suppression of eye-based signals can occur. Here we use high-resolution functional magnetic resonance imaging (fMRI) in conjunction with a new binocular rivalry stimulus to show that signals recorded from the human lateral geniculate nucleus (LGN) exhibit eye-specific suppression during rivalry. Regions of the LGN that show strong eye-preference independently show strongly reduced activity during binocular rivalry when the stimulus presented in their preferred eye is perceptually suppressed. The human LGN is thus the earliest stage of visual processing that reflects eye-specific dominance and suppression.
TL;DR: The nature of a processing system in which such a dual use of early visual cortex (in perception and in imagery) makes sense is outlined.
Abstract: One theory of visual mental imagery posits that early visual cortex is also used to support representations during imagery. This claim is important because it bears on the "imagery debate": Early visual cortex supports depictive representations during perception, not descriptive ones. Thus, if such cortex also plays a functional role in imagery, this is strong evidence that imagery does not rely exclusively on the same sorts of representations that underlie language. The present article first outlines the nature of a processing system in which such a dual use of early visual cortex (in perception and in imagery) makes sense. Following this, literature bearing on the claim that early visual cortex is used in visual mental imagery is reviewed, and key issues are discussed.
TL;DR: The specificity of the processing mechanisms required to construct simulations during language comprehension is explored, and it is suggested that these mechanisms can be quite specific.
TL;DR: The results suggest that means are computed automatically and in parallel after an initial preattentive segregation by color, and that sets segregated by color gave mean discrimination thresholds for size that were as accurate as sets segregated by location.
TL;DR: Visual speech perception activated Heschl's gyri in nine subjects, with activation in seven of them extending to the area of primary auditory cortex; a significant hemisphere-by-stimulus interaction suggested left Heschl's gyrus specialization for visual speech processing.
Abstract: Recent studies have yielded contradictory evidence on whether visual speech perception (watching articulatory gestures) can activate the human primary auditory cortex. To circumvent confounds due to inter-individual anatomical variation, we defined our subjects' Heschl's gyri and assessed blood oxygenation-dependent signal changes at 3 T within this confined region during visual speech perception and observation of moving circles. Visual speech perception activated Heschl's gyri in nine subjects, with activation in seven of them extending to the area of primary auditory cortex. Activation was significantly stronger during visual speech perception than during observation of the moving circles. Further, a significant hemisphere by stimulus interaction occurred, suggesting left Heschl's gyrus specialization for visual speech processing.
TL;DR: The results have implications for the information needed to scale egocentric distance in the real-world and reduce the support for the hypothesis that a limited field of view or imperfections in binocular image presentation are the cause of the underestimation seen with HMDs.
Abstract: We carried out three experiments to examine the influence of field of view and binocular viewing restrictions on absolute distance perception in real-world indoor environments. Few of the classical visual cues provide direct information for accurate absolute distance judgments to points in the environment beyond about 2 m from the viewer. Nevertheless, in previous work it has been found that visually directed walking tasks reveal accurate distance estimations in full-cue real-world environments to distances up to 20 m. In contrast, the same tasks in virtual environments produced with head-mounted displays (HMDs) show large compression of distance. Field of view and binocular viewing are common limitations in research with HMDs, and have been rarely studied under full pictorial-cue conditions in the context of distance perception in the real-world. Experiment 1 showed that the view of one's body and feet on the floor was not necessary for accurate distance perception. In experiment 2 we manipulated the horizontal and the vertical field of view along with head rotation and found that a restricted field of view did not affect the accuracy of distance estimations when head movement was allowed. Experiment 3 showed that performance with monocular viewing was equal to that with binocular viewing. These results have implications for the information needed to scale egocentric distance in the real-world and reduce the support for the hypothesis that a limited field of view or imperfections in binocular image presentation are the cause of the underestimation seen with HMDs.
TL;DR: Using fMRI, the authors studied subcortical processing during binocular rivalry and found that LGN activity correlated strongly with subjects' reported percepts, providing evidence for a functional role of the LGN in binocular rivalry and suggesting that it may be an early gatekeeper of visual awareness.
Abstract: When dissimilar images are presented to the two eyes, they compete for perceptual dominance so that only one image is visible at a time while the other one is suppressed. Neural correlates of such binocular rivalry have been found at multiple stages of visual processing, including striate and extrastriate visual cortex. However, little is known about the role of subcortical processing during binocular rivalry. Here we used fMRI to measure neural activity in the human LGN while subjects viewed contrast-modulated gratings presented dichoptically. Neural activity in the LGN correlated strongly with the subjects' reported percepts, such that activity increased when a high-contrast grating was perceived and decreased when a low-contrast grating was perceived. Our results provide evidence for a functional role of the LGN in binocular rivalry and suggest that the LGN, traditionally viewed as the gateway to the visual cortex, may be an early gatekeeper of visual awareness.
TL;DR: In two adult MD subjects with extensive bilateral central retinal lesions, it is found that parts of visual cortex that normally respond only to central visual stimuli are strongly activated by peripheral stimuli.
Abstract: Macular degeneration (MD), the leading cause of visual impairment in the developed world, damages the central retina, often obliterating foveal vision and severely disrupting everyday tasks such as reading, driving, and face recognition. In such cases, the macular damage eliminates the normal retinal input to a large region of visual cortex, comprising tens of square centimeters of surface area in each hemisphere, which is normally responsive only to foveal stimuli. Using functional magnetic resonance imaging, we asked whether this deprived cortex simply becomes inactive in subjects with MD, or whether it takes on new functional properties. In two adult MD subjects with extensive bilateral central retinal lesions, we found that parts of visual cortex (including primary visual cortex) that normally respond only to central visual stimuli are strongly activated by peripheral stimuli. Such activation was not observed (1) with visual stimuli presented to the position of the former fovea and (2) in control subjects with visual stimuli presented to corresponding parts of peripheral retina. These results demonstrate large-scale reorganization of visual processing in MD and will likely prove important in any effort to develop new strategies for rehabilitation of MD subjects.
TL;DR: The brain's response to task-irrelevant sounds occurring synchronously with a visual stimulus from a different location was larger when the accompanying visual stimulus was attended than when it was unattended, indicating that visual attention can spread to other components of a multisensory object defined by synchronous, but spatially disparate, auditory and visual stimuli.
Abstract: Attending to a stimulus is known to enhance the neural responses to that stimulus. Recent experiments on visual attention have shown that this modulation can have object-based characteristics, such that, when certain parts of a visual object are attended, other parts automatically also receive enhanced processing. Here, we investigated whether visual attention can modulate neural responses to other components of a multisensory object defined by synchronous, but spatially disparate, auditory and visual stimuli. The audiovisual integration of such multisensory stimuli typically leads to mislocalization of the sound toward the visual stimulus (ventriloquism illusion). Using event-related potentials and functional MRI, we found that the brain's response to task-irrelevant sounds occurring synchronously with a visual stimulus from a different location was larger when that accompanying visual stimulus was attended versus unattended. The event-related potential effect consisted of sustained, frontally distributed, brain activity that emerged relatively late in processing, an effect resembling attention-related enhancements seen at earlier latencies during intramodal auditory attention. Moreover, the functional MRI data confirmed that the effect included specific enhancement of activity in auditory cortex. These findings indicate that attention to one sensory modality can spread to encompass simultaneous signals from another modality, even when they are task-irrelevant and from a different location. This cross-modal attentional spread appears to reflect an object-based, late selection process wherein spatially discrepant auditory stimulation is grouped with synchronous attended visual input into a multisensory object, resulting in the auditory information being pulled into the attentional spotlight and bestowed with enhanced processing.
TL;DR: The participants in all three experiments were more likely to report the stimuli as being simultaneous when they originated from the same spatial position than when they came from different positions, demonstrating that the apparent perception of multisensory simultaneity is dependent on the relative spatial position from which stimuli are presented.
Abstract: The relative spatiotemporal correspondence between sensory events affects multisensory integration across a variety of species; integration is maximal when stimuli in different sensory modalities are presented from approximately the same position at about the same time. In the present study, we investigated the influence of spatial and temporal factors on audio-visual simultaneity perception in humans. Participants made unspeeded simultaneous versus successive discrimination responses to pairs of auditory and visual stimuli presented at varying stimulus onset asynchronies from either the same or different spatial positions using either the method of constant stimuli (Experiments 1 and 2) or psychophysical staircases (Experiment 3). The participants in all three experiments were more likely to report the stimuli as being simultaneous when they originated from the same spatial position than when they came from different positions, demonstrating that the apparent perception of multisensory simultaneity is dependent on the relative spatial position from which stimuli are presented.
TL;DR: A masking paradigm is used to determine how information accumulates over time during high-level categorisation tasks, implying that processing at each stage in the visual system is remarkably rapid, with information accumulating almost continuously following the onset of activation.
TL;DR: An ideal observer is developed (derived using Bayes' rule), and human judgements are compared with those of the ideal observer for this task, showing that the sound-induced flash illusion is an epiphenomenon of this general, statistically optimal strategy.
Abstract: Recently, it has been shown that visual perception can be radically altered by signals of other modalities. For example, when a single flash is accompanied by multiple auditory beeps, it is often perceived as multiple flashes. This effect is known as the sound-induced flash illusion. In order to investigate the principles underlying this illusion, we developed an ideal observer (derived using Bayes' rule), and compared human judgements with those of the ideal observer for this task. The human observer's performance was highly consistent with that of the ideal observer in all conditions ranging from no interaction, to partial integration, to complete integration, suggesting that the rule used by the nervous system to decide when and how to combine auditory and visual signals is statistically optimal. Our findings show that the sound-induced flash illusion is an epiphenomenon of this general, statistically optimal strategy.
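Illustrative sketch (not from the paper): the abstract describes an ideal observer derived using Bayes' rule that covers conditions from no interaction to complete integration. The Python example below is a minimal toy version under stated assumptions: the numbers of flashes and beeps range over 1 to 4, the joint prior favors equal counts, and each modality gives a Gaussian-noise measurement, with audition more reliable than vision. The function name, the prior, and all parameter values are hypothetical and are not the authors' fitted model.

```python
import numpy as np


def flash_posterior(x_a, x_v, sigma_a=0.3, sigma_v=0.8, p_common=0.7, max_count=4):
    """Posterior over the number of flashes given noisy auditory and visual counts.

    x_a, x_v: measured numbers of beeps and flashes
    p_common: prior mass placed on "beeps and flashes are equal in number"
    """
    counts = np.arange(1, max_count + 1)
    n = counts.size
    # Joint prior over (beeps, flashes): extra mass on the diagonal, rest uniform
    prior = p_common * np.eye(n) / n + (1.0 - p_common) * np.ones((n, n)) / n ** 2
    like_a = np.exp(-0.5 * ((x_a - counts) / sigma_a) ** 2)
    like_v = np.exp(-0.5 * ((x_v - counts) / sigma_v) ** 2)
    # Bayes' rule: rows index beeps, columns index flashes
    joint = prior * like_a[:, None] * like_v[None, :]
    joint /= joint.sum()
    return counts, joint.sum(axis=0)  # marginalize over the number of beeps


counts, post = flash_posterior(x_a=2.0, x_v=1.0)      # one flash, two beeps
print(counts[post.argmax()], post.round(3))
counts, post = flash_posterior(x_a=2.0, x_v=1.0, sigma_v=0.2)  # reliable vision
print(counts[post.argmax()], post.round(3))
```

With noisy vision the most probable number of flashes is pulled toward the number of beeps, while with reliable vision the two estimates stay separate, so the same rule produces both integration and no interaction.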