TL;DR: Results impose very serious constraints on the sorts of processing model that can be invoked and demonstrate that face-selective behavioral responses can be generated extremely rapidly.
Abstract: Previous work has demonstrated that the human visual system can detect animals in complex natural scenes very efficiently and rapidly. In particular, using a saccadic choice task, H. Kirchner and S. J. Thorpe (2006) found that when two images are simultaneously flashed in the left and right visual fields, saccades toward the side with an animal can be initiated in as little as 120-130 ms. Here we show that saccades toward human faces are even faster, with the earliest reliable saccades occurring in just 100-110 ms, and mean reaction times of roughly 140 ms. Intriguingly, it appears that these very fast saccades are not completely under instructional control, because when faces were paired with photographs of vehicles, fast saccades were still biased toward faces even when the subject was targeting vehicles. Finally, we tested whether these very fast saccades might only occur in the simple case where the images are presented left and right of fixation by showing they also occur when the images are presented above and below fixation. Such results impose very serious constraints on the sorts of processing model that can be invoked and demonstrate that face-selective behavioral responses can be generated extremely rapidly.
TL;DR: The stimuli types often used in laboratory experiments, static images and professionally cut material, are not very representative of natural viewing behavior, and eye movements on Hollywood movies are significantly more coherent than those on natural movies.
Abstract: How similar are the eye movement patterns of different subjects when free viewing dynamic natural scenes? We collected a large database of eye movements from 54 subjects on 18 high-resolution videos of outdoor scenes and measured their variability using the Normalized Scanpath Saliency, which we extended to the temporal domain. Even though up to about 80% of subjects looked at the same image region in some video parts, variability usually was much greater. Eye movements on natural movies were then compared with eye movements in several control conditions. "Stop-motion" movies had almost identical semantic content as the original videos but lacked continuous motion. Hollywood action movie trailers were used to probe the upper limit of eye movement coherence that can be achieved by deliberate camera work, scene cuts, etc. In a "repetitive" condition, subjects viewed the same movies ten times each over the course of 2 days. Results show several systematic differences between conditions both for general eye movement parameters such as saccade amplitude and fixation duration and for eye movement variability. Most importantly, eye movements on static images are initially driven by stimulus onset effects and later, more so than on continuous videos, by subject-specific idiosyncrasies; eye movements on Hollywood movies are significantly more coherent than those on natural movies. We conclude that the stimuli types often used in laboratory experiments, static images and professionally cut material, are not very representative of natural viewing behavior. All stimuli and gaze data are publicly available at http://www.inb.uni-luebeck.de/tools-demos/gaze.
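The Normalized Scanpath Saliency mentioned above can be illustrated with a minimal single-frame sketch. It assumes a gaze density map has already been built from the other observers, and it ignores the temporal extension the abstract describes. The function name `nss` and the toy map are hypothetical and not taken from the paper.

```python
import math

def nss(density_map, gaze_points):
    """Mean z-scored map value at the given (row, col) gaze positions."""
    values = [v for row in density_map for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var)
    if std == 0:
        raise ValueError("density map is constant; NSS is undefined")
    # Normalize to zero mean and unit standard deviation, then read out
    # the normalized value at each gaze position of the held-out observer.
    scores = [(density_map[r][c] - mean) / std for r, c in gaze_points]
    return sum(scores) / len(scores)

# Toy 3x3 density map (made-up numbers) with a single peak in the center.
toy_map = [
    [0.0, 0.0, 0.0],
    [0.0, 9.0, 0.0],
    [0.0, 0.0, 0.0],
]
on_peak = nss(toy_map, [(1, 1)])   # observer looks where the others looked
off_peak = nss(toy_map, [(0, 0)])  # observer looks elsewhere
```

A positive score means the held-out observer looked at locations the others also favored; a score near or below zero means gaze was no more coherent than chance for that map.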
TL;DR: The quick CSF (qCSF) method is presented, a Bayesian adaptive procedure that applies a strategy developed to estimate multiple parameters of the psychometric function to improve sensitivity estimates across all frequencies by directly estimating CSF parameters.
Abstract: The contrast sensitivity function (CSF) predicts functional vision better than acuity, but long testing times prevent its psychophysical assessment in clinical and practical applications. This study presents the quick CSF (qCSF) method, a Bayesian adaptive procedure that applies a strategy developed to estimate multiple parameters of the psychometric function (A. B. Cobo-Lewis, 1996; L. L. Kontsevich & C. W. Tyler, 1999). Before each trial, a one-step-ahead search finds the grating stimulus (defined by frequency and contrast) that maximizes the expected information gain (J. V. Kujala & T. J. Lukka, 2006; L. A. Lesmes et al., 2006) about four CSF parameters. By directly estimating CSF parameters, data collected at one spatial frequency improves sensitivity estimates across all frequencies. A psychophysical study validated that CSFs obtained with 100 qCSF trials (approximately 10 min) exhibited good precision across spatial frequencies (SD < 2-3 dB) and excellent agreement with CSFs obtained independently (mean RMSE = 0.86 dB). To estimate the broad sensitivity metric provided by the area under the log CSF (AULCSF), only 25 trials were needed to achieve a coefficient of variation of 15-20%. The current study demonstrates the method's value for basic and clinical investigations. Further studies, applying the qCSF to measure wider ranges of normal and abnormal vision, will determine how its efficiency translates to clinical assessment.
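The one-step-ahead search described above can be sketched in a much reduced form. The sketch below is not the qCSF itself: it estimates a single threshold parameter on a grid rather than four CSF parameters, and the logistic psychometric function, its slope, guess rate, and lapse rate are all assumptions made for illustration.

```python
import math

def p_correct(contrast_db, threshold_db, slope=0.5, guess=0.5, lapse=0.02):
    """Hypothetical 2AFC psychometric function (logistic in dB)."""
    core = 1.0 / (1.0 + math.exp(-slope * (contrast_db - threshold_db)))
    return guess + (1.0 - guess - lapse) * core

def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def posterior(prior, thresholds, stimulus, correct):
    """Bayes update of the threshold distribution after one response."""
    likes = [p_correct(stimulus, t) if correct else 1.0 - p_correct(stimulus, t)
             for t in thresholds]
    unnorm = [p * l for p, l in zip(prior, likes)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def expected_gain(prior, thresholds, stimulus):
    """Expected entropy reduction from presenting this stimulus once."""
    p_yes = sum(p * p_correct(stimulus, t) for p, t in zip(prior, thresholds))
    h_yes = entropy(posterior(prior, thresholds, stimulus, True))
    h_no = entropy(posterior(prior, thresholds, stimulus, False))
    return entropy(prior) - (p_yes * h_yes + (1.0 - p_yes) * h_no)

def best_stimulus(prior, thresholds, stimuli):
    """One-step-ahead search: pick the stimulus with the largest expected gain."""
    return max(stimuli, key=lambda s: expected_gain(prior, thresholds, s))

thresholds = list(range(0, 21))                   # candidate thresholds (dB)
prior = [1.0 / len(thresholds)] * len(thresholds)  # flat prior
stimuli = list(range(-10, 31, 2))                 # candidate contrasts (dB)
chosen = best_stimulus(prior, thresholds, stimuli)
gain = expected_gain(prior, thresholds, chosen)
```

In a full procedure the posterior after each response would become the prior for the next trial, and the grid would span all jointly estimated parameters, which is how data at one spatial frequency can inform estimates at the others.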
TL;DR: In this article, subjects were repeatedly presented with a background motion signal so weak that its direction was not visible; the invisible motion was an irrelevant background to the central task that engaged the subject's attention.
Abstract: The brain is able to adapt rapidly and continually to the surrounding environment, becoming increasingly sensitive to important and frequently encountered stimuli [1-4]. It is often claimed that this adaptive learning is highly task-specific, that is, we become more sensitive to the critical signals in the tasks we attend to [5-15]. Here, we show a new type of perceptual learning, which occurs without attention, without awareness and without any task relevance. Subjects were repeatedly presented with a background motion signal so weak that its direction was not visible; the invisible motion was an irrelevant background to the central task that engaged the subject's attention. Despite being below the threshold of visibility and being irrelevant to the central task, the repetitive exposure improved performance specifically for the direction of the exposed motion when tested in a subsequent suprathreshold test. These results suggest that a frequently presented feature sensitizes the visual system merely owing to its frequency, not its relevance or salience.
TL;DR: By linearly combining terms for goals and obstacles, one could predict whether participants adopt a route to the left or right of an obstacle to reach a goal, suggesting that route selection emerges from on-line steering dynamics without explicit path planning.
Abstract: The authors investigated the dynamics of steering and obstacle avoidance, with the aim of predicting routes through complex scenes. Participants walked in a virtual environment toward a goal (Experiment 1) and around an obstacle (Experiment 2) whose initial angle and distance varied. Goals and obstacles behave as attractors and repellers of heading, respectively, whose strengths depend on distance. The observed behavior was modeled as a dynamical system in which angular acceleration is a function of goal and obstacle angle and distance. By linearly combining terms for goals and obstacles, one could predict whether participants adopt a route to the left or right of an obstacle to reach a goal (Experiment 3). Route selection may emerge from on-line steering dynamics, making explicit path planning unnecessary.
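To make the attractor/repeller idea concrete, here is a minimal simulation sketch. The abstract states only that angular acceleration depends on goal and obstacle angle and distance; the damped form with exponential distance and angle terms used below, and every parameter value, are assumptions chosen for illustration rather than the authors' fitted model.

```python
import math

def wrap(angle):
    """Wrap an angle to the interval [-pi, pi)."""
    return (angle + math.pi) % (2 * math.pi) - math.pi

def simulate(goal, obstacle, speed=1.0, dt=0.01, steps=2000,
             b=3.25, k_g=7.5, c1=0.4, c2=0.4, k_o=198.0, c3=6.5, c4=0.8):
    """Euler-integrate heading dynamics; returns the walked (x, y) path.

    Heading phi is measured from the +y axis, positive to the right.
    All gains and decay constants are illustrative placeholders.
    """
    x, y, phi, dphi = 0.0, 0.0, 0.0, 0.0
    path = [(x, y)]
    for _ in range(steps):
        d_g = math.hypot(goal[0] - x, goal[1] - y)
        if d_g < 0.2:  # close enough to the goal: stop
            break
        d_o = math.hypot(obstacle[0] - x, obstacle[1] - y)
        err_g = wrap(phi - math.atan2(goal[0] - x, goal[1] - y))
        err_o = wrap(phi - math.atan2(obstacle[0] - x, obstacle[1] - y))
        # Goal pulls heading toward its direction (attractor).
        attract = -k_g * err_g * (math.exp(-c1 * d_g) + c2)
        # Obstacle pushes heading away; influence decays with angle and distance.
        repel = k_o * err_o * math.exp(-c3 * abs(err_o)) * math.exp(-c4 * d_o)
        dphi += (-b * dphi + attract + repel) * dt
        phi += dphi * dt
        x += speed * math.sin(phi) * dt
        y += speed * math.cos(phi) * dt
        path.append((x, y))
    return path

def x_when_passing(path, obstacle):
    """Lateral position at the moment the walker draws level with the obstacle."""
    return next(px for px, py in path if py >= obstacle[1])

goal = (0.0, 9.0)
obstacle = (0.2, 4.0)  # slightly right of the straight line to the goal
path = simulate(goal, obstacle)
passing_x = x_when_passing(path, obstacle)
closest = min(math.hypot(goal[0] - px, goal[1] - py) for px, py in path)
```

No route is planned anywhere in this loop: the side on which the walker passes the obstacle falls out of the summed attractor and repeller terms at each time step.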
TL;DR: In this article, the relative weights of visual and vestibular cues during self-motion were investigated in a 2-interval forced choice task and participants were asked to judge in which of the two intervals they moved more to the right.
Abstract: Self-motion through an environment involves a composite of signals such as visual and vestibular cues. Building upon previous results showing that visual and vestibular signals combine in a statistically optimal fashion, we investigated the relative weights of visual and vestibular cues during self-motion. This experiment comprised three experimental conditions: vestibular alone, visual alone (with four different standard heading values), and visual–vestibular combined. In the combined cue condition, inter-sensory conflicts were introduced (Δ = ±6° or ±10°). Participants performed a 2-interval forced choice task in all conditions and were asked to judge in which of the two intervals they moved more to the right. The cue-conflict condition revealed the relative weights associated with each modality. We found that even when there was a relatively large conflict between the visual and vestibular cues, participants exhibited a statistically optimal reduction in variance. On the other hand, we found that the pattern of results in the unimodal conditions did not predict the weights in the combined cue condition. Specifically, visual–vestibular cue combination was not predicted solely by the reliability of each cue, but rather more weight was given to the vestibular cue.
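For reference, the reliability-based prediction that the combined-cue weights are compared against can be sketched as follows. This is the standard maximum-likelihood cue-combination rule, assuming independent Gaussian noise on each cue; the function names and the example noise levels are hypothetical and not taken from the study.

```python
import math

def optimal_weights(sigma_vis, sigma_vest):
    """Weights proportional to each cue's reliability (1 / variance)."""
    r_vis = 1.0 / sigma_vis ** 2
    r_vest = 1.0 / sigma_vest ** 2
    w_vis = r_vis / (r_vis + r_vest)
    return w_vis, 1.0 - w_vis

def combined_sigma(sigma_vis, sigma_vest):
    """Predicted standard deviation of the optimally combined estimate."""
    var = (sigma_vis ** 2 * sigma_vest ** 2) / (sigma_vis ** 2 + sigma_vest ** 2)
    return math.sqrt(var)

# Made-up unimodal discrimination noise, in degrees of heading.
w_vis, w_vest = optimal_weights(sigma_vis=4.0, sigma_vest=2.0)
sigma_both = combined_sigma(sigma_vis=4.0, sigma_vest=2.0)
```

The abstract reports that the variance reduction matched this kind of prediction while the observed weights did not: the vestibular cue received more weight than its unimodal reliability alone would imply.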
TL;DR: The results suggest that saccade targeting and, by inference, attentional selection in scenes is object-based.
Abstract: Two contrasting views of visual attention in scenes are the visual salience and the cognitive relevance hypotheses. They fundamentally differ in their conceptualization of the visuospatial representation over which attention is directed. According to the saliency model, this representation is image-based, while the cognitive relevance framework advocates an object-based representation. Previous research has shown that (1) viewers prefer to look at objects over background and that (2) the saliency model predicts human fixation locations significantly better than chance. However, it could be that saliency mainly acts through objects. To test this hypothesis, we investigated where people fixate within real objects and saliency proto-objects. To this end, we recorded eye movements of human observers while they inspected photographs of natural scenes under different task instructions. We found a preferred viewing location (PVL) close to the center of objects within naturalistic scenes. Compared to the PVL for real objects, there was less evidence for a PVL for human fixations within saliency proto-objects. There was no evidence for a PVL when only saliency proto-objects that did not spatially overlap with annotated real objects were analyzed. The results suggest that saccade targeting and, by inference, attentional selection in scenes is object-based.
TL;DR: Strong selectivities in distinct but adjacent regions in the fusiform gyrus for only faces in one region (the FFA*) and only bodies in the other (the FBA*) are demonstrated.
Abstract: Recent reports of a high response to bodies in the fusiform face area (FFA) challenge the idea that the FFA is exclusively selective for face stimuli. We examined this claim by conducting a functional magnetic resonance imaging experiment at both standard (3.125 x 3.125 x 4.0 mm) and high resolution (1.4 x 1.4 x 2.0 mm). In both experiments, regions of interest (ROIs) were defined using data from blocked localizer runs. Within each ROI, we measured the mean peak response to a variety of stimulus types in independent data from a subsequent event-related experiment. Our localizer scans identified a fusiform body area (FBA), a body-selective region reported recently by Peelen and Downing (2005) that is anatomically distinct from the extrastriate body area. The FBA overlapped with and was adjacent to the FFA in all but two participants. Selectivity of the FFA to faces and FBA to bodies was stronger for the high-resolution scans, as expected from the reduction in partial volume effects. When new ROIs were constructed for the high-resolution experiment by omitting the voxels showing overlapping selectivity for both bodies and faces in the localizer scans, the resulting FFA* ROI showed no response above control objects for body stimuli, and the FBA* ROI showed no response above control objects for face stimuli. These results demonstrate strong selectivities in distinct but adjacent regions in the fusiform gyrus for only faces in one region (the FFA*) and only bodies in the other (the FBA*).
TL;DR: In this article, an individual-differences analysis revealed a significant correlation between posterior parietal cortex (PPC) activity and individuals' VSTM storage capacity, and a region-of-interest analysis indicated that other brain regions, particularly visual occipital cortex, may contribute to individual differences in the capacity of visual short-term memory.
Abstract: Humans show a severe capacity limit in the number of objects they can store in visual short-term memory (VSTM). We recently demonstrated with functional magnetic resonance imaging that VSTM storage capacity estimated in averaged group data correlated strongly with posterior parietal/superior occipital cortex activity (Todd & Marois, 2004). However, individuals varied widely in their VSTM capacity. Here, we examined the neural basis of these individual differences. A voxelwise, individual-differences analysis revealed a significant correlation between posterior parietal cortex (PPC) activity and individuals’ VSTM storage capacity. In addition, a region-of-interest analysis indicated that other brain regions, particularly visual occipital cortex, may contribute to individual differences in VSTM capacity. Thus, although not ruling out contributions from other brain regions, the individual-differences approach supports a key role for the PPC in VSTM by demonstrating that its activity level predicts individual differences in VSTM storage capacity.
TL;DR: It is suggested that pre-attentive estimation mechanisms work at all ranges, but in the subitizing range, attentive mechanisms also come into play.
Abstract: The numerosity of small numbers of objects, up to about four, can be rapidly appraised without error, a phenomenon known as subitizing. Larger numbers can either be counted, accurately but slowly, or estimated, rapidly but with errors. There has been some debate as to whether subitizing uses the same or different mechanisms than those of higher numerical ranges and whether it requires attentional resources. We measure subjects' accuracy and precision in making rapid judgments of numerosity for target numbers spanning the subitizing and estimation ranges while manipulating the attentional load, both with a spatial dual task and the "attentional blink" dual-task paradigm. The results of both attentional manipulations were similar. In the high-load attentional condition, Weber fractions were similar in the subitizing (2-4) and estimation (5-7) ranges (10-15%). In the low-load and single-task condition, Weber fractions substantially improved in the subitizing range, becoming nearly error-free, while the estimation range was relatively unaffected. The results show that the mechanisms operating over the subitizing and estimation ranges are not identical. We suggest that pre-attentive estimation mechanisms work at all ranges, but in the subitizing range, attentive mechanisms also come into play.
TL;DR: A set of novel abstract, visually diverse images yielded robust and consistent visual preferences, and yet abstract images yielded much lower across-observer agreement in preferences than did real-world images, suggesting that visual preferences are typically driven by the semantic content of stimuli.
Abstract: How individual are visual preferences? For real-world scenes, there is high agreement in observers' preference ratings. This could be driven by visual attributes of the images but also by non-visual associations, since those are common to most individuals. To investigate this, we developed a set of novel abstract, visually diverse images. At the individual observer level both abstract and real-world images yielded robust and consistent visual preferences, and yet abstract images yielded much lower across-observer agreement in preferences than did real-world images. This suggests that visual preferences are typically driven by the semantic content of stimuli, and that shared semantic interpretations then lead to shared preferences. Further experiments showed that highly individual preferences can nevertheless emerge also for real-world scenes, in contexts which de-emphasize their semantic associations.
TL;DR: A significant attentional reduction of the critical distance is found (i.e., the target-flankers distance at which the flankers no longer interfere with target identification) which suggests that attention reduces the spatial extent of crowding.
Abstract: The identification of a peripheral target surrounded by flankers is often harder than the identification of an identical isolated target. This study examined whether this crowding phenomenon, and particularly its spatial extent, is affected by the allocation of spatial attention to the target location. We measured orientation identification of a rotated T with and without flankers. The distance between the target and the flankers and their eccentricity varied systematically. We manipulated attention via peripheral precues: in the cued condition, a dot indicated the target location prior to its onset. In the neutral condition, a central disk conveyed no information regarding the target location (Experiments 1-2), and in the invalid condition (Experiment 3), an invalid cue attracted attention to a nontarget location. We found, across all experiments, at all eccentricities, a significant attentional enhancement of identification accuracy. Most importantly, we found a significant attentional reduction of the critical distance (i.e., the target-flankers distance at which the flankers no longer interfere with target identification). These attentional effects were found regardless of the presence or absence of a backward mask and whether the attentional cue was informative or not. These findings suggest that attention reduces the spatial extent of crowding.
TL;DR: This work manipulated independently the specificity of the search target template and the usefulness of contextual constraint in an object search task to investigate how the visual system combines multiple types of top-down information to facilitate search.
Abstract: Eye movements can be guided by various types of information in real-world scenes. Here we investigated how the visual system combines multiple types of top-down information to facilitate search. We manipulated independently the specificity of the search target template and the usefulness of contextual constraint in an object search task. An eye tracker was used to segment search time into three behaviorally defined epochs so that influences on specific search processes could be identified. The results support previous studies indicating that the availability of either a specific target template or scene context facilitates search. The results also show that target template and contextual constraints combine additively in facilitating search. The results extend recent eye guidance models by suggesting the manner in which our visual system utilizes multiple types of top-down information.
TL;DR: While the quantity of stored objects was largely unaffected by increasing the number of features, the precision of these representations dramatically decreased, and this selective deterioration in object precision depended on the multiple features being contained within the same objects.
Abstract: An influential theory suggests that integrated objects, rather than individual features, are the fundamental units that limit our capacity to temporarily store visual information (S. J. Luck & E. K. Vogel, 1997). Using a paradigm that independently estimates the number and precision of items stored in working memory (W. Zhang & S. J. Luck, 2008), here we show that the storage of features is not cost-free. The precision and number of objects held in working memory were estimated when observers had to remember either the color, the orientation, or both the color and orientation of simple objects. We found that while the quantity of stored objects was largely unaffected by increasing the number of features, the precision of these representations dramatically decreased. Moreover, this selective deterioration in object precision depended on the multiple features being contained within the same objects. Such fidelity costs were even observed with change detection paradigms when those paradigms placed demands on the precision of the stored visual representations. Taken together, these findings not only demonstrate that the maintenance of integrated features is costly; they also suggest that objects and features affect visual working memory capacity differently.
TL;DR: Targeted high-resolution fMRI measurements of the lateral cortex and multivoxel pattern analysis show that the response to seven categories of dynamic facial expressions can be decoded in both the posterior and anterior superior temporal sulcus, suggesting that distributed representations in the pSTS could underlie the perception of facial expressions.
Abstract: Previous research on the superior temporal sulcus (STS) has shown that it responds more to facial expressions than to neutral faces. Here, we extend our understanding of the STS in two ways. First, using targeted high-resolution fMRI measurements of the lateral cortex and multivoxel pattern analysis, we show that the response to seven categories of dynamic facial expressions can be decoded in both the posterior STS (pSTS) and anterior STS (aSTS). We were also able to decode patterns corresponding to these expressions in the frontal operculum (FO), a structure that has also been shown to respond to facial expressions. Second, we measured the similarity structure of these representations and found that the similarity structure in the pSTS significantly correlated with the perceptual similarity structure of the expressions. This was the case regardless of whether we used pattern classification or more traditional correlation techniques to extract the neural similarity structure. These results suggest that distributed representations in the pSTS could underlie the perception of facial expressions.
TL;DR: It is found that dilation was influenced by, but not dependent on, the requirement of a button press, and interestingly, dilation occurred when viewers fixated a target but did not report seeing it.
Abstract: It has long been documented that emotional and sensory events elicit a pupillary dilation. Is the pupil response a reliable marker of a visual detection event while viewing complex imagery? In two experiments where viewers were asked to report the presence of a visual target during rapid serial visual presentation (RSVP), pupil dilation was significantly associated with target detection. The amplitude of the dilation depended on the frequency of targets and the time of target presentation relative to the start of the trial. Larger dilations were associated with trials having fewer targets and with targets viewed earlier in the run. We found that dilation was influenced by, but not dependent on, the requirement of a button press. Interestingly, we also found that dilation occurred when viewers fixated a target but did not report seeing it. We will briefly discuss the role of noradrenaline in mediating these pupil behaviors.
TL;DR: Inverted same-race faces led to greater recognition impairment and elicited larger N170 amplitudes than inverted other-race faces, indicating a finer-grained neural tuning for same-race faces at early stages of processing in both groups of observers.
Abstract: Human beings are natural experts at processing faces, with some notable exceptions. Same-race faces are better recognized than other-race faces: the so-called other-race effect (ORE). Inverting faces impairs recognition more than for any other inverted visual object: the so-called face inversion effect (FIE). Interestingly, the FIE is stronger for same- compared to other-race faces. At the electrophysiological level, inverted faces elicit consistently delayed and often larger N170 compared to upright faces. However, whether the N170 component is sensitive to race is still a matter of ongoing debate. Here we investigated the N170 sensitivity to race in the framework of the FIE. We recorded EEG from Western Caucasian and East Asian observers while they were presented with Western Caucasian, East Asian and African American faces in upright and inverted orientations. To control for potential confounds in the EEG signal that might be evoked by the intrinsic and salient differences in the low-level properties of faces from different races, we normalized their amplitude-spectra, luminance and contrast. No differences on the N170 were observed for upright faces. Critically, inverted same-race faces led to greater recognition impairment and elicited larger N170 amplitudes compared to inverted other-race faces. Our results indicate a finer-grained neural tuning for same-race faces at early stages of processing in both groups of observers.
Abstract: Oscillations are ubiquitous in electrical recordings of brain activity. While the amplitude of ongoing oscillatory activity is known to correlate with various aspects of perception, the influence of oscillatory phase on perception remains unknown. In particular, since phase varies on a much faster timescale than the more sluggish amplitude fluctuations, phase effects could reveal the fine-grained neural mechanisms underlying perception. We presented brief flashes of light at the individual luminance threshold while EEG was recorded. Although the stimulus on each trial was identical, subjects detected approximately half of the flashes (hits) and entirely missed the other half (misses). Phase distributions across trials were compared between hits and misses. We found that shortly before stimulus onset, each of the two distributions exhibited significant phase concentration, but at different phase angles. This effect was strongest in the theta and alpha frequency bands. In this time-frequency range, oscillatory phase accounted for at least 16% of variability in detection performance and allowed the prediction of performance on the single-trial level. This finding indicates that the visual detection threshold fluctuates over time along with the phase of ongoing EEG activity. The results support the notion that ongoing oscillations shape our perception, possibly by providing a temporal reference frame for neural codes that rely on precise spike timing.
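The notion of phase concentration at different angles for hits and misses can be illustrated with a mean resultant vector, a common circular statistic. The abstract does not name the exact measure the authors used, so this is only a sketch under that assumption, and the phase values below are made up.

```python
import cmath
import math

def phase_concentration(phases):
    """Length and angle of the mean resultant vector of phases (radians).

    Length near 1 means phases cluster tightly across trials;
    length near 0 means they are spread around the circle.
    """
    mean_vec = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(mean_vec), cmath.phase(mean_vec)

# Made-up pre-stimulus phases: hits cluster near 0, misses near pi.
hits = [-0.2, 0.0, 0.2]
misses = [math.pi - 0.2, math.pi, math.pi + 0.2]

r_hits, angle_hits = phase_concentration(hits)
r_misses, angle_misses = phase_concentration(misses)
r_pooled, _ = phase_concentration(hits + misses)
```

In this toy case each group is strongly concentrated while the pooled trials are not, which is the pattern one would expect if detection depends on the phase at which the flash arrives.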
TL;DR: This article proposes an ecological valence theory of human color preference, and an empirical test provides strong support for it: people like colors strongly associated with objects they like (e.g., blues with clear skies and clean water) and dislike colors strongly associated with objects they dislike (e.g., browns with feces and rotten food).
Abstract: Color preference is an important aspect of visual experience, but little is known about why people in general like some colors more than others. Previous research suggested explanations based on biological adaptations [Hurlbert AC, Ling YL (2007) Curr Biol 17:623–625] and color-emotions [Ou L-C, Luo MR, Woodcock A, Wright A (2004) Color Res Appl 29:381–389]. In this article we articulate an ecological valence theory in which color preferences arise from people’s average affective responses to color-associated objects. An empirical test provides strong support for this theory: People like colors strongly associated with objects they like (e.g., blues with clear skies and clean water) and dislike colors strongly associated with objects they dislike (e.g., browns with feces and rotten food). Relative to alternative theories, the ecological valence theory both fits the data better (even with fewer free parameters) and provides a more plausible, comprehensive causal explanation of color preferences.
TL;DR: fMRI reveals a prolonged development of face selectivity in the ventral temporal cortex: the volume of face-selective activation in the right fusiform gyrus was larger in adults than in adolescents, and right FFA size correlated with face recognition memory proficiency.
Abstract: The ventral temporal cortex (VTC) in humans includes functionally defined regions that preferentially respond to objects, faces, and places. Recent developmental studies suggest that the face selective region in the fusiform gyrus (‘fusiform face area’, FFA) undergoes a prolonged development involving substantial increases in its volume after 7 years of age. However, the endpoint of this development is not known. Here we used functional magnetic resonance imaging (fMRI) to examine the development of face-, object- and place selective regions in the VTC of adolescents (12–16 year olds) and adults (18–40 year olds). We found that the volume of face selective activations in the right fusiform gyrus was substantially larger in adults than in adolescents, and was positively correlated with age. This development was associated with higher response amplitudes and selectivity for faces in face selective regions of VTC and increased differentiation of the distributed response patterns to faces versus non-face stimuli across the entire VTC. Furthermore, right FFA size was positively correlated with face recognition memory performance, but not with recognition memory of objects or places. In contrast, the volume of object- and place selective cortical regions or their response amplitudes did not change across these age groups. Thus, we found a striking and prolonged development of face selectivity across the VTC during adolescence that was specifically associated with proficiency in face recognition memory. These findings have important implications for theories of development and functional specialization in VTC.
TL;DR: The novel finding that perceived numerosity increases with decreasing luminance, whereas texture density does not, provides further evidence for independent processing of the two attributes, suggesting that numerosity judgments can be, and are, made independently of judgments of the density of texture.
Abstract: We have recently suggested that numerosity is a primary sensory attribute, and shown that it is strongly susceptible to adaptation. Here we use the Method of Single Stimuli to show that observers can extract a running average of numerosity of a succession of stimuli to use as a standard of comparison for subsequent stimuli. On separate sessions observers judged whether the perceived numerosity or density of a particular trial was greater or less than the average of previous stimuli. Thresholds were as precise for this task as for explicit comparisons of test with standard stimuli. Importantly, we found no evidence that numerosity judgments are mediated by density. Under all conditions, judgments of numerosity were as precise as those of density. Thresholds in intermingled conditions, where numerosity varied unpredictably with density, were as precise as the blocked thresholds. Judgments in constant-density conditions were more precise than those in variable-density conditions, and numerosity judgments in conditions of constant-numerosity showed no tendency to follow density. We further report the novel finding that perceived numerosity increases with decreasing luminance, whereas texture density does not, further evidence for independent processing of the two attributes. All these measurements suggest that numerosity judgments can be, and are, made independently of judgments of the density of texture.
TL;DR: In a trio of experiments, a matching procedure generated direct, analogue measures of short-term memory for the spatial frequency of Gabor stimuli, and it appears that these two separable attractors influence distinct processes, with perception being influenced by the non-target stimulus and memory being influenced by stimuli seen on previous trials.
Abstract: In a trio of experiments, a matching procedure generated direct, analogue measures of short-term memory for the spatial frequency of Gabor stimuli. Experiment 1 showed that when just a single Gabor was presented for study, a retention interval of just a few seconds was enough to increase the variability of matches, suggesting that noise in memory substantially exceeds that in vision. Experiment 2 revealed that when a pair of Gabors was presented on each trial, the remembered appearance of one of the Gabors was influenced by: (1) the relationship between its spatial frequency and the spatial frequency of the accompanying, task-irrelevant non-target stimulus; and (2) the average spatial frequency of Gabors seen on previous trials. These two influences, which work on very different time scales, were approximately additive in their effects, each operating as an attractor for remembered appearance. Experiment 3 showed that a timely pre-stimulus cue allowed selective attention to curtail the influence of a task-irrelevant non-target, without diminishing the impact of the stimuli seen on previous trials. It appears that these two separable attractors influence distinct processes, with perception being influenced by the non-target stimulus and memory being influenced by stimuli seen on previous trials.
TL;DR: These findings indicate that the neural events that underlie both rivalry and crowding are inaugurated at an early stage of visual processing, because both the threshold-elevation aftereffect and translational motion aftereffect arise, at least in part, from adaptation at the earliest stages of cortical processing.
Abstract: We measured visual-adaptation strength under variations in visual awareness by manipulating phenomenal invisibility of adapting stimuli using binocular rivalry and visual crowding. Results showed that the threshold-elevation aftereffect and the translational motion aftereffect were reduced substantially during binocular rivalry and crowding. Importantly, aftereffect reduction was correlated with the proportion of time that the adapting stimulus was removed from visual awareness. These findings indicate that the neural events that underlie both rivalry and crowding are inaugurated at an early stage of visual processing, because both the threshold-elevation aftereffect and translational motion aftereffect arise, at least in part, from adaptation at the earliest stages of cortical processing. Also, our findings make it necessary to reinterpret previous studies whose results were construed as psychophysical evidence against the direct role of neurons in the primary visual cortex in visual awareness.
TL;DR: As predicted, action gamers showed reduced backward masking and an accompanying training study established the causal role of action game play in this enhancement, and implications are discussed in the context of the faster reaction times and enhanced sensitivity also documented after action game play.
Abstract: Action video game play enhances basic visual skills such as crowding acuity and contrast sensitivity (C. S. Green & D. Bavelier, 2007; R. Li, U. Polat, W. Makous, & D. Bavelier, 2009). Here, we ask whether the dynamics of perception may also be altered as a result of playing action games. A backward masking paradigm was used to test the hypothesis that action video game play also alters the temporal dynamics of vision. As predicted, action gamers showed reduced backward masking and an accompanying training study established the causal role of action game play in this enhancement. Implications of this result are discussed in the context of the faster reaction times and enhanced sensitivity also documented after action game play.
TL;DR: It seems that the imprecise inertial estimate was weighed relatively more than the precise visual estimate, compared to the MLI predictions; these results concur with other findings of overweighing of inertial cues in synthetic environments.
Abstract: In the present study, we investigated whether the perception of heading of linear self-motion can be explained by Maximum Likelihood Integration (MLI) of visual and non-visual sensory cues. MLI predicts smaller variance for multisensory judgments compared to unisensory judgments. Nine participants were exposed to visual, inertial, or visual–inertial motion conditions in a moving base simulator, capable of accelerating along a horizontal linear track with variable heading. Visual random-dot motion stimuli were projected on a display with a 40° horizontal × 32° vertical field of view (FoV). All motion profiles consisted of a raised cosine bell in velocity. Stimulus heading was varied between 0 and 20°. After each stimulus, participants indicated whether perceived self-motion was straight-ahead or not. We fitted cumulative normal distribution functions to the data as a psychometric model and compared this model to a nested model in which the slope of the multisensory condition was subject to the MLI hypothesis. Based on likelihood ratio tests, the MLI model had to be rejected. It seems that the imprecise inertial estimate was weighed relatively more than the precise visual estimate, compared to the MLI predictions. Possibly, this can be attributed to low realism of the visual stimulus. The present results concur with other findings of overweighing of inertial cues in synthetic environments.
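The MLI prediction tested in this abstract can be sketched with the standard maximum-likelihood cue-combination formulas for two independent Gaussian estimates: each cue is weighted by its inverse variance, and the combined variance is smaller than either single-cue variance. The function name and the example standard deviations below are hypothetical and are not taken from the study.

```python
import math


def mli_prediction(sigma_visual, sigma_inertial):
    """Predicted weights and combined standard deviation under MLI,
    assuming independent Gaussian noise on each cue."""
    var_v = sigma_visual ** 2
    var_i = sigma_inertial ** 2
    # Each cue is weighted in proportion to its reliability (inverse variance).
    w_visual = var_i / (var_v + var_i)
    w_inertial = var_v / (var_v + var_i)
    # The combined variance is always below the smaller single-cue variance.
    sigma_combined = math.sqrt(var_v * var_i / (var_v + var_i))
    return w_visual, w_inertial, sigma_combined


# Hypothetical values: a precise visual cue and an imprecise inertial cue (degrees).
w_v, w_i, sigma_c = mli_prediction(2.0, 6.0)
print(w_v, w_i, sigma_c)
```

Under these assumed values MLI would give the inertial cue only a small weight. The abstract reports that observers weighted the inertial estimate more than such a prediction allows, which is why the MLI model was rejected.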
TL;DR: Size of the visual span was highly correlated with changes in reading speed for both lowercase and uppercase letters and for both RSVP and flashcard reading, consistent with the view that slower reading of vertical text is due to a decrease in the size of the visual span for vertical reading.
Abstract: There are three formats for arranging English text for vertical reading: upright letters arranged vertically (marquee), and horizontal text rotated 90° clockwise or counterclockwise. Previous research has shown that reading is slower for all three vertical formats than for horizontal text, with marquee being slowest (M. D. Byrne, 2002). It has been proposed that the size of the visual span (the number of letters recognized with high accuracy without moving the eyes) is a visual factor limiting reading speed. We predicted that reduced visual-span size would be correlated with the slower reading for the three vertical formats. We tested this prediction with uppercase and lowercase letters. Reading performance was measured using two presentation methods: RSVP (Rapid Serial Visual Presentation) and flashcard (a block of text on four lines). On average, reading speed for horizontal text was 139% faster than marquee text and 81% faster than the rotated texts. Size of the visual span was highly correlated with changes in reading speed for both lowercase and uppercase letters and for both RSVP and flashcard reading. Our results are consistent with the view that slower reading of vertical text is due to a decrease in the size of the visual span for vertical reading. Keywords: visual span, letter recognition, reading speed, vertical reading. Citation: Yu, D., Park, H., Gerold, D., & Legge, G. E. (2010). Comparing reading speed for horizontal and vertical English text. Journal of Vision, 10(2):21, 1
TL;DR: The object working memory subsystem--not the spatial working memory subsystem--provides the buffer in which object representations are stored while they undergo mental rotation, and the nature of the information being stored may determine which subsystem stores the information.
Abstract: In mental rotation, a mental representation of an object must be rotated while the actual object remains visible. Where is this representation stored while it is being rotated? To answer this question, observers were asked to perform a mental rotation task during the delay interval of a visual working memory task. When the working memory task required the storage of object features, substantial bidirectional interference was observed between the memory and rotation tasks, and the interference increased with the degree of rotation. However, rotation-dependent interference was not observed when a spatial working memory task was used instead of an object working memory task. Thus, the object working memory subsystem, not the spatial working memory subsystem, provides the buffer in which object representations are stored while they undergo mental rotation. More broadly, the nature of the information being stored, not the nature of the operations performed on this information, may determine which subsystem stores the information.
TL;DR: High-resolution analysis showed that SFM objects and line drawings were processed in separate but adjacent sub-regions in SLO, suggesting that SLO codes object shape but retains topographic segregation based on shape cues.
Abstract: Shape and motion are complementary visual features and each appears to be processed in unique cortical areas. However, object motion is a powerful cue for the perception of three-dimensional (3-D) shape, implying that the two types of information, motion and form, are well integrated. We conducted a series of fMRI experiments aimed at identifying the brain regions involved in inferring 3-D shape from motion cues. For each subject, we identified regions in occipital–temporal cortex that were activated when perceiving: (i) motion in unstructured random-dot patterns, (ii) 2-D and 3-D line drawing shapes, and (iii) 3-D shapes defined by motion cues (shape-from-motion, SFM). We found closely adjacent areas in the lateral occipital region activated by random motion and line-drawing shapes. In addition, we found that the SFM stimuli produced a greater MRI signal in only one of the areas identified with the random motion and line-drawing stimuli: the superior lateral occipital (SLO) region. High-resolution analysis showed that SFM objects and line drawings were processed in separate but adjacent sub-regions in SLO, suggesting that SLO codes object shape but retains topographic segregation based on shape cues. Expanding the analysis to the entire cortex identified a parietal area that had overlapping activation for both SFM and line drawings and increased MRI signal for 3-D versus 2-D shapes, suggesting this area is important for processing shape information.