Top 697 papers published in the topic of Visual perception in 2021

Showing papers on "Visual perception" published in 2021

Journal Article•10.1038/S41586-020-03171-X•

Survey of spiking in the mouse visual system reveals functional hierarchy

Joshua H. Siegle¹, Xiaoxuan Jia¹, Séverine Durand¹, Sam Gale¹, Corbett Bennett¹, Nile Graddis¹, Greggory Heller¹, Tamina K. Ramirez¹, Hannah Choi¹,², Jennifer Luviano¹, Peter A. Groblewski¹, Ruweida Ahmed¹, Anton Arkhipov¹, Amy Bernard¹, Yazan N. Billeh¹, Dillan Brown¹, Michael A. Buice¹, Nicolas Cain¹, Shiella Caldejon¹, Linzy Casal¹, Andrew Cho¹, Maggie Chvilicek¹, Timothy C. Cox³, Kael Dai¹, Daniel J. Denman¹,⁴, Saskia E. J. de Vries¹, Roald Dietzman¹, Luke Esposito¹, Colin Farrell¹, David Feng¹, John Galbraith¹, Marina Garrett¹, Emily Gelfand¹, Nicole Hancock¹, Julie A. Harris¹, Robert Howard¹, Brian Hu¹, Ross Hytnen¹, Ramakrishnan Iyer¹, Erika Jessett¹, Katelyn Johnson¹, India Kato¹, Justin T. Kiggins¹, Sophie Lambert¹, Jérôme Lecoq¹, Peter Ledochowitsch¹, Jung Hoon Lee¹, Arielle Leon¹, Yang Li¹, Elizabeth Liang¹, Fuhui Long¹, Kyla Mace¹, Jose Melchior¹, Daniel Millman¹, Tyler Mollenkopf¹, Chelsea Nayan¹, Lydia Ng¹, Kiet Ngo¹, Thuyahn Nguyen¹, Philip R. Nicovich¹, Kat North¹, Gabriel Koch Ocker¹, Douglas R. Ollerenshaw¹, Michael Oliver¹, Marius Pachitariu, Jed Perkins¹, Melissa Reding¹, David Reid¹, Miranda Robertson¹, Kara Ronellenfitch¹, Sam Seid¹, Cliff Slaughterbeck¹, Michelle Stoecklin¹, David Sullivan¹, Ben Sutton¹, Jackie Swapp¹, Carol L. Thompson¹, Kristen Turner¹, Wayne Wakeman¹, Jennifer D. Whitesell¹, Derric Williams¹, Ali Williford¹, R.D. Young¹, Hongkui Zeng¹, Sarah A. Naylor¹, John W. Phillips¹, R. Clay Reid¹, Stefan Mihalas¹, Shawn R. Olsen¹, Christof Koch¹•Institutions (4)

Allen Institute for Brain Science¹, University of Washington², University of Missouri–Kansas City³, University of Colorado Denver⁴

20 Jan 2021-Nature

TL;DR: In this paper, a large-scale dataset of tens of thousands of units in six cortical and two thalamic regions in the brains of mice responding to a battery of visual stimuli is presented.

Abstract: The anatomy of the mammalian visual system, from the retina to the neocortex, is organized hierarchically [1]. However, direct observation of cellular-level functional interactions across this hierarchy is lacking due to the challenge of simultaneously recording activity across numerous regions. Here we describe a large, open dataset (part of the Allen Brain Observatory [2]) that surveys spiking from tens of thousands of units in six cortical and two thalamic regions in the brains of mice responding to a battery of visual stimuli. Using cross-correlation analysis, we reveal that the organization of inter-area functional connectivity during visual stimulation mirrors the anatomical hierarchy from the Allen Mouse Brain Connectivity Atlas [3]. We find that four classical hierarchical measures (response latency, receptive-field size, phase-locking to drifting gratings and response decay timescale) are all correlated with the hierarchy. Moreover, recordings obtained during a visual task reveal that the correlation between neural activity and behavioural choice also increases along the hierarchy. Our study provides a foundation for understanding coding and signal propagation across hierarchically organized cortical and thalamic visual areas.

446 citations

Journal Article•10.1038/S41586-021-03390-W•

Shared mechanisms underlie the control of working memory and attention

Matthew F. Panichello¹, Timothy J. Buschman¹•Institutions (1)

Princeton University¹

31 Mar 2021-Nature

TL;DR: This article showed that prefrontal cortex acts as a domain-general controller for both attention and selection in rhesus monkeys, and that attention facilitated behavior by enhancing and transforming the representation of the selected memory or attended stimulus.

Abstract: Cognitive control guides behaviour by controlling what, when, and how information is represented in the brain [1]. For example, attention controls sensory processing; top-down signals from prefrontal and parietal cortex strengthen the representation of task-relevant stimuli [2-4]. A similar 'selection' mechanism is thought to control the representations held 'in mind' (in working memory) [5-10]. Here we show that shared neural mechanisms underlie the selection of items from working memory and attention to sensory stimuli. We trained rhesus monkeys to switch between two tasks, either selecting one item from a set of items held in working memory or attending to one stimulus from a set of visual stimuli. Neural recordings showed that similar representations in prefrontal cortex encoded the control of both selection and attention, suggesting that prefrontal cortex acts as a domain-general controller. By contrast, both attention and selection were represented independently in parietal and visual cortex. Both selection and attention facilitated behaviour by enhancing and transforming the representation of the selected memory or attended stimulus. Specifically, during the selection task, memory items were initially represented in independent subspaces of neural activity in prefrontal cortex. Selecting an item caused its representation to transform from its own subspace to a new subspace used to guide behaviour. A similar transformation occurred for attention. Our results suggest that prefrontal cortex controls cognition by dynamically transforming representations to control what and when cognitive computations are engaged.

282 citations

Journal Article•10.1109/TII.2020.2998818•

Visual Perception Enabled Industry Intelligence: State of the Art, Challenges and Prospects

Jiachen Yang¹, Chenguang Wang¹, Bin Jiang¹, Houbing Song², Qinggang Meng³•Institutions (3)

Tianjin University¹, Embry-Riddle Aeronautical University, Daytona Beach², Loughborough University³

01 Mar 2021-IEEE Transactions on Industrial Informatics

TL;DR: The previous research and application of visual perception in different industrial fields such as product surface defect detection, intelligent agricultural production, intelligent driving, image synthesis, and event reconstruction are reviewed.

Abstract: Visual perception refers to the process of organizing, identifying, and interpreting visual information in environmental awareness and understanding. With the rapid progress of multimedia acquisition technology, research on visual perception has been a hot topic in academia and in industrial applications. Especially after the introduction of artificial intelligence theory, intelligent visual perception has been widely used to promote the development of industrial production towards intelligence. In this article, we review the previous research and application of visual perception in different industrial fields such as product surface defect detection, intelligent agricultural production, intelligent driving, image synthesis, and event reconstruction. The applications cover most of the intelligent visual perception processing technologies. This survey provides a comprehensive reference for research in this direction. Finally, this article also summarizes the current challenges of visual perception and predicts its future development trends.

219 citations

Proceedings Article•10.1109/CVPR46437.2021.00192•

Towards Long-Form Video Understanding

Chao-Yuan Wu¹, Philipp Krähenbühl¹•Institutions (1)

University of Texas at Austin¹

21 Jun 2021

TL;DR: In this article, the authors introduce a framework for modeling long-form videos, develop evaluation protocols on large-scale datasets, and show that existing state-of-the-art short-term models are limited for long-form tasks.

Abstract: Our world offers a never-ending stream of visual stimuli, yet today’s vision systems only accurately recognize patterns within a few seconds. These systems understand the present, but fail to contextualize it in past or future events. In this paper, we study long-form video understanding. We introduce a framework for modeling long-form videos and develop evaluation protocols on large-scale datasets. We show that existing state-of-the-art short-term models are limited for long-form tasks. A novel object-centric transformer-based video recognition architecture performs significantly better on 7 diverse tasks. It also outperforms comparable state-of-the-art on the AVA dataset.

173 citations

Journal Article•10.1080/14992027.2020.1851401•

Impacts of face coverings on communication: an indirect impact of COVID-19.

Gabrielle H. Saunders¹, Iain Jackson¹, Anisa Visram¹•Institutions (1)

RMIT University¹

01 Jul 2021-International Journal of Audiology

TL;DR: The study set out to understand the impact of face coverings on hearing and communication; its findings illustrate the need for communication-friendly face coverings and emphasise the need to be communication-aware when wearing one.

Abstract: To understand the impact of face coverings on hearing and communication. An online survey consisting of closed-set and open-ended questions distributed within the UK to gain insights into experienc...

170 citations

Proceedings Article•10.1109/CVPR46437.2021.01140•

ArtEmis: Affective Language for Visual Art

Panos Achlioptas¹, Maks Ovsjanikov², Kilichbek Haydarov³, Mohamed Elhoseiny³, Leonidas J. Guibas¹•Institutions (3)

Stanford University¹, École Polytechnique², King Abdullah University of Science and Technology³

19 Jan 2021

TL;DR: ArtEmis is a large-scale dataset with accompanying machine learning models aimed at providing a detailed understanding of the interplay between visual content, its emotional effect, and explanations for the latter in language.

Abstract: We present a novel large-scale dataset and accompanying machine learning models aimed at providing a detailed understanding of the interplay between visual content, its emotional effect, and explanations for the latter in language. In contrast to most existing annotation datasets in computer vision, we focus on the affective experience triggered by visual artworks and ask the annotators to indicate the dominant emotion they feel for a given image and, crucially, to also provide a grounded verbal explanation for their emotion choice. As we demonstrate below, this leads to a rich set of signals for both the objective content and the affective impact of an image, creating associations with abstract concepts (e.g., "freedom" or "love"), or references that go beyond what is directly visible, including visual similes and metaphors, or subjective references to personal experiences. We focus on visual art (e.g., paintings, artistic photographs) as it is a prime example of imagery created to elicit emotional responses from its viewers. Our dataset, termed ArtEmis, contains 455K emotion attributions and explanations from humans, on 80K artworks from WikiArt. Building on this data, we train and demonstrate a series of captioning systems capable of expressing and explaining emotions from visual stimuli. Remarkably, the captions produced by these systems often succeed in reflecting the semantic and abstract content of the image, going well beyond systems trained on existing datasets. The collected dataset and developed methods are available at https://artemisdataset.org.

147 citations

Journal Article•10.1016/J.CUB.2021.07.062•

Representational drift in the mouse visual cortex.

Daniel Deitch¹, Alon Rubin¹, Yaniv Ziv¹•Institutions (1)

Weizmann Institute of Science¹

24 Aug 2021-Current Biology

TL;DR: In this article, the authors analyzed large-scale optical and electrophysiological recordings from six visual cortical areas in behaving mice that were repeatedly presented with the same natural movies and found representational drift over timescales spanning minutes to days across multiple visual areas, cortical layers, and cell types.

144 citations

Journal Article•10.1016/J.NEURON.2021.01.013•

Unraveling circuits of visual perception and cognition through the superior colliculus

Michele A. Basso¹, Martha E. Bickford², Jianhua Cang³•Institutions (3)

Semel Institute for Neuroscience and Human Behavior¹, University of Louisville², University of Virginia³

17 Mar 2021-Neuron

TL;DR: The superior colliculus is a conserved sensorimotor structure that integrates visual and other sensory information to drive reflexive behaviors, and the evidence for this is strong and compelling.

132 citations

Journal Article•10.1109/TCYB.2020.3035613•

Lightweight Salient Object Detection via Hierarchical Visual Perception Learning

Yun Liu¹, Yu-Chao Gu¹, Xin-Yu Zhang¹, Weiwei Wang¹, Ming-Ming Cheng¹•Institutions (1)

Nankai University¹

01 Sep 2021-IEEE Transactions on Systems, Man, and Cybernetics

TL;DR: A hierarchical visual perception (HVP) module is proposed to imitate the primate visual cortex for hierarchical perception learning, and a lightweight salient object detection (SOD) network, HVPNet, is designed around it.

Abstract: Recently, salient object detection (SOD) has witnessed vast progress with the rapid development of convolutional neural networks (CNNs). However, the improvement of SOD accuracy comes with the increase in network depth and width, resulting in large network size and heavy computational overhead. This prevents state-of-the-art SOD methods from being deployed into practical platforms, especially mobile devices. To promote the deployment of real-world SOD applications, we aim at developing a lightweight SOD model in this article. Our starting point is the observation that the primate visual system processes visual signals hierarchically with different receptive fields and eccentricities in different visual cortex areas. Inspired by this, we propose a hierarchical visual perception (HVP) module to imitate the primate visual cortex for hierarchical perception learning. With the HVP module incorporated, we design a lightweight SOD network, namely, HVPNet. Extensive experiments on popular benchmarks demonstrate that HVPNet achieves highly competitive accuracy compared with state-of-the-art SOD methods while running at 4.3 frames/s CPU speed and 333.2 frames/s GPU speed with only 1.23M parameters.

122 citations

Book•

An Introduction to Cognitive Psychology: Processes and Disorders

David H. Groome¹•Institutions (1)

University of Westminster¹

26 Aug 2021

TL;DR: This book discusses the Role of Attention in Perception, the Nature and Function of Memory, and Theories of Cognition: From Metaphors to Computational Models.

Abstract: Introduction to Cognitive Psychology. Cognitive Processes. Experimental Psychology. Computer Models of Information Processing. Cognitive Neuropsychology. Minds, Brains and Computers. Perception and Attention. The Biological Bases of Perception. Psychological Approaches to Visual Perception. Visual Illusions. Marr's Theory. Object Recognition Processes. Perception: A Summary. Attention. The Role of Attention in Perception. Automaticity. The Spotlight Model of Visual Attention. Visual Attention. Perception, Attention and Consciousness. Disorders of Perception and Attention. Introduction. Blindsight. Unilateral Spatial Neglect. Visual Agnosia. Disorders of Face Processing - Prosopagnosia and Related Conditions. Memory. The Nature and Function of Memory. Multistore Models and Working Memory. Ebbinghaus and the First Long-term Memory Experiments. The Role of Knowledge, Meaning, and Schemas in Memory. Input Processing and Encoding. Retrieval Cues and Feature Overlap. Retrieval Mechanisms in Recall and Recognition. Automatic and Controlled Memory Processes. Memory in Real Life. Disorders of Memory. The Tragic Effects of Amnesia. The Causes of Organic Amnesia. Short-term and Long-term Memory Impairments. Anterograde and Retrograde Amnesia. Memory Functions Preserved in Amnesia. Other Types of Amnesia. Thinking, Problem-solving and Reasoning. Introduction. Early Research on Problem-solving. Problem-space Theory of Problem-solving. Problem-solving and Knowledge. Deductive and Inductive Reasoning. Statistical Reasoning. Everyday Reasoning. Disorders of Thinking. Executive Function and the Frontal Lobes. Introduction. The Frontal Lobes. Problem-solving and Reasoning Deficits. The Executive Functions of the Frontal Lobes. Language. Introduction. The Language System. Psychology and Linguistics. Recognising Spoken and Written Words. Production of Spoken Words. Sentence Comprehension. Sentence Production. Discourse Level. Disorders of Language. Introduction. Historical Perspective. The Psycholinguistic. Disruptions to Language Processing at Word Level. Disruption to Processing of Syntax. Disruption to Processing of Discourse. Theories of Cognition: From Metaphors to Computational Models. Symbol-based Systems. Connectionist Systems. Symbols and Neurons Compared.

122 citations

Journal Article•10.1038/S41467-021-22244-7•

Limits to visual representational correspondence between convolutional neural networks and the human brain.

Yaoda Xu¹, Maryam Vaziri-Pashkam•Institutions (1)

Yale University¹

06 Apr 2021-Nature Communications

TL;DR: In this paper, the authors evaluate the performance of 14 different CNNs compared with human fMRI responses to natural and artificial images using representational similarity analysis, and show that CNNs do not fully capture higher level visual representations of real-world objects, nor those of artificial objects, either at lower or higher levels of visual representations.

Abstract: Convolutional neural networks (CNNs) are increasingly used to model human vision due to their high object categorization capabilities and general correspondence with human brain responses. Here we evaluate the performance of 14 different CNNs compared with human fMRI responses to natural and artificial images using representational similarity analysis. Despite the presence of some CNN-brain correspondence and CNNs’ impressive ability to fully capture lower level visual representation of real-world objects, we show that CNNs do not fully capture higher level visual representations of real-world objects, nor those of artificial objects, either at lower or higher levels of visual representations. The latter is particularly critical, as the processing of both real-world and artificial visual stimuli engages the same neural circuits. We report similar results regardless of differences in CNN architecture, training, or the presence of recurrent processing. This indicates some fundamental differences exist in how the brain and CNNs represent visual information. Convolutional neural networks are increasingly used to model human vision. Here, the authors compare the performance of 14 different CNNs and human fMRI responses to real-world and artificial objects to show some fundamental differences exist between them.

Journal Article•10.1021/ACSNANO.1C04676•

Artificial Visual Perception Nervous System Based on Low-Dimensional Material Photoelectric Memristors.

Yifei Pei¹, Lei Yan¹, Zuheng Wu², Jikai Lu³, Jianhui Zhao¹, Jingsheng Chen⁴, Qi Liu⁵, Xiaobing Yan¹•Institutions (5)

Hebei University¹, Anhui University², Chinese Academy of Sciences³, National University of Singapore⁴, Fudan University⁵

20 Sep 2021-ACS Nano

TL;DR: In this article, the authors proposed a fully memristor-based artificial visual perception nervous system (AVPNS) which consists of a quantum-dot-based photoelectric memristor and a nanosheet-based threshold-switching (TS) memristor.

Abstract: The visual perception system is the most important system for human learning since it receives over 80% of the learning information from the outside world. With the exponential growth of artificial intelligence technology, there is a pressing need for high-energy and area-efficiency visual perception systems capable of processing efficiently the received natural information. Currently, memristors with their elaborate dynamics, excellent scalability, and information (e.g., visual, pressure, sound, etc.) perception ability exhibit tremendous potential for the application of visual perception. Here, we propose a fully memristor-based artificial visual perception nervous system (AVPNS) which consists of a quantum-dot-based photoelectric memristor and a nanosheet-based threshold-switching (TS) memristor. We use a photoelectric and a TS memristor to implement the synapse and leaky integrate-and-fire (LIF) neuron functions, respectively. With the proposed AVPNS we successfully demonstrate the biological image perception, integration and fire, as well as the biosensitization process. Furthermore, the self-regulation process of a speed meeting control system in driverless automobiles can be accurately and conceptually emulated by this system. Our work shows that the functions of the biological visual nervous system may be systematically emulated by a memristor-based hardware system, thus expanding the spectrum of memristor applications in artificial intelligence.

Journal Article•10.1093/NSR/NWAA172•

Networking retinomorphic sensor with memristive crossbar for brain-inspired visual perception.

Shuang Wang¹, Chenyu Wang¹, Pengfei Wang¹, Cong Wang¹, Zhuan Li¹, Chen Pan¹, Yitong Dai¹, Anyuan Gao¹, Chuan Liu¹, Jian Liu¹, Huafeng Yang¹, Xiaowei Liu¹, Bin Cheng¹, Kunji Chen¹, Zhenlin Wang¹, Kenji Watanabe², Takashi Taniguchi², Shi-Jun Liang¹, Feng Miao¹•Institutions (2)

Nanjing University¹, National Institute for Materials Science²

10 Feb 2021-National Science Review

TL;DR: A prototype neuromorphic vision system is proposed and demonstrated by networking a retinomorphic sensor with a memristive crossbar that allows for fast letter recognition and object tracking and indicates the capabilities of image sensing, processing and recognition in the full analog regime.

Abstract: Compared to human vision, conventional machine vision composed of an image sensor and processor suffers from high latency and large power consumption due to physically separated image sensing and processing. A neuromorphic vision system with brain-inspired visual perception provides a promising solution to the problem. Here we propose and demonstrate a prototype neuromorphic vision system by networking a retinomorphic sensor with a memristive crossbar. We fabricate the retinomorphic sensor by using WSe₂/h-BN/Al₂O₃ van der Waals heterostructures with gate-tunable photoresponses, to closely mimic the human retinal capabilities in simultaneously sensing and processing images. We then network the sensor with a large-scale Pt/Ta/HfO₂/Ta one-transistor-one-resistor (1T1R) memristive crossbar, which plays a similar role to the visual cortex in the human brain. The realized neuromorphic vision system allows for fast letter recognition and object tracking, indicating the capabilities of image sensing, processing and recognition in the full analog regime. Our work suggests that such a neuromorphic vision system may open up unprecedented opportunities in future visual perception applications.

Journal Article•10.1073/PNAS.2011417118•

An ecologically motivated image dataset for deep learning yields better models of human vision

Johannes Mehrer¹, Courtney J. Spoerer¹, Emer C. Jones¹, Nikolaus Kriegeskorte², Tim C. Kietzmann¹•Institutions (2)

Cognition and Brain Sciences Unit¹, Columbia University²

23 Feb 2021-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: The ecoset dataset is a collection of >1.5 million images from 565 basic-level categories selected to better capture the distribution of objects relevant to humans.

Abstract: Deep neural networks provide the current best models of visual information processing in the primate brain. Drawing on work from computer vision, the most commonly used networks are pretrained on data from the ImageNet Large Scale Visual Recognition Challenge. This dataset comprises images from 1,000 categories, selected to provide a challenging testbed for automated visual object recognition systems. Moving beyond this common practice, we here introduce ecoset, a collection of >1.5 million images from 565 basic-level categories selected to better capture the distribution of objects relevant to humans. Ecoset categories were chosen to be both frequent in linguistic usage and concrete, thereby mirroring important physical objects in the world. We test the effects of training on this ecologically more valid dataset using multiple instances of two neural network architectures: AlexNet and vNet, a novel architecture designed to mimic the progressive increase in receptive field sizes along the human ventral stream. We show that training on ecoset leads to significant improvements in predicting representations in human higher-level visual cortex and perceptual judgments, surpassing the previous state of the art. Significant and highly consistent benefits are demonstrated for both architectures on two separate functional magnetic resonance imaging (fMRI) datasets and behavioral data, jointly covering responses to 1,292 visual stimuli from a wide variety of object categories. These results suggest that computational visual neuroscience may take better advantage of the deep learning framework by using image sets that reflect the human perceptual and cognitive experience. Ecoset and trained network models are openly available to the research community.

Journal Article•10.1109/TPAMI.2020.2995909•

Decoding Brain Representations by Multimodal Learning of Neural Activity and Visual Features

Simone Palazzo¹, Concetto Spampinato¹, Isaak Kavasidis¹, Daniela Giordano¹, Joseph Schmidt², Mubarak Shah²•Institutions (2)

University of Catania¹, University of Central Florida²

01 Nov 2021-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This work proposes a model, EEG-ChannelNet, to learn a brain manifold for EEG classification and introduces a multimodal approach that uses deep image and EEG encoders, trained in a siamese configuration, for learning a joint manifold that maximizes a compatibility measure between visual features and brain representations.

Abstract: This work presents a novel method of exploring human brain-visual representations, with a view towards replicating these processes in machines. The core idea is to learn plausible computational and biological representations by correlating human neural activity and natural images. Thus, we first propose a model, EEG-ChannelNet, to learn a brain manifold for EEG classification. After verifying that visual information can be extracted from EEG data, we introduce a multimodal approach that uses deep image and EEG encoders, trained in a siamese configuration, for learning a joint manifold that maximizes a compatibility measure between visual features and brain representations. We then carry out image classification and saliency detection on the learned manifold. Performance analyses show that our approach satisfactorily decodes visual information from neural signals. This, in turn, can be used to effectively supervise the training of deep learning models, as demonstrated by the high performance of image classification and saliency detection on out-of-training classes. The obtained results show that the learned brain-visual features lead to improved performance and simultaneously bring deep models more in line with cognitive neuroscience work related to visual perception and attention.

Journal Article•10.1167/JOV.21.1.2•

Asymmetries in visual acuity around the visual field

Antoine Barbot¹, Shutian Xue¹, Marisa Carrasco¹,²•Institutions (2)

New York University¹, Center for Neural Science²

04 Jan 2021-Journal of Vision

TL;DR: In this paper, the authors measured visual acuity at isoeccentric peripheral locations (10 deg eccentricity), every 15° of polar angle; on each trial, observers judged the orientation (± 45°) of one of four equidistant, suprathreshold grating stimuli varying in spatial frequency (SF).

Abstract: Human vision is heterogeneous around the visual field. At a fixed eccentricity, performance is better along the horizontal than the vertical meridian and along the lower than the upper vertical meridian. These asymmetric patterns, termed performance fields, have been found in numerous visual tasks, including those mediated by contrast sensitivity and spatial resolution. However, it is unknown whether spatial resolution asymmetries are confined to the cardinal meridians or whether and how far they extend into the upper and lower hemifields. Here, we measured visual acuity at isoeccentric peripheral locations (10 deg eccentricity), every 15° of polar angle. On each trial, observers judged the orientation (± 45°) of one of four equidistant, suprathreshold grating stimuli varying in spatial frequency (SF). On each block, we measured performance as a function of stimulus SF at 4 of 24 isoeccentric locations. We estimated the 75%-correct SF threshold, SF cutoff point (i.e., chance-level), and slope of the psychometric function for each location. We found higher SF estimates (i.e., better acuity) for the horizontal than the vertical meridian and for the lower than the upper vertical meridian. These asymmetries were most pronounced at the cardinal meridians and decreased gradually as the angular distance from the vertical meridian increased. This gradual change in acuity with polar angle reflected a shift of the psychometric function without changes in slope. The same pattern was found under binocular and monocular viewing conditions. These findings advance our understanding of visual processing around the visual field and help constrain models of visual perception.

Journal Article•10.1016/J.BSPC.2020.102174•

An Attention-based Bi-LSTM Method for Visual Object Classification via EEG

Xiao Zheng¹, Wanzhong Chen¹•Institutions (1)

Jilin University¹

01 Jan 2021-Biomedical Signal Processing and Control

TL;DR: The experimental results not only provide strong support for the modularity theory of brain cognitive function but also show, once again, the superiority of the proposed Bi-LSTM model with an attention mechanism.

Journal Article•10.1007/S10648-020-09565-7•

Eye-Tracking in Educational Practice: Investigating Visual Perception Underlying Teaching and Learning in the Classroom.

Halszka Jarodzka¹, Irene T. Skuballa¹, Hans Gruber²,³•Institutions (3)

Open University in the Netherlands¹, University of Regensburg², University of Turku³

01 Mar 2021-Educational Psychology Review

TL;DR: In this paper, six empirical studies present examples of how to capture visual perception in the complexity of a classroom lesson, and one theoretical contribution provides the very first model of teachers' cognitions during teaching in relation to their visual perception, which in turn will allow future research to move beyond explorations towards hypothesis testing.

Abstract: Classrooms full of pupils can be very overwhelming, both for teachers and students, as well as for their joint interactions. It is thus crucial that both can distil the relevant information in this complex scenario and interpret it appropriately. This distilling and interpreting happen to a large extent via visual perception, which is the core focus of the current Special Issue. Six empirical studies present examples of how to capture visual perception in the complexity of a classroom lesson. These examples open up new avenues that go beyond studying perception in restricted and artificial laboratory scenarios, ranging from studies using video recordings of authentic lessons to others studying actual classrooms. This movement towards more realistic scenarios makes it possible to study visual perception in classrooms from new perspectives, namely that of the teachers, the learners, and their interactions. This in turn sheds novel light on well-established theoretical concepts, namely students’ engagement during actual lessons, teachers’ professional vision while teaching, and establishment of joint attention between teachers and students in a lesson. Additionally, one theoretical contribution provides the very first model of teachers’ cognitions during teaching in relation to their visual perception, which in turn will allow future research to move beyond explorations towards hypothesis testing. However, to fully thrive, this field of research has to address two crucial challenges: (i) the heterogeneity of its methodological approaches (e.g., varying age groups, subjects taught, lesson formats) and (ii) the recording and processing of personal data of many people (often minors). Hence, these new approaches bring not only new chances for insights but also new responsibilities for the researchers.

Journal Article•10.1038/S41562-021-01124-6•

Aesthetic preference for art can be predicted from a mixture of low- and high-level visual features

Kiyohito Iigaya¹, Sanghyun Yi¹, Iman A. Wahle¹, Koranis Tanwisuth¹,², John P. O'Doherty¹•Institutions (2)

California Institute of Technology¹, University of California, Berkeley²

20 May 2021-Nature Human Behaviour

TL;DR: In this article, the authors developed and tested a computational framework to investigate how aesthetic values are formed, and they showed that it is possible to explain human preferences for a visual art piece based on a mixture of low and high-level features of the image.

Abstract: It is an open question whether preferences for visual art can be lawfully predicted from the basic constituent elements of a visual image. Here, we developed and tested a computational framework to investigate how aesthetic values are formed. We show that it is possible to explain human preferences for a visual art piece based on a mixture of low- and high-level features of the image. Subjective value ratings could be predicted not only within but also across individuals, using a regression model with a common set of interpretable features. We also show that the features predicting aesthetic preference can emerge hierarchically within a deep convolutional neural network trained only for object recognition. Our findings suggest that human preferences for art can be explained at least in part as a systematic integration over the underlying visual features of an image.

Journal Article•10.1016/J.TICS.2021.01.006•

The Perception of Relations.

Alon Hafri¹, Chaz Firestone¹•Institutions (1)

Johns Hopkins University¹

01 Jun 2021-Trends in Cognitive Sciences

TL;DR: The authors showed that even very sophisticated relations, such as support, fit, cause, chase, and even social interaction, display key signatures of automatic visual processing, revealing surprisingly rich content in visual perception itself.

Journal Article•10.1038/S41467-021-21979-7•

Top-down control of visual cortex by the frontal eye fields through oscillatory realignment.

Domenica Veniero¹, Joachim Gross², Stéphanie Morand³, Felix Duecker⁴, Alexander T. Sack⁴, Gregor Thut³•Institutions (4)

University of Nottingham¹, University of Münster², University of Glasgow³, Maastricht University⁴

19 Mar 2021-Nature Communications

TL;DR: In this paper, the authors showed that top-down signals originating in the frontal eye fields causally shape visual cortex activity and perception through mechanisms of oscillatory phase realignment at the beta frequency.

Abstract: Voluntary allocation of visual attention is controlled by top-down signals generated within the Frontal Eye Fields (FEFs) that can change the excitability of lower-level visual areas. However, the mechanism through which this control is achieved remains elusive. Here, we emulated the generation of an attentional signal using single-pulse transcranial magnetic stimulation to activate the FEFs and tracked its consequences over the visual cortex. First, we documented changes to brain oscillations using electroencephalography and found evidence for a phase reset over occipital sites at beta frequency. We then probed for perceptual consequences of this top-down triggered phase reset and assessed its anatomical specificity. We show that FEF activation leads to cyclic modulation of visual perception and extrastriate but not primary visual cortex excitability, again at beta frequency. We conclude that top-down signals originating in FEF causally shape visual cortex activity and perception through mechanisms of oscillatory realignment. Visual attention requires top-down modulation from the frontal eye fields to change cortical excitability of visual cortex. Here, the authors show that these top-down signals shape perception through mechanisms of oscillatory phase realignment at the beta frequency.

Journal Article•10.1109/TAFFC.2018.2887267•

Cross-Cultural and Cultural-Specific Production and Perception of Facial Expressions of Emotion in the Wild

Ramprakash Srinivasan¹, Aleix M. Martinez¹•Institutions (1)

Ohio State University¹

01 Jul 2021-IEEE Transactions on Affective Computing

TL;DR: This paper conducted a large-scale study of the production and visual perception of facial expressions of emotion in the wild and found that of the 16,384 possible facial configurations that people can theoretically produce, only 35 were successfully used to transmit emotive information across cultures, and only 8 within a smaller number of cultures.

Abstract: Automatic recognition of emotion from facial expressions is an intense area of research, with a potentially long list of important application. Yet, the study of emotion requires knowing which facial expressions are used within and across cultures in the wild, not in controlled lab conditions; but such studies do not exist. Which and how many cross-cultural and cultural-specific facial expressions do people commonly use? And, what affect variables does each expression communicate to observers? If we are to design technology that understands the emotion of users, we need answers to these two fundamental questions. In this paper, we present the first large-scale study of the production and visual perception of facial expressions of emotion in the wild. We find that of the 16,384 possible facial configurations that people can theoretically produce, only 35 are successfully used to transmit emotive information across cultures, and only 8 within a smaller number of cultures. Crucially, we find that visual analysis of cross-cultural expressions yields consistent perception of emotion categories and valence, but not arousal. In contrast, visual analysis of cultural-specific expressions yields consistent perception of valence and arousal, but not of emotion categories. Additionally, we find that the number of expressions used to communicate each emotion is also different, e.g., 17 expressions transmit happiness, but only 1 is used to convey disgust.

Journal Article•10.1038/S41467-021-24368-2•

Object representations in the human brain reflect the co-occurrence statistics of vision and language.

Michael F. Bonner¹, Russell A. Epstein²•Institutions (2)

Johns Hopkins University¹, University of Pennsylvania²

02 Jul 2021-Nature Communications

TL;DR: This article found that cortical responses to single objects were predicted by the statistical ensembles in which they typically occur, and that this link between objects and their visual contexts was made most strongly in parahippocampal cortex, overlapping with the anterior portion of scene-selective parahippocampal place area.

Abstract: A central regularity of visual perception is the co-occurrence of objects in the natural environment. Here we use machine learning and fMRI to test the hypothesis that object co-occurrence statistics are encoded in the human visual system and elicited by the perception of individual objects. We identified low-dimensional representations that capture the latent statistical structure of object co-occurrence in real-world scenes, and we mapped these statistical representations onto voxel-wise fMRI responses during object viewing. We found that cortical responses to single objects were predicted by the statistical ensembles in which they typically occur, and that this link between objects and their visual contexts was made most strongly in parahippocampal cortex, overlapping with the anterior portion of scene-selective parahippocampal place area. In contrast, a language-based statistical model of the co-occurrence of object names in written text predicted responses in neighboring regions of object-selective visual cortex. Together, these findings show that the sensory coding of objects in the human brain reflects the latent statistics of object context in visual and linguistic experience. When people view an object, they can often guess the setting from which it was drawn and the other objects that might be found in that setting. Here the authors identify regions of the human visual system that represent this information about which objects tend to appear together in the world.

Journal Article•10.1155/2021/5541134•

Deep CNN and Deep GAN in Computational Visual Perception-Driven Image Analysis

R. Nandhini Abirami¹, P. M. Durai Raj Vincent¹, Kathiravan Srinivasan¹, Usman Tariq², Chuan-Yu Chang³•Institutions (3)

VIT University¹, Salman bin Abdulaziz University², National Yunlin University of Science and Technology³

15 Apr 2021-Complexity

TL;DR: A critical review of the related significant aspects is provided and an overview of existing applications of deep learning in computational visual perception is included, which shows that there is a significant improvement in the accuracy using dropout and data augmentation.

Abstract: Computational visual perception, also known as computer vision, is a field of artificial intelligence that enables computers to process digital images and videos in a similar way as biological vision does. It involves methods to be developed to replicate the capabilities of biological vision. The computer vision’s goal is to surpass the capabilities of biological vision in extracting useful information from visual data. The massive data generated today is one of the driving factors for the tremendous growth of computer vision. This survey incorporates an overview of existing applications of deep learning in computational visual perception. The survey explores various deep learning techniques adapted to solve computer vision problems using deep convolutional neural networks and deep generative adversarial networks. The pitfalls of deep learning and their solutions are briefly discussed. The solutions discussed were dropout and augmentation. The results show that there is a significant improvement in the accuracy using dropout and data augmentation. Deep convolutional neural networks’ applications, namely, image classification, localization and detection, document analysis, and speech recognition, are discussed in detail. In-depth analysis of deep generative adversarial network applications, namely, image-to-image translation, image denoising, face aging, and facial attribute editing, is done. The deep generative adversarial network is unsupervised learning, but adding a certain number of labels in practical applications can improve its generating ability. However, it is challenging to acquire many data labels, but a small number of data labels can be acquired. Therefore, combining semisupervised learning and generative adversarial networks is one of the future directions. 
This article surveys the recent developments in this direction and provides a critical review of the related significant aspects, investigates the current opportunities and future challenges in all the emerging domains, and discusses the current opportunities in many emerging fields such as handwriting recognition, semantic mapping, webcam-based eye trackers, lumen center detection, query-by-string word, intermittently closed and open lakes and lagoons, and landslides.

Journal Article•10.1016/J.NEURON.2021.04.017•

Visual intracortical and transthalamic pathways carry distinct information to cortical areas

Antonin Blot¹, Morgane M. Roth², Ioana T Gasler¹,², Mitra Javadzadeh¹,², Fabia Imhof², Sonja B. Hofer¹,²•Institutions (2)

University College London¹, University of Basel²

16 Jun 2021-Neuron

TL;DR: This paper found that responses of mouse lateral posterior nucleus (LP) neurons projecting to higher visual areas likely derive from feedforward input from primary visual cortex (V1) combined with information from many cortical and subcortical areas, including superior colliculus.

Journal Article•10.1167/JOV.21.3.16•

Five Points to Check when Comparing Visual Perception in Humans and Machines

Christina M. Funke¹, Judy Borowski¹, Karolina Stosio, Wieland Brendel, Thomas S. A. Wallis¹,², Matthias Bethge•Institutions (2)

University of Tübingen¹, Amazon.com²

01 Mar 2021-Journal of Vision

TL;DR: In this article, the authors present a checklist for comparative studies of visual reasoning in humans and machines, highlighting how to overcome potential pitfalls in design and inference and highlight the importance of aligning experimental conditions.

Abstract: With the rise of machines to human-level performance in complex recognition tasks, a growing amount of work is directed toward comparing information processing in humans and machines. These studies are an exciting chance to learn about one system by studying the other. Here, we propose ideas on how to design, conduct, and interpret experiments such that they adequately support the investigation of mechanisms when comparing human and machine perception. We demonstrate and apply these ideas through three case studies. The first case study shows how human bias can affect the interpretation of results and that several analytic tools can help to overcome this human reference point. In the second case study, we highlight the difference between necessary and sufficient mechanisms in visual reasoning tasks. Thereby, we show that contrary to previous suggestions, feedback mechanisms might not be necessary for the tasks in question. The third case study highlights the importance of aligning experimental conditions. We find that a previously observed difference in object recognition does not hold when adapting the experiment to make conditions more equitable between humans and machines. In presenting a checklist for comparative studies of visual reasoning in humans and machines, we hope to highlight how to overcome potential pitfalls in design and inference.

Journal Article•10.1177/0018720819892383•

Temperature-Color Interaction: Subjective Indoor Environmental Perception and Physiological Responses in Virtual Reality.

Giorgia Chinazzo¹, Kynthia Chamilothori¹, Jan Wienold¹, Marilyne Andersen¹•Institutions (1)

École Polytechnique Fédérale de Lausanne¹

01 May 2021-Human Factors

TL;DR: In the VR setting, the orange daylight led to warmer thermal perception in (close-to-) comfortable temperatures, resulting in a color-induced thermal perception and indicating that orange glazing should be used with caution in a slightly warm environment.

Abstract: Objective: Temperature–color interaction effects on subjective perception and physiological responses are investigated using a novel hybrid experimental method combining thermal and visual stimuli fr...

Journal Article•10.1073/PNAS.2017032118•

Topographic connectivity reveals task-dependent retinotopic processing throughout the human brain.

Tomas Knapen

12 Jan 2021-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: Widespread stable visual organization beyond the traditional visual system, in default-mode network and hippocampus is demonstrated, indicating that visual–spatial organization is a fundamental coding principle that structures the communication between distant brain regions.

Abstract: The human visual system is organized as a hierarchy of maps that share the topography of the retina. Known retinotopic maps have been identified using simple visual stimuli under strict fixation, conditions different from everyday vision which is active, dynamic, and complex. This means that it remains unknown how much of the brain is truly visually organized. Here I demonstrate widespread stable visual organization beyond the traditional visual system, in default-mode network and hippocampus. Detailed topographic connectivity with primary visual cortex during movie-watching, resting-state, and retinotopic-mapping experiments revealed that visual-spatial representations throughout the brain are warped by cognitive state. Specifically, traditionally visual regions alternate with default-mode network and hippocampus in preferentially representing the center of the visual field. This visual role of default-mode network and hippocampus would allow these regions to interface between abstract memories and concrete sensory impressions. Together, these results indicate that visual-spatial organization is a fundamental coding principle that structures the communication between distant brain regions.

Journal Article•10.1038/S41380-021-01090-5•

Reduction of higher-order occipital GABA and impaired visual perception in acute major depressive disorder.

Xue Mei Song¹, Xi Wen Hu¹, Zhe Li¹, Yuan Gao¹, Xuan Ju¹, Dong Yu Liu¹, Qian Nan Wang¹, Chuang Xue¹, Yong-Chun Cai¹, Ruiliang Bai¹, Zhong Lin Tan¹, Georg Northoff¹,²•Institutions (2)

Zhejiang University¹, University of Ottawa²

16 Apr 2021-Molecular Psychiatry

TL;DR: In this article, the authors used high-field 7T proton Magnetic Resonance Spectroscopy (1H-MRS) to study the effect of reduced occipital GABA on visual perception and symptom severity in acute major depressive disorder.

Abstract: Major depressive disorder (MDD) is a complex state-dependent psychiatric illness for which biomarkers linking psychophysical, biochemical, and psychopathological changes remain elusive. Earlier studies demonstrate reduced GABA in lower-order occipital cortex in acute MDD, leaving open its validity and significance for higher-order visual perception. The goal of our study is to fill that gap by combining psychophysical investigation of visual perception with measurement of GABA concentration in middle temporal visual area (hMT+) in acute depressed MDD. Psychophysically, we observe a highly specific deficit in visual surround motion suppression in a large sample of acute MDD subjects which, importantly, correlates with symptom severity. Both visual deficit and its relation to symptom severity are replicated in the smaller MDD sample that received MRS. Using high-field 7T proton magnetic resonance spectroscopy (1H-MRS), acute MDD subjects exhibit decreased GABA concentration in visual MT+ which, unlike in healthy subjects, no longer correlates with their visual motion performance, i.e., impaired SI. In sum, our combined psychophysical-biochemical study demonstrates an important role of reduced occipital GABA for altered visual perception and psychopathological symptoms in acute MDD. Bridging the gap from the biochemical level of occipital GABA over visual-perceptual changes to psychopathological symptoms, our findings point to the importance of the occipital cortex in acute depressed MDD including its role as candidate biomarker.

Journal Article•10.1523/JNEUROSCI.2098-20.2021•

The Human Brain Encodes a Chronicle of Visual Events at Each Instant of Time Through the Multiplexing of Traveling Waves.

Jean-Rémi King¹,², Valentin Wyart³•Institutions (3)

New York University¹, Frankfurt Institute for Advanced Studies², French Institute of Health and Medical Research³

02 Apr 2021-The Journal of Neuroscience

TL;DR: In this article, the brain simultaneously represents multiple successive images at each time instant by multiplexing them along a neural cascade, which can be explained by a hierarchy of neural assemblies that continuously propagate multiple visual contents.

Abstract: The human brain continuously processes streams of visual input. Yet, a single image typically triggers neural responses that extend beyond 1s. To understand how the brain encodes and maintains successive images, we analyzed with electroencephalography the brain activity of human subjects while they watched ∼5000 visual stimuli presented in fast sequences. First, we confirm that each stimulus can be decoded from brain activity for ∼1s, and we demonstrate that the brain simultaneously represents multiple images at each time instant. Second, we source localize the corresponding brain responses in the expected visual hierarchy and show that distinct brain regions represent, at each time instant, different snapshots of past stimulations. Third, we propose a simple framework to further characterize the dynamical system of these traveling waves. Our results show that a chain of neural circuits, which each consist of (1) a hidden maintenance mechanism and (2) an observable update mechanism, accounts for the dynamics of macroscopic brain representations elicited by visual sequences. Together, these results detail a simple architecture explaining how successive visual events and their respective timings can be simultaneously represented in the brain. SIGNIFICANCE STATEMENT: Our retinas are continuously bombarded with a rich flux of visual input. Yet, how our brain continuously processes such visual streams is a major challenge to neuroscience. Here, we developed techniques to decode and track, from human brain activity, multiple images flashed in rapid succession. Our results show that the brain simultaneously represents multiple successive images at each time instant by multiplexing them along a neural cascade. Dynamical modeling shows that these results can be explained by a hierarchy of neural assemblies that continuously propagate multiple visual contents. Overall, this study sheds new light on the biological basis of our visual experience.
