Optimality and Limitations of Audio-Visual Integration for Cognitive Systems.
William Paul Boyce,Anthony Lindsay,Arkady Zgonnikov,Iñaki Rañó,KongFatt Wong-Lin +4 more
- 17 Jul 2020
- Vol. 7, pp 94
TL;DR: The authors review audio-visual facilitations and illusions that arise from multisensory integration, along with the computational models that account for them, and suggest that more studies are needed to detect and mitigate these illusions, which can appear as artifacts in artificial cognitive systems.
Abstract: Multimodal integration is an important process in perceptual decision-making. In humans, this process has often been shown to be statistically optimal, or near optimal: sensory information is combined in a fashion that minimizes the average error in perceptual representation of stimuli. However, sometimes there are costs that come with the optimization, manifesting as illusory percepts. We review audio-visual facilitations and illusions that are products of multisensory integration, and the computational models that account for these phenomena. In particular, the same optimal computational model can lead to illusory percepts, and we suggest that more studies are needed to detect and mitigate these illusions, as artifacts in artificial cognitive systems. We provide cautionary considerations when designing artificial cognitive systems with the view of avoiding such artifacts. Finally, we suggest avenues of research toward solutions to potential pitfalls in system design. We conclude that detailed understanding of multisensory integration and the mechanisms behind audio-visual illusions can benefit the design of artificial cognitive systems.
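The abstract's point that an error-minimizing combination rule can itself produce an illusory percept can be illustrated with a small sketch. It assumes the textbook maximum-likelihood rule for two independent Gaussian cues (inverse-variance weighting); the abstract does not say which specific models the paper covers, so this is not presented as the authors' model. The function name and the numbers are hypothetical.

```python
def fuse_gaussian_cues(mu_a, var_a, mu_v, var_v):
    """Combine an auditory and a visual estimate of the same quantity.

    Assumes independent Gaussian noise on each cue. Each cue is weighted
    by its reliability (inverse variance), which minimizes the variance
    of the fused estimate.
    """
    if var_a <= 0 or var_v <= 0:
        raise ValueError("variances must be positive")
    precision_a = 1.0 / var_a
    precision_v = 1.0 / var_v
    w_a = precision_a / (precision_a + precision_v)
    w_v = 1.0 - w_a
    fused_mean = w_a * mu_a + w_v * mu_v
    fused_var = 1.0 / (precision_a + precision_v)
    return fused_mean, fused_var


# Hypothetical scenario: a sound located at 10 degrees (noisy, variance 16)
# and a flash located at 0 degrees (precise, variance 1).
mean, var = fuse_gaussian_cues(mu_a=10.0, var_a=16.0, mu_v=0.0, var_v=1.0)
print(f"fused location: {mean:.3f} deg, fused variance: {var:.3f}")
```

Under these assumed numbers the fused variance is lower than that of either cue alone, yet the fused location sits close to the flash rather than the sound. If the two signals actually came from different sources, the same rule that reduces average error yields a ventriloquism-like mislocalization of the sound.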
Citations
Attentional modulations of audiovisual interactions in apparent motion: Temporal ventriloquism effects on perceived visual speed
Aysun Duyar,Andrea Pavan,Hulusi Kafaligonul +2 more
- 22 Aug 2022
TL;DR: In this paper, a set of audiovisual stimuli was used to elicit temporal ventriloquism in visual apparent motion, and participants were asked to perform a speed comparison task.
Neural Circuit with Top-Down Inhibitory Feedback Outperforms Optimal Bayesian Integration in Multisensory Integration
Yelin Dong,Hongzhi You,Yuxiu Shao,Yong Gu,KongFatt Wong-Lin,Da-Hui Wang +5 more
Beyond Conversational Discourse: A Framework for Collaborative Dialogue Analysis
Qiang Li,Zhibo Zhang,Zijin Liu,Qianyu Mai,Wenxia Qiao,Ming-juan Ma +5 more
- 17 Oct 2023
TL;DR: A framework for deep audio-visual dialogue analysis in the Collaborative Working Environment (CWE) is constructed based on the TM-CTC model (CTC Transformer) and FaceNet algorithm and the results show that the proposed framework can improve communication efficiency in team collaboration to a certain extent.
Toward multimodal virtual communication
Michal Zoller
TL;DR: This paper explores multimodality in virtual communication, integrating media theory and multisensorial human experience to develop a new understanding of interpretation in immersive virtual environments, highlighting changes and challenges in stylistic analysis.
References
Demand Characteristics Confound the Rubber Hand Illusion
Peter Lush
- 07 Apr 2020
TL;DR: In this article, a quasi-experimental design was employed to test demand characteristics in rubber hand illusion reports, and expectancies were recorded for standard ‘illusion’ and ‘control’ statements in synchronous and asynchronous conditions.
A graphical model for audiovisual object tracking
TL;DR: A new approach to modeling and processing multimedia data based on graphical models that combine audio and video variables is presented, and a new algorithm for tracking a moving object in a cluttered, noisy scene using two microphones and a camera is developed.
Fooling the eyes: the influence of a sound-induced visual motion illusion on eye movements.
TL;DR: It is found that illusory motion perception modulated by an auditory context consistently affected saccadic eye movements and is consistent with arguments for a tight link between perception and action in localization tasks.
The sound-induced flash illusion reveals dissociable age-related effects in multisensory integration.
TL;DR: A surprising difference between sound-induced fission and fusion in older adults suggests dissociable age-related effects in multisensory integration, consistent with the idea that these illusions are mediated by distinct neural mechanisms.
Temporal Order is Coded Temporally in the Brain: Early Event-related Potential Latency Shifts Underlying Prior Entry in a Cross-modal Temporal Order Judgment Task
TL;DR: The results indicate that attention can indeed speed up neural processes during visual perception, thereby providing the first electrophysiological support for the existence of prior entry.