TL;DR: By creating short motion pictures in which objects in both arbitrary locations and the very center of attention were changed, this article showed that human observers fail to detect changes to objects and object properties even when the changed object is the focus of attention.
Abstract: Our intuition that we richly represent the visual details of our environment is illusory. When viewing a scene, we seem to use detailed representations of object properties and interobject relations to achieve a sense of continuity across views. Yet, several recent studies show that human observers fail to detect changes to objects and object properties when localized retinal information signaling a change is masked or eliminated (e.g., by eye movements). However, these studies changed arbitrarily chosen objects which may have been outside the focus of attention. We draw on previous research showing the importance of spatiotemporal information for tracking objects by creating short motion pictures in which objects in both arbitrary locations and the very center of attention were changed. Adult observers failed to notice changes in both cases, even when the sole actor in a scene transformed into another person across an instantaneous change in camera angle (or “cut”).
TL;DR: The problem of rerouting interobject references on the creation of new versions is solved by providing generic references and user-specific environments, and logical version clusters are introduced that allow for the meaningful grouping of versions.
Abstract: In engineering applications, multiple copies of object descriptions have to coexist in a single database. A scheme is proposed that enables users to explicitly deal with these object versions. After introducing a basic version model, the problem of rerouting interobject references on the creation of new versions is solved by providing generic references and user-specific environments. Logical version clusters are introduced that allow for the meaningful grouping of versions. Some remarks on implementation and a comparison with other approaches are also included.
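One way to picture generic references and user-specific environments is the minimal sketch below. It is an illustration under stated assumptions, not the scheme from the paper: the class names `VersionedObject` and `Environment`, the function `resolve`, and the rule of falling back to the newest version are all hypothetical. The point it shows is that a reference names an object rather than a fixed version, so creating a new version does not require rerouting existing references.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class VersionedObject:
    """A design object with several coexisting versions (hypothetical model)."""
    name: str
    versions: Dict[int, str] = field(default_factory=dict)  # version number -> description

    def add_version(self, description: str) -> int:
        """Store a new version and return its number."""
        number = len(self.versions) + 1
        self.versions[number] = description
        return number


@dataclass
class Environment:
    """Per-user bindings from object name to the version that user selected."""
    bindings: Dict[str, int] = field(default_factory=dict)


def resolve(obj: VersionedObject, env: Environment) -> str:
    """Resolve a generic reference to `obj` within the environment `env`.

    Assumption: when the environment has no binding for the object,
    the newest version is used.
    """
    number = env.bindings.get(obj.name, max(obj.versions))
    return obj.versions[number]


gate = VersionedObject("gate")
gate.add_version("gate layout v1")
gate.add_version("gate layout v2")

alice = Environment({"gate": 1})  # Alice pins the older version
bob = Environment()               # Bob has no binding for "gate"

print(resolve(gate, alice))  # gate layout v1
print(resolve(gate, bob))    # gate layout v2
```

Adding a third version to `gate` would change what `bob` sees but leave `alice` on version 1, without touching any stored reference.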
TL;DR: The time subjects took to scan between objects in a mental image was used to infer the sorts of geometric information that images preserve, and it is argued that imagery and perception share some representational structures but that mental image scanning is a process distinct from eye movements or eye-movement commands.
Abstract: What sort of medium underlies imagery for three-dimensional scenes? In the present investigation, the time subjects took to scan between objects in a mental image was used to infer the sorts of geometric information that images preserve. Subjects studied an open box in which five objects were suspended, and learned to imagine this display with their eyes closed. In the first experiment, subjects scanned by tracking an imaginary point moving in a straight line between the imagined objects. Scanning times increased linearly with increasing distance between objects in three dimensions. Therefore metric 3-D information must be preserved in images, and images cannot simply be 2-D "snapshots." In a second experiment, subjects scanned across the image by "sighting" objects through an imaginary rifle sight. Here scanning times were found to increase linearly with the two-dimensional separations between objects as they appeared from the original viewing angle. Therefore metric 2-D distance information in the original perspective view must be preserved in images, and images cannot simply be 3-D "scale-models" that are assessed from any and all directions at once. In a third experiment, subjects mentally rotated the display 90 degrees and scanned between objects as they appeared in this new perspective view by tracking an imaginary rifle sight, as before. Scanning times increased linearly with the two-dimensional separations between objects as they would appear from the new relative viewing perspective. Therefore images can display metric 2-D distance information in a perspective view never actually experienced, so mental images cannot simply be "snapshot plus scale model" pairs. These results can be explained by a model in which the three-dimensional structure of objects is encoded in long-term memory in 3-D object-centered coordinate systems. When these objects are imagined, this information is then mapped onto a single 2-D "surface display" in which the perspective properties specific to a given viewing angle can be depicted. In a set of perceptual control experiments, subjects scanned a visible display by (a) simply moving their eyes from one object to another, (b) sweeping an imaginary rifle sight over the display, or (c) tracking an imaginary point moving from one object to another. Eye-movement times varied linearly with 2-D interobject distance, as did time to scan with an imaginary rifle sight; time to track a point varied independently with the 3-D and 2-D interobject distances. These results are compared with the analogous image scanning results to argue that imagery and perception share some representational structures but that mental image scanning is a process distinct from eye movements or eye-movement commands.
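The three distance measures contrasted in this abstract can be made concrete with a small worked example. The sketch below is only an illustration: the coordinates are invented, the function names are hypothetical, and it assumes an orthographic view along the z axis as a simplification of the perspective view described in the study.

```python
import math


def distance_3d(p, q):
    """Straight-line distance between two points in three dimensions."""
    return math.dist(p, q)


def separation_2d(p, q):
    """Separation in the picture plane for a viewer looking along the z axis.

    Assumption: orthographic projection, so depth (z) is simply dropped.
    """
    return math.dist(p[:2], q[:2])


def rotate_90_about_vertical(p):
    """Rotate a point 90 degrees about the vertical (y) axis."""
    x, y, z = p
    return (z, y, -x)


# Two hypothetical object positions inside the box.
a = (0.0, 0.0, 0.0)
b = (3.0, 0.0, 4.0)

print(distance_3d(a, b))    # 5.0, the 3-D distance
print(separation_2d(a, b))  # 3.0, the 2-D separation from the original view
print(separation_2d(rotate_90_about_vertical(a),
                    rotate_90_about_vertical(b)))  # 4.0, after the 90-degree rotation
```

The same pair of objects yields three different distances, which is why linear scanning times under each instruction can separate the 3-D, original-view 2-D, and rotated-view 2-D accounts.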
TL;DR: Functional MRI and behavioral measures are used to show that the attentional benefit of grouping extends to higher-level grouping based on the relative position of objects as experienced in the real world, which indicates that the visual system can exploit real-world regularities to group objects that typically co-occur.
Abstract: In virtually every real-life situation humans are confronted with complex and cluttered visual environments that contain a multitude of objects. Because of the limited capacity of the visual system, objects compete for neural representation and cognitive processing resources. Previous work has shown that such attentional competition is partly object based, such that competition among elements is reduced when these elements perceptually group into an object based on low-level cues. Here, using functional MRI (fMRI) and behavioral measures, we show that the attentional benefit of grouping extends to higher-level grouping based on the relative position of objects as experienced in the real world. An fMRI study designed to measure competitive interactions among objects in human visual cortex revealed reduced neural competition between objects when these were presented in commonly experienced configurations, such as a lamp above a table, relative to the same objects presented in other configurations. In behavioral visual search studies, we then related this reduced neural competition to improved target detection when distracter objects were shown in regular configurations. Control studies showed that low-level grouping could not account for these results. We interpret these findings as reflecting the grouping of objects based on higher-level spatial-relational knowledge acquired through a lifetime of seeing objects in specific configurations. This interobject grouping effectively reduces the number of objects that compete for representation and thereby contributes to the efficiency of real-world perception.
TL;DR: This article found that preoperational children can form schemata to represent organized scenes (Stage 1), but it is not until the emergence of concrete operations that these schemata become operational with respect to guiding the further processing of information in the scene (Stage 2).
Abstract: Recognition memory for previously seen multiobject scenes was examined for different types of contextual arrangements between objects in the scenes. It was found that organized scenes with novel but possible interobject relations were recognized more accurately than either organized scenes with familiar interobject relations or unorganized scenes with impossible interobject relations. This finding was obtained for adults, 8- to 10-year-old children, and 5- to 8-year-old children who indicated concrete-operational ability in Piaget’s conservation-of-liquid quantity task. The results were interpreted in conjunction with a two-stage model of scene processing involving the formation of a schema to represent a scene (Stage 1), and the operation of the schema in governing the further processing of detailed information in the scene (Stage 2). It was concluded that preoperational children can form schemata to represent organized scenes (Stage 1), but it is not until the emergence of concrete operations that these schemata become operational with respect to guiding the further processing of information in the scene (Stage 2).