Object semantics are theorized to play a central role in where we look in real-world scenes, but are poorly understood because they are hard to quantify. Here we tested the role of object semantics by combining a computational vector space model of semantics with eye tracking in real-world scenes. The results provide evidence that humans use their stored semantic representations of objects to help selectively process complex visual scenes, a theoretically important finding with implications for models in a wide range of areas including cognitive science, linguistics, computer vision, and visual neuroscience.
How do we determine where to focus our attention in real-world scenes? Image saliency theory proposes that our attention is ‘pulled’ to scene regions that differ in low-level image features. However, models that formalize image saliency theory often contain significant scene-independent spatial biases. In the present studies, three different viewing tasks were used to evaluate whether image saliency models account for variance in scene fixation density based primarily on scene-dependent, low-level feature contrast, or on their scene-independent spatial biases.
Recent evidence suggests that overt attention in scenes is primarily guided by semantic features. Here we examined whether the attentional priority given to meaningful scene regions is involuntary. Participants completed a scene-independent visual search task in which they searched for superimposed letter targets whose locations were orthogonal to both the underlying scene semantics and image salience. The results showed that even when the task was completely independent from the scene semantics and image salience, semantics explained significantly more variance in attention than image salience and more than expected by chance. This suggests that salient image features were effectively suppressed in favor of task goals, but semantic features were not suppressed.
Real-world scenes comprise a blooming, buzzing confusion of information. To manage this complexity, visual attention is guided to important scene regions in real time. What factors guide attention within scenes? A leading theoretical position suggests that visual salience based on semantically uninterpreted image features plays the critical causal role in attentional guidance, with knowledge and meaning playing a secondary or modulatory role. Here we propose instead that meaning plays the dominant role in guiding human attention through scenes.
The relationship between scan patterns and viewer individual differences during scene viewing remains poorly understood because scan patterns are difficult to analyze. The present study uses a powerful technique called Successor Representation Scanpath Analysis (SRSA, Hayes, Petrov, & Sederberg, 2011, 2015) to quantify the strength of the association between individual differences in scan patterns during real-world scene viewing and individual differences in viewer intelligence, working memory capacity, and speed of processing.
Pupil size is correlated with a wide variety of important cognitive variables and is increasingly being used by cognitive scientists. One serious confound that is often not properly controlled is pupil foreshortening error (PFE)—the foreshortening of the pupil image as the eye rotates away from the camera. Here we systematically map PFE using an artificial eye model and then apply a geometric model correction.
Recent reports of training-induced gains on fluid intelligence tests have fueled an explosion of interest in cognitive training-now a billion-dollar industry. The interpretation of these results is questionable because score gains can be dominated by factors that play marginal roles in the scores themselves, and because intelligence gain is not the only possible explanation for the observed control-adjusted far transfer across tasks. Here we present novel evidence that the test score gains used to measure the efficacy of cognitive training may reflect strategy refinement instead of intelligence gains.