Taylor R. Hayes, PhD
Publications
Scene inversion reveals distinct patterns of attention to semantically interpreted and uninterpreted features
We presented real-world scenes in upright and inverted orientations and used general linear mixed effects models to understand how semantic guidance, image guidance, and observer center bias were associated with fixation location and fixation duration. We observed distinct patterns of change under inversion. Semantic guidance was severely disrupted by scene inversion, while image guidance was mildly impaired and observer center bias was enhanced. In addition, we found that fixation durations in semantically rich regions decreased when viewing inverted scenes, while the associations of image salience and center bias with fixation duration were unaffected by inversion. Together these results provide important new constraints on theories and computational models of attention in real-world scenes.
Hayes T.R., Henderson J.M.
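The analysis described in this entry is based on linear mixed effects models relating semantic, image, and center-bias predictors to fixation behavior. Below is a minimal sketch of that kind of model fit, not the study's code; the statsmodels-based formulation and the column names (fixation_duration, meaning, salience, center_bias, inverted, subject) are illustrative assumptions.

```python
# Minimal sketch of a linear mixed-effects analysis like the one described above.
# NOTE: column names and the statsmodels formulation are assumptions for
# illustration, not the study's actual code.
import pandas as pd
import statsmodels.formula.api as smf

def fit_fixation_model(df: pd.DataFrame):
    """Model fixation duration from scene predictors, their interaction with
    inversion, and a random intercept for each observer."""
    model = smf.mixedlm(
        "fixation_duration ~ (meaning + salience + center_bias) * inverted",
        data=df,
        groups=df["subject"],  # random intercept per observer
    )
    return model.fit()
```

An analogous model with a fixated/not-fixated outcome for each scene region would address fixation location rather than duration.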
Searching for meaning: Local scene semantics guide attention during natural visual search in scenes
Models of visual search in scenes include image salience as a source of attentional guidance. However, because scene meaning is …
Peacock C.E., Singh P., Hayes T.R., Rehrig G., Henderson J.M.
Meaning maps detect the removal of local semantic scene content but deep saliency models do not
Meaning mapping uses human raters to estimate different semantic features in scenes, and has been a useful tool in demonstrating the important role semantics play in guiding attention. However, recent work has argued that meaning maps do not capture semantic content, but like deep learning models of scene attention, represent only semantically-neutral image features. In the present study, we directly tested this hypothesis using a diffeomorphic image transformation that is designed to remove the meaning of an image region while preserving its image features. The results were clear: meaning maps generated by human raters showed a large decrease in the diffeomorphed scene regions, while all three deep saliency models showed a moderate increase in the diffeomorphed scene regions. These results demonstrate that meaning maps reflect local semantic content in scenes while deep saliency models do something else.
Hayes T.R., Henderson J.M.
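The comparison described in this entry can be summarized as a region-level statistic: how much weight each prediction map assigns to a region before versus after the diffeomorphic transformation. The sketch below is illustrative only; the function names and the normalization choice are assumptions, not the paper's code.

```python
# Illustrative region-level comparison for a prediction map (meaning map or
# deep saliency map) before and after a scene region is diffeomorphed.
import numpy as np

def mean_in_region(prediction_map: np.ndarray, region_mask: np.ndarray) -> float:
    """Average map value inside a boolean region mask, after normalizing the
    map to sum to 1 so maps on different scales are comparable."""
    normalized = prediction_map / prediction_map.sum()
    return float(normalized[region_mask].mean())

def region_change(map_original: np.ndarray,
                  map_diffeo: np.ndarray,
                  region_mask: np.ndarray) -> float:
    """Positive: the map assigns more weight to the region after the transform;
    negative: less weight (the pattern the abstract reports for meaning maps)."""
    return mean_in_region(map_diffeo, region_mask) - mean_in_region(map_original, region_mask)
```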
Rapid extraction of the spatial distribution of physical saliency and semantic informativeness from natural scenes in the human brain
Attention may be attracted by physically salient objects, such as flashing lights, but humans must also be able to direct their attention to meaningful parts of scenes. Understanding how we direct attention to meaningful scene regions will be important for developing treatments for disorders of attention and for designing roadways, cockpits, and computer user interfaces. Information about saliency appears to be extracted rapidly by the brain, but little is known about the mechanisms that determine the locations of meaningful information. To address this gap, we showed people photographs of real-world scenes and measured brain activity. We found that information related to the locations of meaningful scene elements was extracted rapidly, shortly after the emergence of saliency-related information.
Kiat J.E., Hayes T.R., Henderson J.M., Luck S.J.
Meaning and expected surfaces combine to guide attention during visual search in scenes
How do spatial constraints and meaningful scene regions interact to control overt attention during visual search for objects in real-world scenes? To answer this question, we combined novel surface maps of the likely locations of target objects with maps of the spatial distribution of scene semantic content.
Peacock C.E., Cronin D.A., Hayes T.R., Henderson J.M.
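One simple way to combine a target-surface map with a meaning map into a single priority map is a pointwise product. The sketch below uses that combination as an illustrative assumption; it is not necessarily how the paper combines its maps.

```python
# Illustrative combination of two spatial maps defined over the same scene.
import numpy as np

def combine_maps(surface_map: np.ndarray, meaning_map: np.ndarray) -> np.ndarray:
    """Pointwise product of two same-sized maps, renormalized to sum to 1 so the
    result can be read as a spatial priority distribution."""
    combined = surface_map * meaning_map
    return combined / combined.sum()
```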
Deep saliency models learn low-, mid-, and high-level features to predict scene attention
Deep saliency models represent the current state-of-the-art for predicting where humans look in real-world scenes. However, for deep saliency models to inform cognitive theories of attention, we need to know how deep saliency models prioritize different scene features to predict where people look. Here we open the black box of three prominent deep saliency models (MSI-Net, DeepGaze II, and SAM-ResNet) using an approach that models the association between attention, deep saliency model output, and low-, mid-, and high-level scene features.
Hayes T.R., Henderson J.M.
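One way to model the association between attention and different feature levels, as this entry describes, is to regress an empirical attention map on per-pixel feature maps. The sketch below assumes hypothetical feature maps spatially aligned to the attention map; it illustrates the general idea rather than the paper's specific approach.

```python
# Illustrative regression of an attention map on low-, mid-, and high-level
# feature maps (all hypothetical 2D arrays with the same shape).
import numpy as np
from sklearn.linear_model import LinearRegression

def feature_association(attention_map: np.ndarray, feature_maps: dict):
    """Return per-feature regression weights and the overall R^2."""
    X = np.column_stack([fmap.ravel() for fmap in feature_maps.values()])
    y = attention_map.ravel()
    model = LinearRegression().fit(X, y)
    weights = dict(zip(feature_maps.keys(), model.coef_))
    return weights, model.score(X, y)
```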
Linking patterns of infant eye movements to a neural network model of the ventral stream using representational similarity analysis
Little is known about the development of higher-level areas of visual cortex during infancy, and even less is known about how the development of visually guided behavior is related to the different levels of the cortical processing hierarchy. As a first step toward filling these gaps, we used representational similarity analysis (RSA) to assess links between gaze patterns and a neural network model that captures key properties of the ventral visual processing stream.
Kiat J.E., Luck S.J., Beckner A.G., Hayes T.R., Pomaranski K.I., Henderson J.M., Oakes L.M.
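Representational similarity analysis compares two systems by correlating their representational dissimilarity matrices (RDMs) over the same set of items. The sketch below is a generic RSA recipe with hypothetical inputs (per-scene gaze-pattern vectors and per-scene model activation vectors), not the study's pipeline.

```python
# Generic RSA sketch: build an RDM for each representation over the same scenes,
# then compare the RDMs' upper triangles with a rank correlation.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.stats import spearmanr

def rdm(item_vectors: np.ndarray) -> np.ndarray:
    """item_vectors: (n_items, n_features). Returns an (n_items, n_items) RDM
    of correlation distances between items."""
    return squareform(pdist(item_vectors, metric="correlation"))

def compare_rdms(rdm_a: np.ndarray, rdm_b: np.ndarray) -> float:
    """Spearman correlation between the upper triangles of two RDMs."""
    iu = np.triu_indices_from(rdm_a, k=1)
    rho, _ = spearmanr(rdm_a[iu], rdm_b[iu])
    return rho
```

In this setting the gaze RDM would be built from per-scene fixation-pattern vectors and the model RDM from per-scene activations at a given network layer; comparing across layers then indicates which level of the processing hierarchy best matches the gaze data.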
Looking for Semantic Similarity: What a Vector Space Model of Semantics Can Tell Us About Attention in Real-world Scenes
Object semantics are theorized to play a central role in where we look in real-world scenes, but are poorly understood because they are hard to quantify. Here we tested the role of object semantics by combining a computational vector space model of semantics with eye tracking in real-world scenes. The results provide evidence that humans use their stored semantic representations of objects to help selectively process complex visual scenes, a theoretically important finding with implications for models in a wide range of areas including cognitive science, linguistics, computer vision, and visual neuroscience.
Hayes T.R., Henderson J.M.
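The core operation in a vector space model of semantics is a similarity measure between embeddings of object labels, typically cosine similarity. The sketch below is illustrative; the embedding source and the way similarities are aggregated over a scene are assumptions, not the paper's specific model.

```python
# Illustrative semantic-similarity computation over labeled scene objects.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def similarity_to_target(target_vec: np.ndarray, object_vectors: dict) -> dict:
    """Semantic similarity of each labeled scene object to a target object."""
    return {label: cosine_similarity(target_vec, vec)
            for label, vec in object_vectors.items()}
```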
Meaning maps capture the density of local semantic features in scenes: A reply to Pedziwiatr, Kümmerer, Wallis, Bethge & Teufel (2021)
Pedziwiatr, Kümmerer, Wallis, Bethge, & Teufel (2021) contend that Meaning Maps do not represent the spatial distribution of semantic features in scenes. We argue that Pedziwiatr et al. provide neither logical nor empirical support for that claim, and we conclude that Meaning Maps do what they were designed to do: represent the spatial distribution of meaning in scenes.
Henderson J.M., Hayes T.R., Peacock C.E., Rehrig G.
Developmental changes in natural scene viewing in infancy
Pomaranski K.I., Hayes T.R., Kwon M.K., Henderson J.M., Oakes L.M.