How do Naturalistic Visuo-Auditory Cues Guide Human Attention? Insights from Systematic Explorations in Visual Perception of Embodied Multimodal Interaction

April 15, 2026
Vipul Nair
Mehul Bhatt
Jakob Suchan
Erik Billing
Paul Hemeren
Abstract
Studies in visual cognition highlight the importance of visual, spatial, and auditory cues in influencing human attention. Such cues are often indicative of actions or events, and thereby serve as predictive indicators in both passive observation and interactive engagement. Our research focuses on visual attention in passive observation, examining in particular how visual, spatial, and auditory cues (henceforth, visuoauditory cues) influence attention in everyday multimodal interaction. We systematically develop a visuoauditory event model for investigating visual attention in naturalistic embodied settings. Rooted in this event model, we explore the influence of five select visuoauditory cues (speaking, gaze, relative motion, hand action, and visibility) on visual attention. Our analysis uses eye-tracking data from 90 participants observing 27 carefully designed naturalistic event scenarios, and correlates their attentional metrics with the select visuoauditory cues against the backdrop of the developed event model. Findings reveal strong associations between attention and both intra-modal (irrespective of other cues) and cross-modal (combined with other cues) cueing effects, highlighting the nuanced interplay among the cues that influence attentional patterns. We develop a systematic and generalized method for analyzing interactions and behavioral parameters, thereby characterizing the impact of visuoauditory cues on attentional dynamics. Our methodology, combined with the obtained insights into attentional cueing effects, provides an analytical framework explicating how everyday (interactive) events directly drive attention under naturalistic conditions. This not only facilitates the precise modeling of behavior and attention allocation but also offers a high-level ‘experimental lens’ for examining interactions in relation to behavioral parameters.
Taken together, our methodological and behavioral findings are well-positioned to benefit multiple fields, particularly by advancing human-centered design across diverse application domains. Lastly, to promote open science and wider dissemination, the complete experimental basis of this research (e.g., event scenarios, high-quality annotated data, and supplementary dataset material) has been independently documented, together with instructions for accessing and using the experimental data.
Type
Publication
ACM Transactions on Applied Perception (ACM TAP 2026)
Authors
Vipul Nair, Cognition & AI Researcher | Ph.D. in Informatics
Mehul Bhatt, Professor | Örebro University
Jakob Suchan, Assistant Professor | Constructor University
Erik Billing, Associate Professor | University of Skövde
Paul Hemeren, Associate Professor | University of Skövde