Attention on Films

Overview

How do visuospatial features guide collective attention in films?
📅 Jan 2020 – Dec 2021
🏛️ University of Skövde, Sweden; Örebro University, Sweden; German Aerospace Center (DLR)
👥 Collaborators: Mehul Bhatt, Jakob Suchan, Paul Hemeren

This research investigated how visuospatial attributes in narrative films direct viewers’ attention, using eye-tracking (32 participants/scene) and a novel visuospatial event model for semantic annotation. We analyzed 10 film scenes from the Moving Images dataset to correlate attentional measures with multimodal scene features.


Aim

To establish a method for semantically grounding events and examine how visuospatial cues (gaze, motion, spatial relations) influence:

  • Attentional measures, including synchrony (shared gaze patterns)
  • Predictive (anticipatory) viewing behaviors

Methodology

  • Dataset: 10 scenes from acclaimed films (Solaris, Goodfellas, The Grand Budapest Hotel, etc.)
  • Participants: 32 viewers per scene, tracked at 60 Hz with a Tobii X2-60 eye tracker
  • Annotation Framework: Expert annotations using ELAN tool with controlled vocabulary for:
    • Scene Elements: Objects, regions, body parts
    • Scene Structure: Visibility, motion, spatial relations, gaze, actions
    • Visual Attention: Fixations/saccades (low-level) + object-level attention
  • Analysis Metrics:
    • Attentional synchrony: % of viewers fixating the same region/body part (see the sketch after this list)
    • Feature distribution in high/low synchrony segments
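
As a concrete illustration of the synchrony metric, here is a minimal sketch of how per-time-bin attentional synchrony could be computed from region-labelled fixations. The data layout (one region label per viewer per time bin) and the function name are illustrative assumptions, not the project's actual pipeline.

```python
from collections import Counter

def attentional_synchrony(region_per_viewer):
    """Fraction of all viewers fixating the modal (most common) region.

    region_per_viewer: one region/body-part label per viewer for a single
    time bin; None means no fixation was recorded for that viewer.
    """
    labels = [r for r in region_per_viewer if r is not None]
    if not labels:
        return 0.0
    _, modal_count = Counter(labels).most_common(1)[0]
    return modal_count / len(region_per_viewer)

# Hypothetical 500 ms bin with 32 viewers, most fixating one character's face.
bin_labels = ["face_A"] * 24 + ["hand_A"] * 5 + [None] * 3
print(attentional_synchrony(bin_labels))  # 24/32 = 0.75
```

Thresholding this value per segment (above/below a chosen cutoff) is one plausible way to obtain the high- and low-synchrony segments compared in the results.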

Visuospatial Event Model

A formal taxonomy for semantic interpretation of scenes. See example below:

  • Scene Elements: Objects, regions, body parts
  • Scene Structure: visible(X), moving_towards(X,Y), gazing_at(X,Y)
  • Visual Attention: attention_on(face(X))
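
To make the taxonomy concrete, here is a minimal sketch of how such predicate-style annotations could be represented and queried in code; the class, field, and argument names are illustrative assumptions rather than the project's actual annotation schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Fact:
    """One predicate-style annotation, e.g. gazing_at(person_1, person_2),
    holding over a time interval of the scene (in seconds)."""
    predicate: str   # e.g. "visible", "moving_towards", "gazing_at"
    args: tuple      # the scene elements the predicate relates
    start: float
    end: float

# A toy scene fragment covering the three taxonomy categories.
scene = [
    Fact("visible", ("person_1",), 0.0, 4.2),
    Fact("gazing_at", ("person_1", "person_2"), 1.3, 2.0),
    Fact("moving_towards", ("person_2", "door"), 2.0, 3.5),
    Fact("attention_on", ("face(person_1)",), 1.0, 2.5),
]

def holds_at(facts, predicate, t):
    """Return all facts with the given predicate that hold at time t."""
    return [f for f in facts if f.predicate == predicate and f.start <= t <= f.end]

print(holds_at(scene, "gazing_at", 1.5))
```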

Results

  1. High synchrony correlates with:

    • Isolated characters in frame
    • Specific behavioral cues (e.g., sudden gaze shifts)
    • Visibility changes (e.g., occlusions)
  2. Low synchrony segments contained 32% more event data (scene structure annotations), suggesting richer interpretative possibilities when attention diverges (see the sketch after this list)

  3. Key attention drivers:

    • Head movements (predictive of upcoming actions)
    • Gaze transitions between agents
    • Hand actions with referential significance
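
As an illustration of the density comparison behind result 2, the sketch below contrasts scene-structure annotation rates in high- vs. low-synchrony segments. The segment counts and durations are hypothetical placeholders, not the study's measurements.

```python
def annotation_density(segments):
    """Scene-structure annotations per second across a set of segments.

    segments: list of (annotation_count, duration_seconds) tuples.
    """
    return sum(n for n, _ in segments) / sum(d for _, d in segments)

# Hypothetical (annotation_count, duration_seconds) per segment.
high_sync = [(10, 10.0), (10, 10.0)]   # 1.00 annotations/s
low_sync = [(13, 10.0), (20, 15.0)]    # 1.32 annotations/s

hi, lo = annotation_density(high_sync), annotation_density(low_sync)
print(f"Low-synchrony segments carry {lo / hi - 1:.0%} more annotations per second.")
```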

Applications

  • AI Systems: Grounding for human-activity recognition
  • Directing Practices: Quantifying attention-guiding techniques
  • Extended Reality: Predictive gaze models for VR narrative
  • Cognitive Modeling: Benchmark for human-like event understanding
  • Attention Prediction: Models of viewer attention driven by visuospatial features

Project Outcomes

📚 See Publication


Collaboration Opportunities

Open to collaboration or discussion on methodology, data, or future directions. Happy to exchange ideas and explore new perspectives.