DREAM Data Pipeline

Overview

Data Processing Support for DREAM (EU Horizon 2020 Project)
📅 Oct 2017 – Mar 2019
🏛️ University of Skövde, Sweden
👥 Collaborator: Erik Billing

As a project assistant, I supported data infrastructure and automation for DREAM (Development of Robot-Enhanced therapy for children with Autism Spectrum Disorders), an EU-funded Horizon 2020 project. DREAM explored robot-assisted therapy with the Nao robot in clinical and research settings.


Project Role

I developed and maintained data processing pipelines to handle the multimodal content generated during experimental sessions involving autistic children. Each session included:

  • Multi-angle video recordings (5 cameras)
  • Audio capture
  • Interaction event logs
  • Behavioral annotations (ELAN)
  • Structured metadata (CSV, JSON)
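
To make the shape of one session's data concrete, the listing above can be pictured as a small manifest record. The sketch below is illustrative only: the class name, field names, and file names are hypothetical and are not taken from the DREAM tooling. The only detail drawn from the project is the five-camera setup.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Dict, List


@dataclass
class SessionManifest:
    """Hypothetical record listing the files that make up one session."""
    session_id: str
    video_files: List[str] = field(default_factory=list)  # one file per camera
    audio_file: str = ""
    event_log: str = ""
    annotation_file: str = ""  # ELAN annotation file, if one exists yet
    metadata: Dict[str, str] = field(default_factory=dict)

    def validate(self, expected_cameras: int = 5) -> List[str]:
        """Return a list of human-readable problems (empty if none found)."""
        problems = []
        if len(self.video_files) != expected_cameras:
            problems.append(
                f"expected {expected_cameras} videos, found {len(self.video_files)}"
            )
        if not self.audio_file:
            problems.append("missing audio file")
        if not self.event_log:
            problems.append("missing event log")
        return problems

    def to_json(self) -> str:
        """Serialize the manifest to a stable, indented JSON string."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)
```

Checking each manifest with validate() before any media processing is one way to catch incomplete sessions early.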

Key Contributions

  1. Unified Data Consolidation Scripts to:
    - Synchronize multiple camera feeds and audio recordings
    - Merge these into unified, session-specific media bundles
    - Preserve and standardize timestamps across modalities

  2. Automated Annotation Integration Tools to:
    - Align raw behavioral logs with media content
    - Auto-generate ELAN annotations for key events (e.g., therapist instructions, robot prompts)
    - Reduce manual annotation effort while preserving event granularity

  3. Structured Output Generation for:
    - Standardized CSV/JSON representations of each session
    - Machine-readable summaries of behavioral and task-related events
    - Clean datasets suitable for downstream statistical and ML-based analyses
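
The annotation step in item 2 can be sketched as follows: logged events are shifted onto the media timeline and written as time-aligned annotations in a simplified ELAN-style (EAF) XML tree. This is a minimal sketch, not the project's actual tool. The event dictionary layout, the tier names, and the media_start_s parameter are assumptions. A file that ELAN accepts needs further header and schema attributes, so the output should be checked against the EAF schema.

```python
import xml.etree.ElementTree as ET


def events_to_eaf(events, media_start_s=0.0):
    """Build a simplified EAF-style XML tree from logged events.

    events: iterable of dicts with 'tier', 'start_s', 'end_s', 'label';
            times are seconds on the log's own clock (assumed layout).
    media_start_s: log-clock time at which the media recording begins,
            used to shift events onto the media timeline.
    """
    root = ET.Element("ANNOTATION_DOCUMENT")
    ET.SubElement(root, "HEADER", TIME_UNITS="milliseconds")
    time_order = ET.SubElement(root, "TIME_ORDER")
    tiers = {}
    slot_count = 0
    ann_count = 0

    for ev in sorted(events, key=lambda e: e["start_s"]):
        start_ms = round((ev["start_s"] - media_start_s) * 1000)
        end_ms = round((ev["end_s"] - media_start_s) * 1000)
        if start_ms < 0 or end_ms < start_ms:
            continue  # skip events before the recording or with bad times

        # Each annotation gets its own pair of time slots.
        slot_ids = []
        for value in (start_ms, end_ms):
            slot_count += 1
            slot_id = f"ts{slot_count}"
            ET.SubElement(time_order, "TIME_SLOT",
                          TIME_SLOT_ID=slot_id, TIME_VALUE=str(value))
            slot_ids.append(slot_id)

        # Create the tier the first time one of its events is seen.
        tier = tiers.get(ev["tier"])
        if tier is None:
            tier = ET.SubElement(root, "TIER", TIER_ID=ev["tier"],
                                 LINGUISTIC_TYPE_REF="default")
            tiers[ev["tier"]] = tier

        ann_count += 1
        ann = ET.SubElement(tier, "ANNOTATION")
        aligned = ET.SubElement(ann, "ALIGNABLE_ANNOTATION",
                                ANNOTATION_ID=f"a{ann_count}",
                                TIME_SLOT_REF1=slot_ids[0],
                                TIME_SLOT_REF2=slot_ids[1])
        ET.SubElement(aligned, "ANNOTATION_VALUE").text = ev["label"]

    ET.SubElement(root, "LINGUISTIC_TYPE", LINGUISTIC_TYPE_ID="default",
                  TIME_ALIGNABLE="true", GRAPHIC_REFERENCES="false")
    return root
```

The tree can be written to disk with ET.ElementTree(root).write("session.eaf", encoding="UTF-8", xml_declaration=True), where the file name is a placeholder. Using one tier per event source (for example therapist and robot) keeps each stream separately editable in ELAN.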


Tools & Technologies

  • Python (data handling, scripting, automation)
  • ELAN (annotation, XML handling)
  • FFmpeg (media transformation and merging)
  • Pandas / NumPy (data manipulation)
  • JSON / CSV (structured output formats)
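
As an illustration of how Python and FFmpeg can be combined for the merging step, the sketch below builds an ffmpeg command that trims a known start offset from one camera file and muxes it with the session audio. The function name, file names, and the assumption that each camera's offset is already known are hypothetical. Only standard ffmpeg options are used (-ss, -i, -map, -c:v, -c:a, -shortest, -y).

```python
import shlex


def build_ffmpeg_sync_cmd(video_path, audio_path, video_offset_s, output_path):
    """Return an ffmpeg argument list that skips `video_offset_s` seconds
    at the start of the video and muxes the rest with the session audio."""
    if video_offset_s < 0:
        raise ValueError("video_offset_s must be non-negative")
    return [
        "ffmpeg", "-y",                  # overwrite the output if it exists
        "-ss", f"{video_offset_s:.3f}",  # seek in the next input (the video)
        "-i", video_path,
        "-i", audio_path,
        "-map", "0:v:0",                 # video stream from the first input
        "-map", "1:a:0",                 # audio stream from the second input
        "-c:v", "copy",                  # no video re-encode
        "-c:a", "aac",
        "-shortest",                     # stop at the end of the shorter stream
        output_path,
    ]


if __name__ == "__main__":
    cmd = build_ffmpeg_sync_cmd("cam1.mp4", "session.wav", 1.25, "cam1_synced.mp4")
    print(shlex.join(cmd))  # inspect the command before running it
```

The list can be passed to subprocess.run(cmd, check=True). Because the video is stream-copied, the seek lands on a nearby keyframe rather than the exact offset. Re-encoding the video instead gives frame-accurate trimming but takes longer to process.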

Impact

  • Streamlined processing of high-volume multimodal data
  • Ensured reproducibility and consistency across research sites
  • Enabled publication of a structured dataset, contributing to a peer-reviewed dataset paper
  • Provided the processed data behind several later publications in behavioral and cognitive robotics

Project Outcomes

  • 1 Dataset publication (co-authored)
  • Multiple research publications using the processed data
  • Reusable processing tools and templates for similar multimodal research

Discussion Invite

I am open to discussing the technical details and methods behind this work, and happy to exchange ideas and explore new perspectives.