Multimodal Assessment of Dyadic Interaction in Disorders of Social Interaction

Project Participants

Project Description

Autism Spectrum Disorder (ASD) is a prototypic disorder for impairments of the multimodal aspects of visual and verbal communication. Observational instruments such as the Autism Diagnostic Observation Schedule (ADOS-2) assess behavioral symptoms via a structured social encounter in which an experienced clinician performs several social interactive tasks with the individual. Diagnostic decisions based on this instrument typically rely on qualitative clinical ratings of the clinician's impression of visual communicative behavior (such as eye gaze, facial expressions, and gestures) and its integration with verbal communication; quantitative indices, however, are lacking.
In human interaction, verbal and visual channels of communication are embedded in a social reference frame that combines multimodal aspects. For example, joint attention emerges from using deictic hand gestures (e.g., pointing) in coordination with facial expression and eye gaze to navigate a shared attentional space of other people and objects. Eye gaze and pointing can clarify which object or person an utterance refers to, while gestures, head movement, and facial expression may visualize spatial and social relationships when talking about objects or persons that are not currently visible. Systematic research that explicitly tackles the interplay and temporal dynamics of such multimodal visual communicative behavior is scarce to date and would benefit from a fine-grained computerized assessment of dyadic interaction. Motion capture, mobile eye-tracking, and automatic facial expression analysis are established techniques in this respect and have shown potential for the diagnostic assessment of disorders such as ASD. What is missing in previous research, however, is the combination of these techniques within a single standardized assessment, resulting in a rich, annotated dataset.
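For illustration only, the sketch below shows one generic way such modalities, recorded at different sampling rates, could be aligned onto a common timeline via nearest-timestamp matching. The stream names, sampling rates, and tolerance are hypothetical placeholders, not specifications of this project's recording setup.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical recordings: each modality is sampled at its own rate.
eye = pd.DataFrame({                      # mobile eye-tracking, ~60 Hz
    "t": np.arange(0, 10, 1 / 60),
    "gaze_on_face": rng.integers(0, 2, 600),
})
mocap = pd.DataFrame({                    # body pose motion capture, ~120 Hz
    "t": np.arange(0, 10, 1 / 120),
    "right_hand_speed": rng.random(1200),
})
face = pd.DataFrame({                     # facial expression analysis, ~30 Hz
    "t": np.arange(0, 10, 1 / 30),
    "smile_intensity": rng.random(300),
})

# Align every stream to the eye-tracking timestamps by nearest-neighbour
# matching within a small tolerance (here 50 ms); unmatched samples stay NaN.
aligned = eye.sort_values("t")
for stream in (mocap, face):
    aligned = pd.merge_asof(aligned, stream.sort_values("t"),
                            on="t", direction="nearest", tolerance=0.05)

print(aligned.head())
```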

In this proposal, we aim to provide a multimodal assessment of typical and atypical social behavior, focusing on the integration of multiple visual and verbal communication channels and their relation to disorders of social interaction in children. We will annotate specific behavioral events during ongoing reciprocal interaction between a child and an investigator, in particular events related to joint attention and reciprocity. Subsequently, we will apply machine learning (ML) methods to time series of automatically extracted features (e.g., saccades towards faces, facial expressions, body pose from motion capture) to train models for the automatic identification of these events. Furthermore, we seek to use ML to support the clinical characterization of non-verbal behavior, in particular in relation to ASD. At the same time, we will generate a rich dataset, i.e., a corpus of multimodal communication during social interaction, which will allow for numerous further analyses within the Priority Program ViCom and for the wider research community.
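As a minimal sketch of this ML step, assuming windowed summary features per modality and binary event labels derived from the annotation, a supervised classifier could be trained to detect annotated events from the aligned time series. All data and parameters below are placeholders and do not represent the project's actual features, labels, or model choice.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)

# Placeholder data: one row per 1-second window of the aligned recording,
# columns = summary features per modality (gaze, facial expression, pose).
n_windows, n_features = 500, 12
X = rng.random((n_windows, n_features))
# Placeholder labels: 1 = an annotated joint-attention event occurs in the window.
y = rng.integers(0, 2, n_windows)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)

# Report per-class precision/recall for event detection on held-out windows.
print(classification_report(y_test, clf.predict(X_test)))
```

In practice, a model operating on such windowed features would also need to respect the temporal structure of the recordings (e.g., splitting by session rather than by window); the sketch only illustrates the general supervised-learning setup.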