Studying head nods with the Computer Vision tool OpenPose

Researchers involved: Anastasia Bauer (GeSi, University of Cologne), Anna Kuder (GeSi, University of Cologne), and Marc Schulder (external partner, University of Hamburg)

In this collaboration we analyze head nods in German Sign Language (DGS), using the Public DGS Corpus (Konrad et al., 2020). We find that head nods co-occur with various lexical and non-lexical manual forms, but that they are more frequently produced on their own or in place of manual signs. The aim of the study is two-fold. First, we describe the basic phonetic properties of head nods in DGS and their interaction with manual signs. Second, we investigate whether the phonetic properties of head nods fulfilling various functions differ significantly. We hypothesize that head nods signaling affirmation have a larger amplitude than head nods functioning as feedback in interaction. We define feedback very broadly as interactional moves that display some kind of stance towards the information represented by another signer; in our data, we thus consider feedback signals in different functions (e.g., as continuers, acknowledgement tokens, change-of-state tokens, assessments and repairs). Head nods signaling feedback are expected to have a smaller amplitude but a longer movement duration, and to be produced without co-occurring manual signs.
To test this prediction, we use the body pose information provided by the Public DGS Corpus. Based on this information, which was automatically generated with the Computer Vision (CV) tool OpenPose (Cao et al., 2019), we calculate head nod measurements from the video recordings and investigate head nods in DGS in terms of the number of peaks, the amplitude, the frequency and the duration of these head movements. We also inspect the spectral qualities of head nods using short-time Fourier transforms. For this study, we investigated 94 min of naturalistic dyadic interaction from 8 native DGS signers (one text from each of the four inspected dyads, lasting 18 min, 16 min, 9 min and 4 min respectively). Prior to the analysis, we identified all head nods in the data and annotated them manually in ELAN. Subsequently, we used the pose information to compute statistics about head nodding and to quantitatively analyze the phonetic properties of these movements.
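As a rough illustration of this kind of analysis, the sketch below shows how such nod measurements (number of peaks, amplitude, duration, and a dominant frequency via a short-time Fourier transform) could be computed from a vertical head-position track such as the OpenPose nose keypoint. The function name, the prominence threshold and the assumed frame rate are illustrative choices, not the parameters used in the study.

```python
import numpy as np
from scipy.signal import find_peaks, stft

def nod_metrics(nose_y, fps=50.0, prominence=0.01):
    """Illustrative head-nod measurements from one annotated nod.

    nose_y : 1-D array of vertical positions of a head keypoint
             (e.g., the OpenPose nose joint), one value per video frame.
    fps    : video frame rate (assumed here; adjust to the corpus value).
    prominence : minimal peak prominence; an illustrative threshold that
             depends on how the coordinates are normalized.
    """
    nose_y = np.asarray(nose_y, dtype=float)

    # Number of peaks: in image coordinates y grows downward, so the
    # downward phases of a nod show up as local maxima of nose_y.
    peaks, _ = find_peaks(nose_y, prominence=prominence)

    # Amplitude: total vertical excursion of the movement.
    amplitude = nose_y.max() - nose_y.min()

    # Duration in seconds.
    duration = len(nose_y) / fps

    # Spectral view: short-time Fourier transform of the mean-centred
    # track, from which a dominant nodding frequency can be read off.
    freqs, _, Zxx = stft(nose_y - nose_y.mean(), fs=fps,
                         nperseg=min(64, len(nose_y)))
    power = np.abs(Zxx).mean(axis=1)
    dominant_freq = freqs[power.argmax()]

    return {
        "n_peaks": len(peaks),
        "amplitude": amplitude,
        "duration_s": duration,
        "dominant_freq_hz": float(dominant_freq),
    }
```

In practice, the OpenPose output would also need some cleaning (for example, interpolating frames with missing or low-confidence keypoints) before measures of this kind are computed.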