Virtual Reality Sustained Multimodal Distributional Semantics for Gestures in Dialogue (GeMDiS)

Project Participants

Dr. Andy Lücking
(Principal Investigator)

Goethe University Frankfurt
luecking@em.uni-frankfurt.de

Andy Lücking’s research contributes to a linguistic theory of human communication, that is, face-to-face interaction beyond single sentences. This involves the adaptation of dynamic dialogue semantics, the development of multimodal grammar extensions, occasionally the revision of traditional linguistic theories (e.g., quantification or pointing), the use of corpora and computational methods (as in the ViCom project GeMDiS), and taking an overarching cognitive perspective. Andy Lücking received a PhD in linguistics (Dr. phil.) in 2011 at Bielefeld University on iconicity and iconic gestures. He defended his habilitation in 2022 on “Aspects of multimodal communication” at the Laboratoire de Linguistique Formelle (LLF) at the Université Paris Cité.

Selected publications

On the broader picture of multimodal communication:
Andy Lücking and Jonathan Ginzburg. “Leading voices: Dialogue semantics, cognitive science, and the polyphonic structure of multimodal interaction”. Language and Cognition, Volume 15, Issue 1, January 2023, pp. 148–172. DOI: https://doi.org/10.1017/langcog.2022.30
A dialogue- and gesture-friendly theory of quantification:
Andy Lücking and Jonathan Ginzburg. “Referential transparency as the proper treatment of quantification”. In: Semantics and Pragmatics 15, 4 (2022). doi: 10.3765/sp.15.4. (Early access: https://semprag.org/index.php/sp/article/view/sp.15.4)
A condensed, computational piece on how to assign meanings to (some kinds of) co-verbal gestures and integrate them in grammar:
Andy Lücking. “Modeling Co-verbal Gesture Perception in Type Theory with Records”. In: Proceedings of the 2016 Federated Conference on Computer Science and Information Systems. Ed. by Maria Ganzha, Leszek Maciaszek, and Marcin Paprzycki. Vol. 8. Annals of Computer Science and Information Systems. IEEE, Sep. 2016, pp. 383–392. doi: 10.15439/2016F83. (Available at: https://annals-csis.org/proceedings/2016/drp/83.html)
Pointing gestures as search instructions:
Andy Lücking. “Witness-loaded and Witness-free Demonstratives”. In: Atypical Demonstratives. Syntax, Semantics and Pragmatics. Ed. by Marco Coniglio, Andrew Murphy, Eva Schlachter, and Tonjes Veenstra. Linguistische Arbeiten 568. Berlin and Boston: De Gruyter, 2018, pp. 255–284. ISBN: 978-3-11-056029-9. (Preprint: https://www.researchgate.net/publication/303667514_Witness-loaded_and_Witness-free_Demonstratives)

Project Description

Both corpus-based linguistics and contemporary computational linguistics rely on linguistic resources, often large ones. The expansion of the linguistic subject area to include visual means of communication such as gesticulation has not yet been backed up with corresponding corpora. As a result, “multimodal linguistics” and dialogue theory cannot participate in established distributional methods of corpus linguistics and computational semantics. The main reason is the difficulty of collecting multimodal data in an appropriate way and at an appropriate scale. Using the latest VR-based recording methods, the GeMDiS project aims to close this data gap. It investigates visual communication by means of machine-based methods and an innovative use of neural and active learning for small data. These methods draw on the systematic reference dimensions of associativity and contiguity of the features of visual and non-visual communicative signs. Two features above all characterise GeMDiS:

Ecological validity: the data collection takes place in dialogue situations and thus also covers everyday gestures and, in particular, interactive gestures. In this respect, GeMDiS differs from collections of partly emblematic handshapes or gestural charades.
True multimodality: the VR-based recording technology captures not only hand and arm movements and handshapes but also facial expressions. This proper multimodality is the hallmark of natural language interaction. In this, GeMDiS already anticipates potential further developments of ViCom.

The corpus created in this way is made available to the research community in line with the FAIR principles. The results of GeMDiS feed into social human-machine interaction, contribute to research on gesture families, and provide a basis for exploratory corpus analysis and further annotation. Furthermore, the project investigates to what extent the results can help formal semantics address the input problem of meaning representation (in short: in order to compute a multimodal meaning compositionally, it is first of all necessary to associate the linguistic and the non-vocal parts of an utterance with meanings, something that so far only happens intuitively). In the last phase of the project, a VR avatar will be developed into a playback medium for the previously recorded multimodal behaviour. This serves as a visual evaluation of the methodology. The avatar can also be used as an experimental platform, e.g. in cooperation with other projects.

Project Activities

Publications

Lücking, A., Ginzburg, J. (2025): Exceptions From Rules and Noteworthy Exceptions. The Balance Scale for Making Exceptions. In: Linguistics and Philosophy. Forthcoming.

Lücking, A. (2025): Deixis. In: Wörterbücher zur Sprach- und Kommunikationswissenschaft (WSK) – Semantik und Pragmatik. Ed. by Daniel Gutzmann, Katharina Turgay, and Thomas Ede Zimmermann. De Gruyter, 2025. DOI: 10.1515/wsk. Forthcoming.

Lücking, A., Voll, F., Rott, D., Henlein, A., Mehler, A. (2025): Head and Hand Movements During Turn Transitions: Data-Based Multimodal Analysis Using the Frankfurt VR Gesture–Speech Alignment Corpus (FraGA). In: Proceedings of the 29th Workshop on The Semantics and Pragmatics of Dialogue (SemDial 2025). https://semdial2025.github.io/

Henlein, A., Bauer, A., Bhattacharjee, R., Ćwiek, A., Gregori, A., Kügler, F. et al. (2024): An Outlook for AI Innovation in Multimodal Communication Research. In: Vincent G. Duffy (ed.): Digital Human Modeling and Applications in Health, Safety, Ergonomics and Risk Management. Cham: Springer (HCII 2024, Lecture Notes in Computer Science), pp. 182–234. DOI: 10.1007/978-3-031-61066-0_13.

Lücking, A., Mehler, A., Henlein, A. (2024): The Linguistic Interpretation of Non-emblematic Gestures Must Be Agreed in Dialogue: Combining Perceptual Classifiers and Grounding/Clarification Mechanisms. In: Proceedings of the 28th Workshop on The Semantics and Pragmatics of Dialogue (SemDial ’24 – TrentoLogue). Available online at https://www.semdial.org/anthology/papers/Z/Z24/Z24-4031/.

Henlein, A., Lücking, A., Mehler, A. (2024): Virtually Restricting Modalities in Interactions: Va.Si.Li-Lab for Experimental Multimodal Research. In: Proceedings of the 2nd International Symposium on Multimodal Communication (MMSYM 2024), Frankfurt, 25–27 September 2024, pp. 96–97.

Lücking, A., Mehler, A., Henlein, A. (2024): The Gesture–Prosody Link in Multimodal Grammar. In: Proceedings of the 2nd International Symposium on Multimodal Communication (MMSYM 2024), Frankfurt, 25–27 September 2024, pp. 128–129.

Abrami, G., Mehler, A., Bagci, M., Schrottenbacher, P., Henlein, A., Spiekermann, C. et al. (2023): Va.Si.Li-Lab as a Collaborative Multi-User Annotation Tool in Virtual Reality and Its Potential Fields of Application. In: Proceedings of the 34th ACM Hypertext Conference (HT 23). DOI: 10.1145/3603163.3609076.

Ginzburg, J., Lücking, A. (2023): Referential Transparency and Inquisitivity. In: Proceedings of the 4th Workshop on Inquisitiveness Below and Beyond the Sentence Boundary (InqBnB4’23), pp. 11–20. https://aclanthology.org/2023.inqbnb-1.2/.

Henlein, A., Gopinath, A., Krishnaswamy, N., Mehler, A., Pustejovsky, J. (2023): Grounding Human–Object Interaction to Affordance Behavior in Multimodal Datasets. In: Frontiers in Artificial Intelligence 6. DOI: 10.3389/frai.2023.1084740.

Henlein, A., Kett, A., Baumartz, D., Abrami, G., Mehler, A., Bastian, J. et al. (2023): Semantic Scene Builder: Towards a Context-Sensitive Text-to-3D Scene Framework. In: Semantic, Artificial and Computational Interaction Studies: Towards a Behavioromics of Multimodal Communication. Held as Part of the 25th HCI International Conference (HCII 2023), Copenhagen, July 23–28, 2023. Springer.

Henlein, A., Lücking, A., Bagci, M., Mehler, A. (2023): Towards Grounding Multimodal Semantics in Interaction Data with Va.Si.Li-Lab. In: Proceedings of the 8th Conference on Gesture and Speech in Interaction (GESPIN).

Lücking, A. (2023): Towards Referential Transparent Annotations of Quantified Noun Phrases. In: Proceedings of the 2023 Joint ACL-ISO Workshop on Interoperable Semantic Annotation (ISA-19), pp. 47–55. https://aclanthology.org/2023.isa-1.7/.

Mehler, A., Bagci, M., Henlein, A., Abrami, G., Spiekermann, C., Schrottenbacher, P. et al. (2023): A Multimodal Data Model for Simulation-Based Learning with Va.Si.Li-Lab. In: Vincent G. Duffy (ed.): Digital Human Modeling and Applications in Health, Safety, Ergonomics and Risk Management. Springer Nature Switzerland, pp. 539–565. DOI: 10.1007/978-3-031-35741-1_39.

Gregori, A., Amici, F., Brilmayer, I., Ćwiek, A., Fritzsche, L., Fuchs, S., Henlein, A., Herbort, O., Kügler, F., Lemanski, J., Liebal, K., Lücking, A., Mehler, A., Nguyen, K. T., Pouw, W., Prieto, P., Rohrer, P. L., Sánchez-Ramón, P. G., Schulte-Rüther, M., Schumacher, P. B., Schweinberger, S. R., Struckmeier, V., Trettenbrein, P. C. & von Eiff, C. I. (2023): A Roadmap for Technological Innovation in Multimodal Communication Research. In: Digital Human Modeling and Applications in Health, Safety, Ergonomics and Risk Management, pp. 402–438. DOI: 10.1007/978-3-031-35748-0_30.

Lücking, A., Ginzburg, J. (2022): Leading Voices: Dialogue Semantics, Cognitive Science, and the Polyphonic Structure of Multimodal Interaction. In: Language and Cognition. DOI: 10.1017/langcog.2022.30.

Invited talks

Lücking, A. (Mar. 14, 2024): Gesture semantics: Deictic Reference, deferred reference, and iconic co-speech gestures. Invited talk at Stevan Harnad’s interdisciplinary seminar series in Cognitive Informatics at the Université du Québec à Montréal.

Lücking, A. (May 3, 2022): Pointing: From reference to attention and back. Invited talk at the Language Colloquium, Ruhr-Universität Bochum.

Summer school courses

Lücking, A., Henlein, A. (July 28–August 8, 2025): Spatial Gesture Semantics. ESSLLI 2025 Advanced Course, Ruhr University Bochum.

Short-term Collaborations

Examining mouthings with Virtual Reality (VR) glasses (Meta Quest Pro) (2024) – Alexander Henlein (GeMDiS), Andy Lücking (GeMDiS), Alexander Mehler (GeMDiS), Anastasia Bauer (GeSi)