Introduction to the Kinesics Recognition Framework
Humans use a range of channels to convey their thoughts, including verbal messages, facial expressions, and body language. Among these channels, body language plays a critical role because it conveys unspoken cues about an individual's mental and emotional states through bodily movements. To study the meanings and interpretations expressed through bodily movements, psychologists Ekman and Friesen developed the taxonomy of kinesics, which classifies bodily movements into five categories based on their communicative functions: emblems, illustrators, affect displays, adaptors, and regulators. This principled taxonomy defines a clear linkage between human activities and their respective meanings and communicative categories. In conjunction with human activity recognition (HAR)---which uses sensor data such as RGB video, depth maps, or 3D skeletal keypoints to identify actions---the opportunity arises to automatically recognize the kinesics of human movements. One approach is to compile a dictionary-based mapping from human activities to their communicative categories; the dictionary is then appended to an HAR algorithm to determine the kinesic function of a given activity after it is recognized. However, the sheer variety of human actions makes it infeasible to manually define mappings for every possible movement. To truly decode human reasoning through bodily movements, we must move beyond dictionary-based mapping toward methods capable of learning a generalized translation between physical actions and their cognitive and affective significance.
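The dictionary-based approach described above can be sketched in a few lines. This is a minimal illustration, not the repository's code: the activity labels and their category assignments are hypothetical examples, and the sketch also shows the approach's core limitation (any activity outside the dictionary cannot be categorized).

```python
# Hypothetical dictionary mapping recognized activities to Ekman and
# Friesen's five kinesic categories. The activity names are illustrative
# and not taken from any particular HAR label set.
KINESIC_DICTIONARY = {
    "thumbs_up": "emblem",
    "pointing": "illustrator",
    "covering_face": "affect display",
    "scratching_head": "adaptor",
    "nodding": "regulator",
}

def kinesic_category(activity: str) -> str:
    # Any activity missing from the dictionary exposes the limitation
    # of a fixed mapping: it cannot generalize to unseen movements.
    return KINESIC_DICTIONARY.get(activity, "unknown")

print(kinesic_category("thumbs_up"))  # emblem
print(kinesic_category("waving"))     # unknown
```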
In this repository, we present a framework that classifies the kinesic categories of human movements. Specifically, the framework leverages a structured pattern embedded in skeletal keypoint data that clusters human activities with the same communicative purpose together. The framework extracts this structure with a transfer learning model that combines a Spatial-Temporal Graph Convolutional Network (ST-GCN) with a convolutional neural network (CNN). To demonstrate its efficacy, the model is evaluated on a HAR dataset derived from the taxonomy of kinesics: the Dyadic User EngagemenT dataset (DUET).
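An ST-GCN operates on a spatial graph in which skeletal joints are nodes and bones are edges. The sketch below builds such an adjacency matrix with self-loops; the joint indices and bone list are hypothetical and DUET's actual skeleton layout may differ.

```python
# Hypothetical 8-joint skeleton: bones listed as (parent, child) joint pairs.
BONES = [(0, 1), (1, 2), (2, 3),   # e.g., a spine/head chain
         (1, 4), (4, 5),           # e.g., one arm
         (1, 6), (6, 7)]           # e.g., the other arm
NUM_JOINTS = 8

def adjacency(bones, n):
    """Symmetric adjacency matrix with self-loops, the kind of graph
    structure a spatial graph convolution aggregates over."""
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 1          # self-loop so each joint keeps its own feature
    for i, j in bones:
        A[i][j] = A[j][i] = 1  # bones are undirected edges
    return A

A = adjacency(BONES, NUM_JOINTS)
```

In an actual ST-GCN, spatial convolutions aggregate joint features over this matrix at each frame, while temporal convolutions operate along the frame axis.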
Kinesics Recognition Framework Pipeline
The code in this repository runs the framework on 30 subsets of DUET, each containing different numbers and types of activities. To run the framework, follow the steps below:
- Duplicate the folder structure of the repository.
- Download the 3D joints from the DUET repository and store all the folders (e.g., `CC0101` and `CL0102`) in the `data` directory.
- Install all the required packages in `requirement.txt`.
- Run `python disposition.py`.
- After the code completes, the results are stored in `experiment_results.pkl`, which includes the experiment number, the number of interactions, the type of interaction, and the accuracy of the kinesics recognition for the given interactions. To obtain the ST-GCN accuracy of an experiment, open the logging file of the corresponding run (i.e., `work_dirs/experiment_N/execution_timestamp/execution_timestamp.txt`). For instance, to find the ST-GCN accuracy of experiment 3 executed at 11:07:50 on May 1, 2025, open `work_dirs/experiment_3/20250501_110750/20250501_110750.txt`.
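A short script can inspect the pickled results. This is a hedged sketch: the exact structure of `experiment_results.pkl` is an assumption, treated here as a sequence of per-experiment records with the fields described above.

```python
import pickle

def load_results(path="experiment_results.pkl"):
    """Load the pickled experiment results produced by disposition.py."""
    with open(path, "rb") as f:
        return pickle.load(f)

# Hypothetical usage, assuming each record is a dict with these keys
# (the key names are illustrative, not confirmed by the repository):
# for entry in load_results():
#     print(entry["experiment_number"],
#           entry["num_interactions"],
#           entry["interaction_type"],
#           entry["accuracy"])
```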