hsi-workshop/referential-gesture-challenge
Data, benchmarks, and demos for the Workshop on Human–Scene Interaction.
Given speech, a 3D target coordinate, and a scene, generate SMPL-X pointing gestures that are temporally aligned, spatially accurate, and referentially grounded. Submissions are scored on three axes: temporal alignment, spatial accuracy, and referent recall.
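To make the "spatial accuracy" axis more concrete, here is a minimal sketch of one plausible way to measure how well a pointing gesture hits its target: the angle between a pointing ray and the direction to the 3D target. This is an illustration only, not the challenge's official metric. The function name `pointing_angle_error_deg` and the choice of ray (from a hypothetical `origin` joint such as the wrist to a hypothetical `tip` joint such as the index fingertip) are assumptions; consult the challenge paper for the actual scoring definitions.

```python
import math


def pointing_angle_error_deg(origin, tip, target):
    """Angle in degrees between the ray origin->tip and the direction origin->target.

    All arguments are (x, y, z) positions in the same scene coordinate frame.
    Hypothetical helper for illustration; not the official challenge metric.
    """
    ray = [t - o for o, t in zip(origin, tip)]
    to_target = [t - o for o, t in zip(origin, target)]

    ray_norm = math.sqrt(sum(c * c for c in ray))
    target_norm = math.sqrt(sum(c * c for c in to_target))
    if ray_norm == 0.0 or target_norm == 0.0:
        raise ValueError("degenerate pointing ray or target direction")

    cos_angle = sum(a * b for a, b in zip(ray, to_target)) / (ray_norm * target_norm)
    # Clamp to guard against floating-point drift outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.degrees(math.acos(cos_angle))
```

A lower angle means the gesture points more directly at the target; an error of 0 degrees means the ray passes exactly through the target coordinate.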
📄 Challenge Paper · 📦 Data & Baseline · 🎯 Interactive Demo
| Resource | Description |
|---|---|
| MM-Conv | ~2K pointing clips from naturalistic VR dialogue with 3D scene graphs |
| SGS-HSI | 1,138 synthetic single-target pointing clips |
| OmniControl-PT baseline | Reference baseline (code & weights coming soon) |

| Milestone | Date |
|---|---|
| Challenge opens | May 5, 2026 |
| Submission deadline | July 7, 2026 |
| Results announced | July 31, 2026 |
| Workshop | October 2026 |
Jonas Beskow (KTH) · Rishabh Dabral (MPI) · Anna Deichler (KTH) · Fethiye Irmak Doğan (Cambridge) · Anindita Ghosh (MPI)