arxiv:2506.14677

Human-Centered Editable Speech-to-Sign-Language Generation via Streaming Conformer-Transformer and Resampling Hook

Published on Jun 17, 2025

Authors:

Abstract

A real-time speech-to-sign animation system integrates a streaming Conformer encoder with a Transformer-MDN decoder, JSON-based editing, and human-in-the-loop optimization to improve naturalness and user control.

AI-generated summary

Existing end-to-end sign-language animation systems suffer from low naturalness, limited facial/body expressivity, and no user control. We propose a human-centered, real-time speech-to-sign animation framework that integrates (1) a streaming Conformer encoder with an autoregressive Transformer-MDN decoder for synchronized upper-body and facial motion generation, (2) a transparent, editable JSON intermediate representation empowering deaf users and experts to inspect and modify each sign segment, and (3) a human-in-the-loop optimization loop that refines the model based on user edits and ratings. Deployed on Unity3D, our system achieves a 13 ms average frame-inference time and a 103 ms end-to-end latency on an RTX 4070. Our key contributions include the design of a JSON-centric editing mechanism for fine-grained sign-level personalization and the first application of an MDN-based feedback loop for continuous model adaptation. This combination establishes a generalizable, explainable AI paradigm for user-adaptive, low-latency multimodal systems. In studies with 20 deaf signers and 5 professional interpreters, we observe a +13 point SUS improvement, 6.7 point reduction in cognitive load, and significant gains in naturalness and trust (p < .001) over baselines. This work establishes a scalable, explainable AI paradigm for accessible sign-language technologies.

View arXiv page View PDF Add to collection

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote

Get this paper in your agent:

hf papers read 2506.14677

Don't have the latest CLI?

curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2506.14677 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2506.14677 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2506.14677 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.