arxiv:2603.22267

TiCo: Time-Controllable Spoken Dialogue Model

Published on May 13

Abstract

AI-generated summary: TiCo is a time-controllable spoken dialogue model that uses Spoken Time Markers and reinforcement learning to generate responses with specified durations while maintaining quality.

We introduce TiCo, a time-controllable spoken dialogue model (SDM) that follows time-constrained instructions (e.g., "Please generate a response lasting about 15 seconds") and generates spoken responses with controllable duration. This capability is valuable for real-world spoken language systems such as voice assistants and interactive agents, where controlling response duration can improve interaction quality. However, despite their strong ability to generate natural spoken responses, existing models lack time awareness and struggle to follow duration-related instructions. To systematically evaluate this, we introduce TiCo-Bench, the first benchmark for time-controllable instruction following in SDMs, on which existing open-source and commercial models frequently fail to satisfy explicit time constraints. TiCo addresses this limitation by enabling an SDM to estimate elapsed speaking time during generation through Spoken Time Markers (STM) (e.g., <10.6 seconds>). These markers help the model maintain awareness of time and adjust the remaining content to meet the target duration. TiCo is post-trained efficiently without question-answer paired data, relying on self-generation and reinforcement learning with verifiable reward. Experimental results show that TiCo reduces duration error by 2.7x over its backbone and 1.6x over the strongest baseline, while preserving response quality.
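As a rough illustration of the marker idea described above, the following Python sketch parses markers written in the form shown in the abstract (e.g., <10.6 seconds>) and scores a response against a target duration. The regular expression, the function names, the tolerance parameter, and the linear reward shape are all assumptions made for illustration; the abstract does not specify how TiCo tokenizes its markers or how its verifiable reward is defined.

```python
import re

# Assumed marker syntax, modeled on the abstract's example "<10.6 seconds>".
# The real tokenization used by TiCo is not given in the abstract.
STM_PATTERN = re.compile(r"<(\d+(?:\.\d+)?) seconds>")


def extract_time_markers(text):
    """Return the elapsed-time values (in seconds) found in a marked-up response."""
    return [float(value) for value in STM_PATTERN.findall(text)]


def strip_time_markers(text):
    """Remove markers and collapse whitespace, leaving only the spoken content."""
    return " ".join(STM_PATTERN.sub(" ", text).split())


def duration_error(actual_seconds, target_seconds):
    """Absolute gap between the measured and the requested duration, in seconds."""
    return abs(actual_seconds - target_seconds)


def duration_reward(actual_seconds, target_seconds, tolerance=0.2):
    """Hypothetical verifiable reward (not the paper's definition).

    Returns 1.0 when the duration matches the target exactly and decays
    linearly to 0.0 once the relative error reaches `tolerance`.
    """
    if target_seconds <= 0 or tolerance <= 0:
        raise ValueError("target_seconds and tolerance must be positive")
    relative_error = duration_error(actual_seconds, target_seconds) / target_seconds
    return max(0.0, 1.0 - relative_error / tolerance)


if __name__ == "__main__":
    response = "Sure. <2.5 seconds> Here is more. <5.0 seconds>"
    print(extract_time_markers(response))
    print(strip_time_markers(response))
    print(duration_reward(actual_seconds=15.0, target_seconds=15.0))
```

A reward of this kind is "verifiable" in the sense that it can be computed directly from the measured audio duration and the requested target, with no human or model judge involved; in practice the measured duration would come from the synthesized audio rather than from the markers themselves.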


