---
license: mit
base_model: Qwen/Qwen2-7B-Instruct
tags:
- speech-to-speech
- dialogue
- full-duplex
- asr
- tts
- llm
- qwen2
- vad-free
- micro-turn
language:
- en
pipeline_tag: text-generation
---

# DuplexCascade: Full-Duplex Speech-to-Speech Dialogue with VAD-Free Cascaded ASR-LLM-TTS Pipeline and Micro-Turn Optimization

This repository provides the model for **DuplexCascade**, a full-duplex speech-to-speech dialogue system built on a cascaded ASR-LLM-TTS pipeline with **VAD-free interaction** and **micro-turn optimization**.

The backbone large language model is [Qwen2-7B-Instruct](https://huggingface.co/Qwen/Qwen2-7B-Instruct), which was further fine-tuned for our duplex dialogue setting.

## Paper

Our paper is available on arXiv:

**Paper:** https://arxiv.org/abs/2603.09180

## Inference Code

Please refer to our GitHub repository for inference and implementation details:

**GitHub:** https://github.com/sbintuitions/DuplexCascade

## Model Description

DuplexCascade is designed for full-duplex spoken dialogue, enabling more natural interaction through:

- A cascaded **ASR-LLM-TTS** pipeline
- **VAD-free** dialogue control
- **Micro-turn optimization** for smoother turn-taking behavior

This model is obtained by fine-tuning [Qwen2-7B-Instruct](https://huggingface.co/Qwen/Qwen2-7B-Instruct) for the full-duplex dialogue setting.

## Base Model

- **Base LLM:** [Qwen2-7B-Instruct](https://huggingface.co/Qwen/Qwen2-7B-Instruct)

## License

This model is released under the **MIT License**.