DuplexCascade / README.md
user02169
Initial public release
dca21cb
---
license: mit
base_model: Qwen/Qwen2-7B-Instruct
tags:
- speech-to-speech
- dialogue
- full-duplex
- asr
- tts
- llm
- qwen2
- vad-free
- micro-turn
language:
- en
pipeline_tag: text-generation
---
# DuplexCascade: Full-Duplex Speech-to-Speech Dialogue with VAD-Free Cascaded ASR-LLM-TTS Pipeline and Micro-Turn Optimization
This repository provides the model for **DuplexCascade**, a full-duplex speech-to-speech dialogue system built on a cascaded ASR-LLM-TTS pipeline with **VAD-free interaction** and **micro-turn optimization**.
The backbone large language model is [Qwen2-7B-Instruct](https://huggingface.co/Qwen/Qwen2-7B-Instruct), which was further fine-tuned for our duplex dialogue setting.
## Paper
Our paper is available on arXiv:
**Paper:** https://arxiv.org/abs/2603.09180
## Inference Code
Please refer to our GitHub repository for inference and implementation details:
**GitHub:** https://github.com/sbintuitions/DuplexCascade
## Model Description
DuplexCascade is designed for full-duplex spoken dialogue, enabling more natural interaction through:
- A cascaded **ASR-LLM-TTS** pipeline
- **VAD-free** dialogue control
- **Micro-turn optimization** for smoother turn-taking behavior
This model is obtained by fine-tuning [Qwen2-7B-Instruct](https://huggingface.co/Qwen/Qwen2-7B-Instruct) for the full-duplex dialogue setting.
## Base Model
- **Base LLM:** [Qwen2-7B-Instruct](https://huggingface.co/Qwen/Qwen2-7B-Instruct)
## License
This model is released under the **MIT License**.