---
license: apache-2.0
pipeline_tag: text-to-speech
tags:
- text-to-speech
- tts
- voice-cloning
- multilingual
- flow-matching
- autoregressive
- dots.tts
language:
- en
- zh
- yue
- ja
- ko
- fr
- de
- es
- it
- pt
- nl
- ru
- uk
- pl
- cs
- ro
- el
- fi
- tr
- ar
- hi
- vi
- id
- th
library_name: dots.tts
---

# dots.tts — consolidated mirror

A consolidated mirror of [RedNote HiLab](https://huggingface.co/rednote-hilab)'s
**dots.tts** checkpoints, repackaged into one repo with a subfolder per
checkpoint for download-on-demand inside the MAESTRO app.

dots.tts is a 2-billion-parameter, fully continuous **autoregressive
text-to-speech** system: a Qwen2.5-1.5B backbone with a flow-matching acoustic
head over a 48 kHz AudioVAE, plus a CAM++ speaker x-vector encoder. It supports
zero-shot voice cloning, multilingual synthesis in 24 languages (with automatic
language detection and code-switching), streaming, and realtime duplex dialogue,
all at 48 kHz.

## Layout

| Subfolder | Upstream | Notes |
|-----------|----------|-------|
| `base/`   | [rednote-hilab/dots.tts-base](https://huggingface.co/rednote-hilab/dots.tts-base) | Balanced pretrained baseline |
| `soar/`   | [rednote-hilab/dots.tts-soar](https://huggingface.co/rednote-hilab/dots.tts-soar) | Self-corrective-aligned — best voice cloning (recommended) |
| `mf/`     | [rednote-hilab/dots.tts-mf](https://huggingface.co/rednote-hilab/dots.tts-mf)   | MeanFlow-distilled student — 4-step, fastest inference |

Each subfolder contains the full checkpoint (`model.safetensors`,
`vocoder.safetensors`, `speaker_encoder.safetensors`, `config.json`,
`llm_config.json`, `latent_stats.pt`, tokenizer files). The MAESTRO model
manifest fetches only the requested subfolder via `allow_patterns`.
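
Outside MAESTRO, the same download-on-demand pattern can be approximated with
`huggingface_hub.snapshot_download`. The sketch below is illustrative only: it
is not the MAESTRO manifest code, the helper names (`subfolder_patterns`,
`missing_files`, `download_checkpoint`) are hypothetical, and the `repo_id`
argument is left for the caller to supply because this card does not state it.
The file check covers only the fixed file names listed above; tokenizer file
names are not listed, so they are not checked.

```python
from pathlib import Path

# Subfolders described in the Layout table.
CHECKPOINTS = ("base", "soar", "mf")

# Fixed file names this README lists for every checkpoint.
# Tokenizer files are omitted because their names are not given here.
EXPECTED_FILES = (
    "model.safetensors",
    "vocoder.safetensors",
    "speaker_encoder.safetensors",
    "config.json",
    "llm_config.json",
    "latent_stats.pt",
)


def subfolder_patterns(checkpoint: str) -> list[str]:
    """Return glob patterns that select a single checkpoint subfolder."""
    if checkpoint not in CHECKPOINTS:
        raise ValueError(f"unknown checkpoint {checkpoint!r}; expected one of {CHECKPOINTS}")
    return [f"{checkpoint}/*"]


def missing_files(root: str, checkpoint: str) -> list[str]:
    """List expected files that are absent from <root>/<checkpoint>/."""
    folder = Path(root) / checkpoint
    return [name for name in EXPECTED_FILES if not (folder / name).is_file()]


def download_checkpoint(repo_id: str, checkpoint: str, local_dir: str) -> str:
    """Download one subfolder of the mirror and return the local snapshot path.

    Assumes the third-party `huggingface_hub` package is installed; it is
    imported lazily so the helpers above work without it.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,  # supplied by the caller; not stated in this card
        allow_patterns=subfolder_patterns(checkpoint),
        local_dir=local_dir,
    )
```

A typical call would be `download_checkpoint(repo_id, "soar", "./dots-tts")`
followed by `missing_files("./dots-tts", "soar")`, which should return an empty
list if the subfolder arrived complete.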

## License

**Apache-2.0** for both code and weights, per the upstream release. Commercial
use is permitted. All credit for the model and weights goes to the dots.tts Team at
RedNote (HiLab). This mirror only repackages the upstream artifacts unchanged;
no weights were modified.