---
license: apache-2.0
pipeline_tag: text-to-speech
tags:
- text-to-speech
- tts
- voice-cloning
- multilingual
- flow-matching
- autoregressive
- dots.tts
language:
- en
- zh
- yue
- ja
- ko
- fr
- de
- es
- it
- pt
- nl
- ru
- uk
- pl
- cs
- ro
- el
- fi
- tr
- ar
- hi
- vi
- id
- th
library_name: dots.tts
---

# dots.tts: consolidated mirror

A consolidated mirror of [RedNote HiLab](https://huggingface.co/rednote-hilab)'s **dots.tts** checkpoints, repackaged into one repo with a subfolder per checkpoint for download-on-demand inside the MAESTRO app.

dots.tts is a 2-billion-parameter fully-continuous **autoregressive text-to-speech** system: a Qwen2.5-1.5B backbone + a flow-matching acoustic head over a 48 kHz AudioVAE, with a CAM++ speaker x-vector encoder. It does zero-shot voice cloning, 24-language multilingual synthesis (auto-detect + code-switching), streaming, and realtime duplex dialogue, all at 48 kHz.

## Layout

| Subfolder | Upstream | Notes |
|-----------|----------|-------|
| `base/` | [rednote-hilab/dots.tts-base](https://huggingface.co/rednote-hilab/dots.tts-base) | Balanced pretrained baseline |
| `soar/` | [rednote-hilab/dots.tts-soar](https://huggingface.co/rednote-hilab/dots.tts-soar) | Self-corrective-aligned; best voice cloning (recommended) |
| `mf/` | [rednote-hilab/dots.tts-mf](https://huggingface.co/rednote-hilab/dots.tts-mf) | MeanFlow-distilled student; 4-step, fastest inference |

Each subfolder contains the full checkpoint (`model.safetensors`, `vocoder.safetensors`, `speaker_encoder.safetensors`, `config.json`, `llm_config.json`, `latent_stats.pt`, tokenizer files). The MAESTRO model manifest fetches only the requested subfolder via `allow_patterns`.

## License

**Apache-2.0** for both code and weights, per the upstream release. Commercial use permitted.

All credit for the model and weights goes to the dots.tts Team at RedNote (HiLab). This mirror only repackages the upstream artifacts unchanged; no weights were modified.
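
## Fetching a single subfolder

The per-subfolder fetch described under Layout can be reproduced outside MAESTRO. Below is a minimal sketch, assuming the `huggingface_hub` package is installed and using its `snapshot_download` function with `allow_patterns`. The helper names `patterns_for` and `fetch_checkpoint` are hypothetical, this is not the MAESTRO manifest code, and the repo id is a placeholder you must replace with this mirror's actual id.

```python
from typing import List

# Subfolders listed in the Layout table of this README.
CHECKPOINTS = ("base", "soar", "mf")


def patterns_for(checkpoint: str) -> List[str]:
    """Build an allow_patterns list that matches only one checkpoint subfolder."""
    if checkpoint not in CHECKPOINTS:
        raise ValueError(f"unknown checkpoint {checkpoint!r}; expected one of {CHECKPOINTS}")
    return [f"{checkpoint}/*"]


def fetch_checkpoint(repo_id: str, checkpoint: str = "soar") -> str:
    """Download one subfolder of the mirror and return the local snapshot path.

    `repo_id` is an assumption supplied by the caller; it is not stated in this README.
    """
    # Imported lazily so the pattern helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, allow_patterns=patterns_for(checkpoint))
```

A call such as `fetch_checkpoint("<this-mirror-repo-id>", "mf")` would return the snapshot directory, with the checkpoint files under its `mf/` subfolder. Restricting the download by pattern avoids pulling all three checkpoints when only one is needed.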