Text-to-Speech
Safetensors
English
voxtream
zero-shot
streaming

Add authors, library_name and link to paper page

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +12 -7
README.md CHANGED
@@ -1,18 +1,23 @@
 ---
-license: cc-by-4.0
 datasets:
 - amphion/Emilia-Dataset
 - nvidia/hifitts-2
 language:
 - en
+license: cc-by-4.0
 pipeline_tag: text-to-speech
+library_name: voxtream
 tags:
 - text-to-speech
+- zero-shot
+- streaming
 ---
 
 # Model Card for VoXtream2
 
-VoXtream2 is a zero-shot full-stream TTS model with dynamic speaking-rate control that can be updated mid-utterance on the fly.
+VoXtream2 is a zero-shot full-stream TTS model with dynamic speaking-rate control that can be updated mid-utterance on the fly. It was introduced in the paper [VoXtream2: Full-stream TTS with dynamic speaking rate control](https://huggingface.co/papers/2603.13518).
+
+**Developed by:** Nikita Torgashov, Gustav Eje Henter, Gabriel Skantze
 
 ### Key features
 
@@ -22,10 +27,10 @@ VoXtream2 is a zero-shot full-stream TTS model with dynamic speaking-rate contro
 
 ### Model Sources
 
-- **Repository:** [repo](https://github.com/herimor/voxtream)
-- **Paper:** [paper](https://arxiv.org/pdf/2603.13518)
-- **Demo Page:** [demo page](https://herimor.github.io/voxtream2)
-- **Live Demo:** [live demo](https://huggingface.co/spaces/herimor/voxtream2)
+- **Repository:** [https://github.com/herimor/voxtream](https://github.com/herimor/voxtream)
+- **Paper:** [https://huggingface.co/papers/2603.13518](https://huggingface.co/papers/2603.13518)
+- **Demo Page:** [https://herimor.github.io/voxtream2](https://herimor.github.io/voxtream2)
+- **Live Demo:** [https://huggingface.co/spaces/herimor/voxtream2](https://huggingface.co/spaces/herimor/voxtream2)
 
 ## Get started
 
@@ -84,7 +89,7 @@ The model was trained on [Emilia](https://huggingface.co/datasets/amphion/Emilia
 
 ## Citation
 
-```
+```bibtex
 @inproceedings{torgashov2026voxtream,
 title={Vo{X}tream: Full-Stream Text-to-Speech with Extremely Low Latency},
 author={Torgashov, Nikita and Henter, Gustav Eje and Skantze, Gabriel},