Add sample usage section and update license in model card
#2 by nielsr (HF Staff) - opened
README.md CHANGED
|
@@ -1,7 +1,7 @@
|
|
| 1 |
---
|
| 2 |
-
license: cc-by-4.0
|
| 3 |
language:
|
| 4 |
-
|
|
|
|
| 5 |
pipeline_tag: text-to-speech
|
| 6 |
tags:
|
| 7 |
- voxtream
|
|
@@ -28,6 +28,28 @@ VoXtream, a fully autoregressive, zero-shot streaming text-to-speech system for
|
|
| 28 |
|
| 29 |
Clone our [repo](https://github.com/herimor/voxtream) and follow instructions in README file.
|
| 30 |
|
| 31 |
### Out-of-Scope Use
|
| 32 |
|
| 33 |
Any organization or individual is prohibited from using any technology mentioned in this paper to generate someone's speech without his/her consent, including but not limited to government leaders, political figures, and celebrities. If you do not comply with this item, you could be in violation of copyright laws.
|
|
|
|
| 1 |
---
|
|
|
|
| 2 |
language:
|
| 3 |
+
- en
|
| 4 |
+
license: mit
|
| 5 |
pipeline_tag: text-to-speech
|
| 6 |
tags:
|
| 7 |
- voxtream
|
|
|
|
| 28 |
|
| 29 |
Clone our [repo](https://github.com/herimor/voxtream) and follow instructions in README file.
|
| 30 |
|
| 31 |
+
### Usage
|
| 32 |
+
|
| 33 |
+
#### Output streaming
|
| 34 |
+
```bash
|
| 35 |
+
python run.py \
|
| 36 |
+
--prompt-audio assets/audio/male.wav \
|
| 37 |
+
--prompt-text "The liquor was first created as 'Brandy Milk', produced with milk, brandy and vanilla." \
|
| 38 |
+
--text "In general, however, some method is then needed to evaluate each approximation." \
|
| 39 |
+
--output "output_stream.wav"
|
| 40 |
+
```
|
| 41 |
+
* Note: Initial run may take some additional time to prepare MFA alignment for the prompt.
|
| 42 |
+
|
| 43 |
+
#### Full streaming
|
| 44 |
+
```bash
|
| 45 |
+
python run.py \
|
| 46 |
+
--prompt-audio assets/audio/female.wav \
|
| 47 |
+
--prompt-text "Betty Cooper helps Archie with cleaning a store room, when Reggie attacks her." \
|
| 48 |
+
--text "Staff do not always do enough to prevent violence." \
|
| 49 |
+
--output "full_stream.wav" \
|
| 50 |
+
--full-stream
|
| 51 |
+
```
|
| 52 |
+
|
| 53 |
### Out-of-Scope Use
|
| 54 |
|
| 55 |
Any organization or individual is prohibited from using any technology mentioned in this paper to generate someone's speech without his/her consent, including but not limited to government leaders, political figures, and celebrities. If you do not comply with this item, you could be in violation of copyright laws.
|
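
The two `run.py` invocations added under Usage differ only in their prompt, text, output path, and the `--full-stream` flag. A minimal sketch of a shell wrapper that captures this is shown below. The function name `voxtream_synth`, its argument order, and the `DRY_RUN` switch are assumptions made for illustration and are not part of the repository; only the `run.py` flags are taken from the diff above.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper around the run.py commands from the Usage section.
# The function name, argument order and DRY_RUN switch are assumptions made
# for illustration; only the run.py flags come from the model card.
voxtream_synth() {
  local prompt_audio="$1"
  local prompt_text="$2"
  local text="$3"
  local output="$4"
  local mode="${5:-output}"   # "output" (default) or "full"

  # Rebuild the positional parameters as the argument list for run.py.
  set -- --prompt-audio "$prompt_audio" \
         --prompt-text "$prompt_text" \
         --text "$text" \
         --output "$output"

  # Full streaming only adds the --full-stream flag.
  if [ "$mode" = "full" ]; then
    set -- "$@" --full-stream
  fi

  # With DRY_RUN set, print the command instead of running it.
  ${DRY_RUN:+echo} python run.py "$@"
}

# Preview the full-streaming command without invoking the model.
DRY_RUN=1 voxtream_synth assets/audio/female.wav \
  "Betty Cooper helps Archie with cleaning a store room, when Reggie attacks her." \
  "Staff do not always do enough to prevent violence." \
  full_stream.wav full
```

Dropping `DRY_RUN=1` would run the command for real, which presumably requires being inside the cloned repository with its dependencies installed. Omitting the final `full` argument yields the output-streaming form.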