Fix README: correct repo id, ONNX vs PT split, GitHub link
README.md
CHANGED
@@ -1,26 +1,51 @@
 ---
 license: mit
+language:
+- multilingual
+tags:
+- text-to-speech
+- speech-synthesis
+- hebrew
+pipeline_tag: text-to-speech
 ---
 
-# Blue
+# Blue — PyTorch weights (training, finetuning & voice export)
 
-This repository contains
+This repository contains **Safetensors / PyTorch checkpoints** and **multilingual latent statistics** for **[BlueTTS](https://github.com/maxmelichov/BlueTTS)** — Hebrew-first multilingual text-to-speech with optional English, Spanish, Italian, German, and mixed-language synthesis in the reference code.
 
-
+**Project home (install, ONNX inference, examples):** [https://github.com/maxmelichov/BlueTTS](https://github.com/maxmelichov/BlueTTS)
 
-
+> **End-user synthesis:** Use the ONNX model bundle **[`notmax123/blue-onnx`](https://huggingface.co/notmax123/blue-onnx)** with the BlueTTS README. This **`notmax123/blue`** repo supplies **training / finetuning weights** and files needed to **export new voice style JSON** for ONNX; it is not the ONNX runtime bundle.
 
-
-- `stats_multilingual.pt`: The statistical data containing the latent means and standard deviations computed from the corpus.
-- `vf_estimator.safetensors`: The combined Text-to-Latent acoustic checkpoints (includes text encoder, reference encoder, and the Flow Matching model).
-- `duration_predictor.safetensors`: The Duration Predictor checkpoint.
+## Files
 
-
+| File | Role |
+|------|------|
+| `blue_codec.safetensors` | Audio codec: mel ↔ latent, discrete/continuous conversion. |
+| `stats_multilingual.pt` | Latent mean/std for normalization (same statistics as training). |
+| `vf_estimator.safetensors` | Text-to-latent acoustic model (text encoder, reference encoder, flow-matching core). |
+| `duration_predictor.safetensors` | Duration predictor checkpoint. |
 
-
+## Download
+
+Repo id is **case-sensitive** — use `notmax123/blue` (not `Blue`).
+
+```bash
+hf download notmax123/blue --repo-type model --local-dir ./pt_weights
+```
+
+Equivalent with the classic CLI:
 
 ```bash
-huggingface-cli download notmax123/
+huggingface-cli download notmax123/blue --repo-type model --local-dir ./pt_weights
 ```
 
-
+## How to use
+
+1. **Training or finetuning:** Follow the [training](https://github.com/maxmelichov/BlueTTS/tree/main/training) directory in the BlueTTS GitHub repository.
+
+2. **New voices for ONNX inference:** Clone [BlueTTS](https://github.com/maxmelichov/BlueTTS), install with the `export` extra, download these weights locally, and run `scripts/export_new_voice.py` (see script docstring and project README).
+
+## License
+
+MIT — see the [BlueTTS repository](https://github.com/maxmelichov/BlueTTS) for the full license text.
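
Reviewer note: the download commands added in this change write to `./pt_weights`. As a quick sanity check after downloading, something like the sketch below could confirm that the four files named in the new Files table are present. `check_weights` is a hypothetical helper written for illustration; it is not part of BlueTTS or the Hugging Face CLI, and the file list is assumed to match the table in the updated README.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of BlueTTS): verify that the files listed in
# the README's Files table exist in a local download directory.
check_weights() {
  # Default to the directory used by the README's download commands.
  local dir="${1:-./pt_weights}"
  local missing=0
  local f
  for f in blue_codec.safetensors stats_multilingual.pt \
           vf_estimator.safetensors duration_predictor.safetensors; do
    if [ ! -f "$dir/$f" ]; then
      # Report each missing file on stderr and remember the failure.
      echo "missing: $dir/$f" >&2
      missing=1
    fi
  done
  # Exit status 0 when every file is present, 1 otherwise.
  return "$missing"
}
```

Running `check_weights ./pt_weights` after the download returns a non-zero status and names each missing file on stderr, which makes it easy to chain before a training or export step.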