# Supertonic – FP16 vs INT8 Quantized Benchmark (by Shadow0482)

This README documents a simple benchmark comparing **FP16** and **INT8** quantized
versions of the [Supertonic](https://huggingface.co/Supertone/supertonic) TTS
pipeline, using the quantized models hosted at:

- Quantized models repo: **Shadow0482/supertonic-quantized**
- This model card: **Shadow0482/supertonic-quantized**

All tests were run in Google Colab on CPU using the official `py/example_onnx.py`
script from the Supertonic GitHub repository.

---

## Test text

The same text was used for both FP16 and INT8 runs:

> Greetings! You are listening to your newly quantized model. I have been squished, squeezed, compressed, minimized, optimized, digitized, and lightly traumatized to save disk space. The testing framework automatically verifies my integrity, measures how much weight I lost, and checks if I can still talk without glitching into a robot dolphin. If you can hear this clearly, the quantization ritual was a complete success.

---

## Results

| Variant | Precision | ONNX directory | Time (s) | Output WAV | Status |
|---------|-----------|----------------|---------:|------------|--------|
| FP16    | float16   | `onnx_fp16/`   | 0.290    | `NONE`     | FAILED |
| INT8    | int8      | `onnx_int8/`   | 0.246    | `NONE`     | FAILED |

> Note:
> - Exact times will vary depending on Colab hardware, runtime load, and ONNX Runtime version.
> - The goal of this benchmark is to confirm that both FP16 and INT8 quantized models
>   load correctly and produce intelligible audio for the same input text.
> - In the run recorded above, neither variant produced an output WAV, so both rows are marked FAILED.

---
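
A sketch of how per-run wall-clock times like the `Time (s)` column can be collected,
assuming each variant is launched as a separate subprocess (the `time_run` helper is
hypothetical, not part of the Supertonic repo):

```python
import subprocess
import sys
import time

def time_run(cmd):
    """Run a command and return (elapsed_seconds, return_code)."""
    start = time.perf_counter()
    proc = subprocess.run(cmd, capture_output=True)
    return time.perf_counter() - start, proc.returncode

# Illustrative: time a single FP16 run of the example script.
# elapsed, rc = time_run([sys.executable, "py/example_onnx.py",
#                         "--onnx-dir", "onnx_fp16", "--n-test", "1"])
```

Timing a subprocess this way includes interpreter startup and model loading, so it
slightly overstates pure inference time; the bias is the same for both variants, so
the comparison still holds.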

## How to reproduce this benchmark

In a fresh Colab notebook:

1. Install dependencies and run the benchmark cell (the one that created this README).
2. Make sure you have write access to both:
   - `Shadow0482/supertonic`
   - `Shadow0482/supertonic-quantized`

The core of the benchmark is simply:

```bash
python py/example_onnx.py \
  --onnx-dir onnx_fp16 \
  --voice-style assets/voice_styles/M1.json \
  --text "... your test text ..." \
  --n-test 1 \
  --save-dir results_fp16

python py/example_onnx.py \
  --onnx-dir onnx_int8 \
  --voice-style assets/voice_styles/M1.json \
  --text "... the same test text ..." \
  --n-test 1 \
  --save-dir results_int8
```

Where `onnx_fp16/` and `onnx_int8/` contain drop-in copies of the original
Supertonic ONNX files, but converted/quantized to FP16 or INT8 respectively.
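
Since a variant only counts as passing when it yields audible output, a quick sanity
check on the saved file catches the `NONE`/FAILED case from the results table. A
stdlib-only sketch (`wav_duration` and the output path are illustrative, not part of
the benchmark script):

```python
import wave

def wav_duration(path):
    """Duration of a WAV file in seconds (0.0 means no audio frames)."""
    with wave.open(path, "rb") as wf:
        return wf.getnframes() / float(wf.getframerate())

# Illustrative: require at least one second of audio before calling a run a pass.
# assert wav_duration("results_fp16/output.wav") > 1.0
```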

---

## Model notes

* **FP16** models are converted from the original FP32 weights using float16 conversion.
* **INT8** models are dynamically quantized (MatMul/Gemm) using ONNX Runtime.
* The quantized models live in `Shadow0482/supertonic-quantized` and can be plugged
  into the Supertonic pipeline via the `--onnx-dir` argument in `example_onnx.py`.

---

## License

The original Supertonic code and models are licensed under their respective
licenses (MIT code + OpenRAIL-M model). This benchmark and quantized packaging
follow the same licensing terms.