Shadow0482 committed · Commit 16dbb03 · verified · Parent: faa0459

Upload README.md with huggingface_hub

Files changed (1): README.md (+84)
# Supertonic – FP16 vs INT8 Quantized Benchmark (by Shadow0482)

This README documents a simple benchmark comparing **FP16** and **INT8** quantized
versions of the [Supertonic](https://huggingface.co/Supertone/supertonic) TTS
pipeline using the quantized models hosted at:

- Quantized models repo: **Shadow0482/supertonic-quantized**
- This model card: **Shadow0482/supertonic-quantized**

All tests were run in Google Colab on CPU using the official `py/example_onnx.py`
script from the Supertonic GitHub repository.

---

## Test text

The same text was used for both the FP16 and INT8 runs:

> Greetings! You are listening to your newly quantized model. I have been squished, squeezed, compressed, minimized, optimized, digitized, and lightly traumatized to save disk space. The testing framework automatically verifies my integrity, measures how much weight I lost, and checks if I can still talk without glitching into a robot dolphin. If you can hear this clearly, the quantization ritual was a complete success.

---

## Results

| Variant | Precision | ONNX directory | Time (s) | Output WAV | Status |
|--------:|-----------|----------------|---------:|------------|--------|
| FP16    | float16   | `onnx_fp16/`   | 0.290    | `NONE`     | FAILED |
| INT8    | int8      | `onnx_int8/`   | 0.246    | `NONE`     | FAILED |

> Note:
> - Exact times will vary depending on Colab hardware, runtime load, and ONNX Runtime version.
> - The goal of this benchmark is to confirm that both the FP16 and INT8 quantized models
>   load correctly and produce intelligible audio for the same input text. In this run,
>   neither variant produced an output WAV, so both are marked FAILED.
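The `Status` column reflects whether a run left behind a usable WAV file. A minimal integrity check along these lines (a hypothetical helper using only the standard library, not part of the Supertonic repo) could be:

```python
import os
import wave


def wav_status(path):
    # "FAILED" when the run produced no file; otherwise open the WAV
    # header and require at least one audio frame.
    if not os.path.exists(path):
        return "FAILED"
    try:
        with wave.open(path, "rb") as w:
            return "OK" if w.getnframes() > 0 else "FAILED"
    except wave.Error:
        return "FAILED"


# Neither benchmark run above produced a WAV, matching this outcome:
print(wav_status("does_not_exist.wav"))  # -> FAILED
```

A real harness would also want to catch near-silent output, but file existence plus a non-zero frame count already distinguishes the FAILED rows in the table.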

---

## How to reproduce this benchmark

In a fresh Colab notebook:

1. Install the dependencies and run the benchmark cell (the one that created this README).
2. Make sure you have write access to both:
   - `Shadow0482/supertonic`
   - `Shadow0482/supertonic-quantized`

The core of the benchmark is simply:

```bash
python py/example_onnx.py \
  --onnx-dir onnx_fp16 \
  --voice-style assets/voice_styles/M1.json \
  --text "... your test text ..." \
  --n-test 1 \
  --save-dir results_fp16

python py/example_onnx.py \
  --onnx-dir onnx_int8 \
  --voice-style assets/voice_styles/M1.json \
  --text "... the same test text ..." \
  --n-test 1 \
  --save-dir results_int8
```

Here `onnx_fp16/` and `onnx_int8/` contain drop-in copies of the original
Supertonic ONNX files, converted/quantized to FP16 and INT8 respectively.
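The `Time (s)` column can be collected by wrapping each invocation above in a small timer. A sketch (the helper name and the use of `subprocess` are my own, not taken from the Supertonic scripts):

```python
import subprocess
import sys
import time


def run_and_time(cmd):
    # Run one benchmark command and return (elapsed_seconds, return_code).
    start = time.perf_counter()
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return time.perf_counter() - start, proc.returncode


# Trivial stand-in command; swap in the full example_onnx.py invocations above.
elapsed, code = run_and_time([sys.executable, "-c", "print('ok')"])
```

Wall-clock timing like this includes Python and ONNX Runtime startup, which is why single-run numbers in the table should be read as rough comparisons rather than per-inference latency.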

---

## Model notes

* **FP16** models are converted from the original FP32 weights using float16 conversion.
* **INT8** models are dynamically quantized (MatMul/Gemm) using ONNX Runtime.
* The quantized models live in `Shadow0482/supertonic-quantized` and can be plugged into the
  Supertonic pipeline via the `--onnx-dir` argument of `example_onnx.py`.

---

## License

The original Supertonic code and models are licensed under their respective
licenses (MIT for the code, OpenRAIL-M for the models). This benchmark and the
quantized packaging follow the same licensing terms.