Committed by cms42
Commit 6d9ffd2 · verified · 1 Parent(s): a697248

Docs: Update README to include benchmark results

Files changed (1)
  1. README.md +12 -0
README.md CHANGED
@@ -85,6 +85,18 @@ python -m moss_tts_delay.llama_cpp \

  For full setup instructions (including building the C bridge, configuration options, and installation profiles), see the [llama.cpp Backend documentation](https://github.com/OpenMOSS/MOSS-TTS/blob/main/moss_tts_delay/llama_cpp/README.md).

+ ### Quantization Benchmark
+
+ Quantization quality evaluated on [Seed-TTS-eval](https://github.com/BytedanceSpeech/seed-tts-eval) zero-shot benchmark. Baseline is the original HuggingFace model; GGUF variants use the llama.cpp backend with TensorRT audio tokenizer.
+
+ | Quantization | EN WER (%) ↓ | EN SIM (%) ↑ | ZH CER (%) ↓ | ZH SIM (%) ↑ |
+ |---|---:|---:|---:|---:|
+ | Baseline (HuggingFace) | 1.79 | 71.46 | 1.32 | 77.05 |
+ | Q8_0 | 3.21 | 68.61 | 1.56 | 76.03 |
+ | Q6_K | 3.11 | 68.77 | 1.44 | 76.06 |
+ | Q5_K_M | 2.95 | 68.55 | 1.50 | 75.96 |
+ | Q4_K_M | 2.83 | 68.15 | 1.58 | 75.71 |
+
  ### Main Repositories

  | Repository | Description |
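
Reviewer note: the added benchmark table may be easier to read as differences from the baseline row. The following is a minimal sketch for that comparison, not part of this commit or of the MOSS-TTS repository. The names `BASELINE`, `QUANTIZED`, and `deltas` are hypothetical and introduced only for illustration; the numbers are copied from the table added in this diff.

```python
# Values copied from the "Quantization Benchmark" table added in this commit.
# Column order: EN WER (%), EN SIM (%), ZH CER (%), ZH SIM (%)
BASELINE = (1.79, 71.46, 1.32, 77.05)
QUANTIZED = {
    "Q8_0": (3.21, 68.61, 1.56, 76.03),
    "Q6_K": (3.11, 68.77, 1.44, 76.06),
    "Q5_K_M": (2.95, 68.55, 1.50, 75.96),
    "Q4_K_M": (2.83, 68.15, 1.58, 75.71),
}


def deltas(row, baseline=BASELINE):
    """Return per-metric differences (quantized minus baseline) in percentage points."""
    return tuple(round(q - b, 2) for q, b in zip(row, baseline))


# Positive WER/CER deltas and negative SIM deltas both indicate degradation.
for name, row in QUANTIZED.items():
    print(name, deltas(row))
```

Read this way, every GGUF variant in the table shows a higher EN WER and a lower EN SIM than the HuggingFace baseline, while the ZH CER and ZH SIM differences are smaller in absolute terms. The script only restates the table and makes no claim about statistical significance.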