Commit 189652e (verified), committed by tjadamlee
1 Parent(s): 1e20370

Update README.md

Files changed (1)
  1. README.md +18 -17
README.md CHANGED
@@ -60,23 +60,24 @@
  - [x] Fastapi server and client
 
  ## Evaluation
- | Model | CER (%) ↓ (test-zh) | WER (%) ↓ (test-en) | CER (%) ↓ (test-hard) |
- |-----|------------------|------------------|------------------|
- | Human | 1.26 | 2.14 | - |
- | F5-TTS | 1.53 | 2.00 | 8.67 |
- | SparkTTS | 1.20 | 1.98 | - |
- | Seed-TTS | 1.12 | 2.25 | 7.59 |
- | CosyVoice2 | 1.45 | 2.57 | 6.83 |
- | FireRedTTS-2 | 1.14 | 1.95 | - |
- | IndexTTS2 | 1.01 | 1.52 | 7.12 |
- | VibeVoice | 1.16 | 3.04 | - |
- | HiggsAudio | 1.79 | 2.44 | - |
- | MiniMax-Speech | 0.83 | 1.65 | - |
- | VoxPCM | 0.93 | 1.85 | 8.87 |
- | GLM-TTS | 1.03 | - | - |
- | GLM-TTS_RL | 0.89 | - | - |
- | Fun-CosyVoice3-0.5B-2512 | 1.21 | 2.24 | 6.71 |
- | Fun-CosyVoice3-0.5B-2512_RL | 0.81 | 1.68 | 5.44 |
+ | Model | Open-Source | Model Size | test-zh<br>CER (%) ↓ | test-zh<br>Speaker Similarity (%) ↑ | test-en<br>WER (%) ↓ | test-en<br>Speaker Similarity (%) ↑ | test-hard<br>CER (%) ↓ | test-hard<br>Speaker Similarity (%) |
+ | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
+ | Human | - | - | 1.26 | 75.5 | 2.14 | 73.4 | - | - |
+ | Seed-TTS | ❌ | - | 1.12 | 79.6 | 2.25 | 76.2 | 7.59 | 77.6 |
+ | MiniMax-Speech | ❌ | - | 0.83 | 78.3 | 1.65 | 69.2 | - | - |
+ | F5-TTS | ✅ | 0.3B | 1.52 | 74.1 | 2.00 | 64.7 | 8.67 | 71.3 |
+ | Spark TTS | ✅ | 0.5B | 1.2 | 66.0 | 1.98 | 57.3 | - | - |
+ | CosyVoice2 | ✅ | 0.5B | 1.45 | 75.7 | 2.57 | 65.9 | 6.83 | 72.4 |
+ | FireRedTTS 2 | ✅ | 1.5B | 1.14 | 73.2 | 1.95 | 66.5 | - | - |
+ | Index-TTS2 | ✅ | 1.5B | 1.03 | 76.5 | 2.23 | 70.6 | 7.12 | 75.5 |
+ | VibeVoice-1.5B | ✅ | 1.5B | 1.16 | 74.4 | 3.04 | 68.9 | - | - |
+ | VibeVoice-Realtime | ✅ | 0.5B | - | - | 2.05 | 63.3 | - | - |
+ | HiggsAudio-v2 | ✅ | 3B | 1.50 | 74.0 | 2.44 | 67.7 | - | - |
+ | VoxCPM | ✅ | 0.5B | 0.93 | 77.2 | 1.85 | 72.9 | 8.87 | 73.0 |
+ | GLM-TTS | ✅ | 1.5B | 1.03 | 76.1 | - | - | - | - |
+ | GLM-TTS RL | ✅ | 1.5B | 0.89 | 76.4 | - | - | - | - |
+ | Fun-CosyVoice3-0.5B-2512 | ✅ | 1.5B | 0.89 | 76.4 | - | - | - | - |
+
 
 
  ## Install