card: drop the 9B/0.5B variant table (lives in the GitHub repo)
README.md
CHANGED
@@ -23,15 +23,8 @@ frozen speech **tokenizer** (Whisper-VQ, 12.5 Hz) and **decoder** (CosyVoice flo
 are reused unchanged.
 
 It is the **speech/text backbone** of [ViBES](https://github.com/Juzezhang/ViBES) (our
-speech-language-behavior model)
-
-| ViBES variant | Speech/text backbone (Expert-0) | Use |
-|---|---|---|
-| **ViBES (9B)** | GLM-4-Voice-9B | best quality |
-| **ViBES (0.5B)** | **ViBES-Audio (this model)** | ~15× smaller backbone, low-latency / on-device |
-
-The motion experts are released separately: [`ViBES-Face`](https://huggingface.co/JuzeZhang/ViBES-Face)
-(and ViBES-Body).
+speech-language-behavior model) — a lightweight, low-latency alternative to the GLM-4-Voice-9B base.
+The motion experts are released separately: [`ViBES-Face`](https://huggingface.co/JuzeZhang/ViBES-Face).
 
 ## Model
 