ViBES-Face
Pretrained face checkpoint for ViBES, a speech–language–behavior (SLB) model that generates synchronized 3D facial and body animation from conversational input.
ViBES uses a Mixture-of-Modality-Experts (MoME) architecture with two experts:
- Expert 0 — text/audio (frozen during training; exactly the GLM-4-Voice base)
- Expert 1 — motion (the trained face expert)
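As a toy illustration of the two-expert split above, the sketch below sends each token to an expert based on its modality tag. This is not the ViBES implementation: the `route` function, the `(modality, value)` token format, and the string-returning stand-in experts are all hypothetical, and how ViBES actually dispatches tokens inside its layers is not described here.

```python
from typing import Callable, Dict, List, Tuple

# Modality -> expert index, following the two-expert split described above.
EXPERT_FOR_MODALITY: Dict[str, int] = {"text": 0, "audio": 0, "motion": 1}


def route(tokens: List[Tuple[str, str]],
          experts: List[Callable[[str], str]]) -> List[str]:
    """Send each (modality, value) token to the expert that owns its modality."""
    outputs = []
    for modality, value in tokens:
        expert = experts[EXPERT_FOR_MODALITY[modality]]
        outputs.append(expert(value))
    return outputs


# Hypothetical stand-in experts; real experts would be neural network blocks.
def expert0(x: str) -> str:
    return f"E0({x})"


def expert1(x: str) -> str:
    return f"E1({x})"


tokens = [("text", "hi"), ("audio", "a0"), ("motion", "m0")]
routed = route(tokens, [expert0, expert1])
```

Text and audio tokens share Expert 0, which is why that expert can stay frozen while only the motion path is trained.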
⚠️ This checkpoint stores only the motion expert (Expert 1)
Because Expert 0 is frozen and identical to the GLM-4-Voice base, shipping it in every checkpoint is redundant. This repo therefore contains only the trained motion expert (~0.86 GB) instead of the full ~20 GB model. At load time, Expert 0 is reconstructed from the GLM-4-Voice base and merged with this expert. The result is bit-for-bit identical to the original full checkpoint (verified: max abs diff 0.0 over all 284 Expert-0 tensors).
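The merge-and-verify idea can be sketched as follows. This is a minimal, dependency-free illustration and not the ViBES loader: plain lists of floats stand in for tensors, and the parameter names (`layer0.expert0.weight`, `layer0.expert1.weight`) and the helpers `merge_experts` and `max_abs_diff` are hypothetical.

```python
from typing import Dict, Iterable, List

Tensor = List[float]  # stand-in for a real tensor, to keep the sketch dependency-free


def merge_experts(base_expert0: Dict[str, Tensor],
                  expert1: Dict[str, Tensor]) -> Dict[str, Tensor]:
    """Combine frozen Expert-0 weights with trained Expert-1 weights."""
    overlap = base_expert0.keys() & expert1.keys()
    if overlap:
        raise ValueError(f"experts share parameter names: {sorted(overlap)}")
    merged = dict(base_expert0)
    merged.update(expert1)
    return merged


def max_abs_diff(a: Dict[str, Tensor], b: Dict[str, Tensor],
                 keys: Iterable[str]) -> float:
    """Largest element-wise absolute difference over the given parameter names."""
    worst = 0.0
    for key in keys:
        if len(a[key]) != len(b[key]):
            raise ValueError(f"shape mismatch for {key}")
        for x, y in zip(a[key], b[key]):
            worst = max(worst, abs(x - y))
    return worst


# Hypothetical toy data: a "full" checkpoint and its two halves.
full_checkpoint = {
    "layer0.expert0.weight": [0.5, -1.25],
    "layer0.expert1.weight": [2.0, 3.5],
}
base_expert0 = {"layer0.expert0.weight": [0.5, -1.25]}
motion_expert = {"layer0.expert1.weight": [2.0, 3.5]}

merged = merge_experts(base_expert0, motion_expert)
diff = max_abs_diff(full_checkpoint, merged, base_expert0.keys())
print(diff)  # 0.0
```

A difference of exactly 0.0 over every Expert-0 tensor is what "bit-for-bit identical" means in practice: no rounding or dtype conversion happened on the way.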
The Expert-1-only format is marked by `expert_checkpoint.json`; the ViBES loaders detect it automatically.
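One simple way such detection could work is a check for the marker file in the checkpoint directory. The sketch below assumes exactly that; the function name `is_expert1_only` is hypothetical, and the real ViBES loaders may also read the marker's contents, which are not described here.

```python
from pathlib import Path

MARKER_NAME = "expert_checkpoint.json"  # file name taken from this model card


def is_expert1_only(checkpoint_dir: str) -> bool:
    """Return True if the directory carries the Expert-1-only marker file."""
    return (Path(checkpoint_dir) / MARKER_NAME).is_file()
```

A loader built this way would call `is_expert1_only("./ViBES-Face")` and, on `True`, rebuild Expert 0 from the GLM-4-Voice base before merging.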
Usage
# 1. Download the GLM-4-Voice base (provides the frozen Expert-0; ~18 GB)
huggingface-cli download THUDM/glm-4-voice-9b --local-dir ./model_files/glm-4-voice-9b
# 2. Download this checkpoint (the motion expert; ~0.86 GB)
huggingface-cli download JuzeZhang/ViBES-Face --local-dir ./ViBES-Face
# 3. Run inference — Expert-0 is rebuilt from the GLM base and merged automatically
python inference/inference_face.py \
--checkpoint ./ViBES-Face \
--glm_base_path ./model_files/glm-4-voice-9b \
--user_text "If you had a superpower for one day, what would you choose?"
`--glm_base_path` defaults to `THUDM/glm-4-voice-9b` (auto-downloaded via the HF cache), so step 1 is optional if you are online. See the ViBES repo for full setup.
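To make the default behaviour concrete, here is a sketch of how a command line with these three flags could be declared with `argparse`. Only the flag names and the `THUDM/glm-4-voice-9b` default come from the text above; the `build_parser` helper, the help strings, and which flags are required are assumptions, and the real `inference/inference_face.py` may define its arguments differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown in the usage example."""
    parser = argparse.ArgumentParser(description="ViBES face inference (sketch)")
    parser.add_argument("--checkpoint", required=True,
                        help="directory holding the Expert-1 checkpoint")
    parser.add_argument("--glm_base_path", default="THUDM/glm-4-voice-9b",
                        help="local path or Hub id of the GLM-4-Voice base")
    parser.add_argument("--user_text", required=True,
                        help="conversational input to respond to")
    return parser


# Omitting --glm_base_path falls back to the Hub id.
args = build_parser().parse_args(
    ["--checkpoint", "./ViBES-Face", "--user_text", "Hello there"]
)
```

With the flag omitted, `args.glm_base_path` holds the Hub id rather than a local directory, which matches the note that step 1 can be skipped when online.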
Files
| File | Description |
|---|---|
| `model.safetensors` | The motion expert (Expert 1) weights, bf16. |
| `expert_checkpoint.json` | Marker identifying this as an Expert-1-only checkpoint. |
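If you want to check what `model.safetensors` contains without loading any weights, the safetensors header can be read with the standard library alone. The sketch below relies on the published file layout (an 8-byte little-endian header length followed by a JSON header); the helper names `read_safetensors_header` and `summarize` are hypothetical, and the tensor names in this checkpoint are not listed here.

```python
import json
import struct
from typing import Dict, Tuple


def read_safetensors_header(path: str) -> dict:
    """Return the JSON header of a .safetensors file without reading tensor data."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # 8-byte little-endian length
        header = json.loads(f.read(header_len))
    header.pop("__metadata__", None)  # optional free-form metadata, not a tensor
    return header


def summarize(header: dict) -> Tuple[Dict[str, int], int]:
    """Count tensors per dtype and sum the payload size in bytes."""
    counts: Dict[str, int] = {}
    total_bytes = 0
    for info in header.values():
        counts[info["dtype"]] = counts.get(info["dtype"], 0) + 1
        start, end = info["data_offsets"]
        total_bytes += end - start
    return counts, total_bytes
```

Calling `summarize(read_safetensors_header("./ViBES-Face/model.safetensors"))` should report the tensors under the `BF16` dtype if the weights are stored as described in the table.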
Citation
If you use ViBES, please cite the paper (CVPR 2026). See the repository for the BibTeX entry.