ViBES-Face

Pretrained face checkpoint for ViBES, a speech–language–behavior (SLB) model that generates synchronized 3D facial and body animation from conversational input.

ViBES uses a Mixture-of-Modality-Experts (MoME) architecture with two experts:

  • Expert 0 — text/audio (frozen during training; exactly the GLM-4-Voice base)
  • Expert 1 — motion (the trained face expert)
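To make the two-expert split concrete, here is a toy sketch of modality-based routing. It assumes each token carries a modality id (0 for text/audio, 1 for motion) and is sent to exactly one expert; this card does not describe the actual ViBES routing rule, so the function names and the hard-routing scheme are illustrative assumptions, not the model's implementation.

```python
from typing import Callable, Sequence

Vector = list[float]
Expert = Callable[[Vector], Vector]


def route_by_modality(
    tokens: Sequence[Vector],
    modality_ids: Sequence[int],
    experts: Sequence[Expert],
) -> list[Vector]:
    """Send each token to the expert selected by its modality id (assumed hard routing)."""
    if len(tokens) != len(modality_ids):
        raise ValueError("tokens and modality_ids must have the same length")
    return [experts[m](t) for t, m in zip(tokens, modality_ids)]


# Toy stand-ins for the two experts; real experts would be neural network blocks.
def text_audio_expert(x: Vector) -> Vector:  # plays the role of Expert 0
    return [2.0 * v for v in x]


def motion_expert(x: Vector) -> Vector:  # plays the role of Expert 1
    return [v + 1.0 for v in x]


tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
modality_ids = [0, 1, 0]  # text/audio, motion, text/audio
outputs = route_by_modality(tokens, modality_ids, [text_audio_expert, motion_expert])
print(outputs)  # [[2.0, 4.0], [4.0, 5.0], [10.0, 12.0]]
```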

⚠️ This checkpoint stores only the motion expert (Expert 1)

Because Expert 0 is frozen and identical to the GLM-4-Voice base, shipping it in every checkpoint is redundant. This repo contains only the trained motion expert (~0.86 GB) instead of the full ~20 GB model. Expert 0 is reconstructed from the GLM-4-Voice base at load time and merged with this expert — the result is bit-for-bit identical to the original full checkpoint (verified: max abs diff 0.0 over all 284 Expert-0 tensors).
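The merge-and-verify idea can be sketched in a few lines. The example below is dependency-free and uses plain lists in place of tensors; the tensor names (`expert0.w`, `expert1.w`) and the helper functions are hypothetical, and the real ViBES loaders may organize this differently.

```python
from typing import Iterable, Mapping

Tensor = list[float]  # stand-in for a real tensor type, to keep the sketch dependency-free


def merge_experts(
    expert0: Mapping[str, Tensor], expert1: Mapping[str, Tensor]
) -> dict[str, Tensor]:
    """Combine two disjoint sets of tensors into one state dict."""
    overlap = set(expert0) & set(expert1)
    if overlap:
        raise ValueError(f"tensor names present in both experts: {sorted(overlap)}")
    return {**expert0, **expert1}


def max_abs_diff(
    reference: Mapping[str, Tensor],
    candidate: Mapping[str, Tensor],
    keys: Iterable[str],
) -> float:
    """Largest element-wise absolute difference over the given tensors."""
    worst = 0.0
    for key in keys:
        ref, cand = reference[key], candidate[key]
        if len(ref) != len(cand):
            raise ValueError(f"shape mismatch for {key}")
        worst = max([worst] + [abs(a - b) for a, b in zip(ref, cand)])
    return worst


# Hypothetical tensor names and values, for illustration only.
full_checkpoint = {"expert0.w": [0.5, -1.25], "expert1.w": [2.0, 3.0]}
glm_base = {"expert0.w": [0.5, -1.25]}      # frozen Expert 0, taken from the base model
motion_only = {"expert1.w": [2.0, 3.0]}     # what an Expert-1-only checkpoint stores

merged = merge_experts(glm_base, motion_only)
diff = max_abs_diff(full_checkpoint, merged, glm_base.keys())
print(diff)  # 0.0
```

A result of exactly 0.0 over the Expert 0 tensors is what "bit-for-bit identical" means in practice for this kind of check.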

The Expert-1-only format is marked by expert_checkpoint.json; the ViBES loaders detect it automatically.
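Detection by marker file can be as simple as checking that the file exists. The sketch below assumes only the file name given above; the helper name is hypothetical, the contents of the marker are not inspected, and the actual ViBES loaders may check more than this.

```python
import tempfile
from pathlib import Path

MARKER_NAME = "expert_checkpoint.json"  # file name taken from this model card


def is_expert_only_checkpoint(checkpoint_dir: str) -> bool:
    """Return True if the directory contains the Expert-1-only marker file."""
    return (Path(checkpoint_dir) / MARKER_NAME).is_file()


# Demonstration on a temporary directory, before and after the marker is created.
with tempfile.TemporaryDirectory() as tmp:
    before = is_expert_only_checkpoint(tmp)
    (Path(tmp) / MARKER_NAME).write_text("{}")  # placeholder contents, not the real schema
    after = is_expert_only_checkpoint(tmp)

print(before, after)  # False True
```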

Usage

# 1. Download the GLM-4-Voice base (provides the frozen Expert 0; ~18 GB)
huggingface-cli download THUDM/glm-4-voice-9b --local-dir ./model_files/glm-4-voice-9b

# 2. Download this checkpoint (the motion expert; ~0.86 GB)
huggingface-cli download JuzeZhang/ViBES-Face --local-dir ./ViBES-Face

# 3. Run inference; Expert 0 is rebuilt from the GLM base and merged automatically
python inference/inference_face.py \
    --checkpoint ./ViBES-Face \
    --glm_base_path ./model_files/glm-4-voice-9b \
    --user_text "If you had a superpower for one day, what would you choose?"

--glm_base_path defaults to THUDM/glm-4-voice-9b, which is downloaded automatically into the Hugging Face cache, so step 1 is optional if you are online. See the ViBES repo for full setup.
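If you prefer to script the two downloads from Python rather than the CLI, a minimal sketch using `snapshot_download` from the `huggingface_hub` package could look like this. The repo ids and target directories mirror steps 1 and 2; the `download_all` helper is a hypothetical name, and the import is deferred so the module loads even where `huggingface_hub` is not installed.

```python
# Repo id -> local directory, mirroring steps 1 and 2 above.
DOWNLOADS = {
    "THUDM/glm-4-voice-9b": "./model_files/glm-4-voice-9b",
    "JuzeZhang/ViBES-Face": "./ViBES-Face",
}


def download_all(downloads: dict[str, str] = DOWNLOADS) -> list[str]:
    """Download each repo into its local directory and return the resulting paths."""
    # Imported here so that merely defining this helper does not require the package.
    from huggingface_hub import snapshot_download

    paths = []
    for repo_id, local_dir in downloads.items():
        # snapshot_download fetches every file in the repo and returns the folder path.
        paths.append(snapshot_download(repo_id=repo_id, local_dir=local_dir))
    return paths
```

Calling `download_all()` fetches both repos (roughly 19 GB in total, going by the sizes quoted above); pass a smaller mapping to fetch only one of them.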

Files

File                     Description
model.safetensors        The motion expert (Expert 1) weights, bf16.
expert_checkpoint.json   Marker identifying this as an Expert-1-only checkpoint.
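To check what a `.safetensors` file contains without loading any weights, you can read its header directly: the format begins with an 8-byte little-endian length followed by a JSON header that lists each tensor's dtype and shape. The sketch below uses only the standard library and demonstrates on a tiny synthetic file; the helper names and the toy tensor name are illustrative.

```python
import json
import os
import struct
import tempfile
from math import prod


def read_safetensors_header(path: str) -> dict:
    """Read the JSON header of a safetensors file without touching the tensor data."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # 8-byte little-endian length
        header = json.loads(f.read(header_len).decode("utf-8"))
    header.pop("__metadata__", None)  # optional free-form metadata, not a tensor
    return header


def summarize(header: dict) -> tuple[int, list[str]]:
    """Return the total parameter count and the sorted set of dtypes."""
    n_params = sum(prod(entry["shape"]) for entry in header.values())
    dtypes = sorted({entry["dtype"] for entry in header.values()})
    return n_params, dtypes


# Build a tiny synthetic file: one 2x3 BF16 tensor (6 values x 2 bytes = 12 bytes).
toy_header = {"toy.weight": {"dtype": "BF16", "shape": [2, 3], "data_offsets": [0, 12]}}
blob = json.dumps(toy_header).encode("utf-8")
with tempfile.TemporaryDirectory() as tmp:
    toy_path = os.path.join(tmp, "toy.safetensors")
    with open(toy_path, "wb") as f:
        f.write(struct.pack("<Q", len(blob)) + blob + bytes(12))
    n_params, dtypes = summarize(read_safetensors_header(toy_path))

print(n_params, dtypes)  # 6 ['BF16']
```

Pointing `read_safetensors_header` at `./ViBES-Face/model.safetensors` after step 2 should list the motion expert's tensors in the same way.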

Citation

If you use ViBES, please cite the paper (CVPR 2026). See the repository for the BibTeX entry.
