MotionLab - Unified Human Motion Generation and Editing
Text-to-motion baseline integrated into the hftrainer Model Zoo. The runtime is
self-contained under hftrainer.models.motion.motionlab.network and does not
import the original repository at inference time.
| | |
|---|---|
| Task | Text-to-Motion (T2M), motion generation / editing research stack |
| Bundle / Pipeline | MotionLabBundle / MotionLabPipeline |
| Processed HF artifact | ZeyuLing/hftrainer-motionlab-humanml3d |
| Motion representation | HumanML3D-263 (263-dim, 20 fps, 22 joints) |
| Architecture | RFMotion / MotionFlow Transformer with CLIP text conditioning |
| Paper | MotionLab: Unified Human Motion Generation and Editing via the Motion-Condition-Motion Paradigm, Guo et al., ICCV 2025 - arXiv:2502.02358 |
| Original code | https://github.com/Diouo/MotionLab |
Weights
Self-contained hftrainer artifact:
| Artifact | Location | Contents | Status |
|---|---|---|---|
| MotionLab HumanML3D | ZeyuLing/hftrainer-motionlab-humanml3d | motionflow.ckpt + configs/ + Mean.npy / Std.npy + mean_motion.npy / std_motion.npy + model_index.json | public Hub artifact |
| local mirror | checkpoints/baselines/motionlab | same layout | optional local cache |
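Before pointing the pipeline at a local mirror, it can help to confirm that the directory has the layout listed in the table. The following is a minimal sketch using only the standard library. It assumes every listed entry sits directly under the artifact root, which the table does not state explicitly, and `missing_artifact_entries` is a hypothetical helper, not part of hftrainer.

```python
from pathlib import Path

# Entries named in the artifact table; "configs" is a directory.
# Assumption: all of them live directly under the artifact root.
EXPECTED_ENTRIES = (
    "motionflow.ckpt",
    "configs",
    "Mean.npy",
    "Std.npy",
    "mean_motion.npy",
    "std_motion.npy",
    "model_index.json",
)


def missing_artifact_entries(root):
    """Return the expected entries that are absent under `root` (hypothetical helper)."""
    root = Path(root)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_artifact_entries("checkpoints/baselines/motionlab")
    print("missing entries:", missing or "none")
```

An empty result only means the files are present; it says nothing about whether their contents are valid.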
Use directly from the Hub:
```python
from hftrainer.pipelines.motionlab import MotionLabPipeline

pipe = MotionLabPipeline.from_pretrained(
    "ZeyuLing/hftrainer-motionlab-humanml3d",
    device="cuda",
)
motions = pipe.infer_t2m(
    ["a person walks forward then sits down"],
    [120],
)  # list of (T, 263)
```
For a local mirror:
```python
pipe = MotionLabPipeline.from_pretrained("checkpoints/baselines/motionlab", device="cuda")
```
Motion Representation
MotionLab natively generates HumanML3D-263 at 20 fps. For shared SMPL and MotionStreamer-272 evaluation, use the validated bridge:
HumanML3D-263 -> SMPL motion_135 via IK refine-80 -> MotionStreamer-272
The artifact contains both the HumanML3D denormalization statistics and MotionLab's internal motion statistics so the published pipeline does not depend on a separate dataset checkout.
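Because HumanML3D-263 runs at 20 fps, a target duration converts to a frame count by simple multiplication. The sketch below assumes the second argument of `infer_t2m` is a list of per-prompt lengths in frames. The `(T, 263)` output shape in the usage example suggests this, but it is not stated here. Both helper names are hypothetical.

```python
HUMANML3D_FPS = 20  # frame rate stated for HumanML3D-263


def seconds_to_frames(seconds, fps=HUMANML3D_FPS):
    """Convert a duration in seconds to a whole number of frames (at least 1)."""
    if seconds <= 0:
        raise ValueError("duration must be positive")
    return max(1, round(seconds * fps))


def frames_to_seconds(frames, fps=HUMANML3D_FPS):
    """Convert a frame count back to a duration in seconds."""
    return frames / fps


# Requested durations for two prompts, in seconds.
lengths = [seconds_to_frames(s) for s in (6.0, 2.5)]
print(lengths)                 # [120, 50]
print(frames_to_seconds(120))  # 6.0
```

Under this assumption, the `[120]` in the usage example corresponds to a 6-second clip.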
HumanML3D Leaderboard Metrics
The rows below use the shared HumanML3D official-test caption protocol, with the HML263 round-trip ground-truth (GT) reference for SMPL-based evaluators.
| Evaluator | R1 ↑ | R2 ↑ | R3 ↑ | FID ↓ | MM ↓ | Div ↑ |
|---|---|---|---|---|---|---|
| MotionStreamer-272 | 0.6367 | 0.7882 | 0.8529 | 25.4469 | 17.9756 | 25.5355 |
| MotionCLIP-135 no-L2 | 0.4807 | 0.6457 | 0.7353 | 102.7770 | 41.5472 | 23.0179 |
Physical metrics:
| Slide ↓ | Float ↓ | Jitter ↓ | Dynamic ↓ |
|---|---|---|---|
| 2.4231 | 4.0795 | 5.8493 | 24.3519 |
Implementation Notes
- Artifact inference imports only `hftrainer.models.motion.motionlab.network`.
- Config targets are rewritten from the original `rfmotion.*` namespace into the vendored hftrainer namespace before model construction.
- The default inference stage is `demo`, matching the validated qualitative HumanML3D T2M setting.
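The namespace rewrite in the second note can be pictured as a recursive walk over a nested config. This is only an illustrative sketch, not the hftrainer implementation. It assumes configs are plain dicts and lists, that class paths live under a `target` key, and that the vendored path is obtained by swapping the `rfmotion.` prefix. The class paths in the sample config are made up.

```python
OLD_PREFIX = "rfmotion."
# Vendored namespace named in this README; the one-to-one prefix swap is an assumption.
NEW_PREFIX = "hftrainer.models.motion.motionlab.network."


def rewrite_targets(node, key="target"):
    """Return a copy of `node` with `key` strings moved from OLD_PREFIX to NEW_PREFIX."""
    if isinstance(node, dict):
        out = {}
        for k, v in node.items():
            if k == key and isinstance(v, str) and v.startswith(OLD_PREFIX):
                out[k] = NEW_PREFIX + v[len(OLD_PREFIX):]
            else:
                out[k] = rewrite_targets(v, key)
        return out
    if isinstance(node, list):
        return [rewrite_targets(item, key) for item in node]
    return node  # scalars are returned unchanged


# Hypothetical config; these class paths are illustrative only.
config = {
    "model": {
        "target": "rfmotion.models.Example",
        "params": {
            "layers": [
                {"target": "rfmotion.blocks.Block"},
                {"target": "torch.nn.Identity"},
            ]
        },
    }
}

rewritten = rewrite_targets(config)
print(rewritten["model"]["target"])
```

Targets outside the `rfmotion.` namespace pass through untouched, and the input config is not mutated.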